Yubin Qu1,2, Song Huang2,*, Long Li3, Peng Nie2, Yongming Yao2
CMC-Computers, Materials & Continua, Vol. 85, No. 1, pp. 249-300, 2025, DOI:10.32604/cmc.2025.067750
Published: 29 August 2025
Abstract: Large language models (LLMs) represent a significant advance in artificial intelligence. However, their growing capabilities come with a serious challenge: misalignment, the deviation of model behavior from designers' intentions and human values. This review synthesizes the current understanding of LLM misalignment and provides researchers and practitioners with a comprehensive overview. We define the concept of misalignment and elaborate on its various manifestations, including generating harmful content, factual errors (hallucinations), propagating biases, failing to follow instructions, deceptive behaviors, and emergent misalignment. We explore the multifaceted causes of misalignment,…