A Survey on Hallucination in Large Language Models: Definitions, Detection, and Mitigation
Abstract
Hallucination, the generation of plausible yet factually incorrect content, remains a critical barrier to the reliable deployment of Large Language Models (LLMs). This review synthesizes the state-of-the-art in understanding, detecting, and mitigating LLM hallucinations. We begin by establishing a clear taxonomy, tracing the concept's evolution from a broad notion of factual error to a more precise definition centered on unfaithfulness to a model's accessible knowledge. We then survey detection methodologies, categorizing them by model access requirements and examining techniques such as uncertainty estimation, consistency checking, and knowledge grounding. Finally, we provide a structured overview of mitigation strategies organized by their application across the model lifecycle: (1) data-centric approaches such as high-quality data curation, (2) model-centric alignment through preference optimization and knowledge editing, and (3) inference-time techniques such as Retrieval-Augmented Generation (RAG) and self-correction. We conclude that a layered, "defense-in-depth" strategy is essential for robust mitigation. Key open challenges include scalable data curation, the alignment-capability trade-off, and extending knowledge editing from individual facts to reasoning paths.
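To make the consistency-checking idea mentioned above concrete, the sketch below samples an answer to the same prompt several times and flags low agreement across samples as a possible hallucination. It is a minimal illustration in the spirit of sampling-based checkers (e.g., SelfCheckGPT), not code from any surveyed system; the generate callable, the string normalization, and the 0.6 agreement threshold are assumptions chosen for clarity.

    import re
    from collections import Counter

    def sample_answers(generate, prompt, n=5, temperature=0.7):
        # generate is a placeholder for any LLM call that returns a string
        # for a given prompt and sampling temperature (assumed interface).
        return [generate(prompt, temperature=temperature) for _ in range(n)]

    def _normalize(text):
        # Lowercase and strip punctuation so trivially different phrasings match.
        return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

    def consistency_score(answers):
        # Fraction of samples that agree with the most common normalized answer.
        # Low agreement is treated as a signal of possible hallucination:
        # the model is not consistently committing to a single claim.
        normalized = [_normalize(a) for a in answers]
        most_common_count = Counter(normalized).most_common(1)[0][1]
        return most_common_count / len(normalized)

    def flag_hallucination(generate, prompt, n=5, threshold=0.6):
        # Returns (is_suspect, score): flag the output if agreement falls below threshold.
        answers = sample_answers(generate, prompt, n=n)
        score = consistency_score(answers)
        return score < threshold, score

Exact-match agreement is only the simplest instantiation; detectors discussed in the survey typically replace it with entailment or semantic-similarity scoring between samples.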