From Lexicons to Large Language Models: A Comprehensive Survey of Sentiment Analysis Methods, Benchmarks, and Emerging Frontiers
Shuvodeep De1,*, Agnivo Gosai2,#, Karun Thankachan3,#, Ramadan A. ZeinEldin4, Abdulaziz T. Almaktoom5, Mustafa Bayram6, Ali Wagdy Mohamed7,8,*
1 Ingram School of Engineering, Texas State University, San Marcos, TX, USA
2 Corning Incorporated, Painted Post, NY, USA
3 Language Technologies Institute, School of Computer Science (SCS), Carnegie Mellon University, Pittsburgh, PA, USA
4 Deanship of Scientific Research, King Abdulaziz University, Jeddah, Saudi Arabia
5 Department of Operations and Supply Chain Management, Effat University, Jeddah, Saudi Arabia
6 Department of Computer Engineering, Biruni University, Istanbul, Turkey
7 Operations Research Department, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza, Egypt
8 School of Business, University of Science and Technology, Zewail City of Science and Technology, 6th of October City, Giza, Egypt
* Corresponding Authors: Shuvodeep De. Email: vvg26@txstate.edu; Ali Wagdy Mohamed. Email: aliwagdy@gmail.com
# These authors contributed equally to this work
Computer Modeling in Engineering & Sciences https://doi.org/10.32604/cmes.2026.080601
Received 12 February 2026; Accepted 15 April 2026; Published online 07 May 2026
Abstract
Sentiment analysis (SA) has evolved from a niche text-classification task into a central problem in natural language processing, spanning multiple domains, modalities, and languages. This survey provides a comprehensive review of sentiment analysis methods, from their origins in lexicon-based approaches through classical machine learning, deep learning architectures, and pre-trained transformers to the current era of large language models (LLMs). We formalize the SA problem across multiple granularity levels (document, sentence, and aspect) and present a taxonomy encompassing classification, regression, aspect-based sentiment analysis (ABSA), emotion detection, and stance detection across diverse domains, including movie reviews, product reviews, healthcare, finance, and social media. We review benchmark datasets spanning text-only corpora (IMDb, SST, the SemEval series), multimodal benchmarks (CMU-MOSI, CMU-MOSEI, MELD), and domain-specific evaluation suites such as SentiEval. The methodological evolution is traced from VADER and SentiWordNet, through SVM and Naïve Bayes classifiers, CNN and LSTM architectures, and BERT and its variants, to modern LLMs including GPT-4, Llama 3, and ModernBERT, with technical details of key architectures and their mathematical formulations. We provide dedicated analyses of chain-of-thought reasoning for implicit sentiment, multimodal fusion strategies, cross-lingual transfer methods, sarcasm and irony detection, explainability through SHAP and LIME, and the emerging challenge of AI-generated fake reviews. A comparative analysis across paradigms reveals that while LLMs achieve strong zero-shot performance, fine-tuned smaller models remain competitive on standard benchmarks, a finding with significant implications for deployment efficiency. We identify persistent open challenges, including domain drift, cultural bias, and the model variability problem, and outline future research directions encompassing reasoning-augmented SA, agentic workflows, federated learning, and real-time edge deployment. Drawing on more than 130 references spanning two decades of research, with 29 new additions from 2024 and 2025, this survey offers a unified roadmap for both newcomers and researchers at the frontier of sentiment analysis.
Keywords
Sentiment analysis; opinion mining; large language models; transformers; BERT; aspect-based sentiment analysis; multimodal sentiment analysis; cross-lingual NLP; explainable AI; chain-of-thought reasoning; sarcasm detection; benchmark datasets