Open Access

ARTICLE


Enhancing Arabic Sentiment Analysis with Pre-Trained CAMeLBERT: A Case Study on Noisy Texts

Fay Aljomah, Lama Aldhafeeri, Maha Alfadel, Sultanh Alshahrani, Qaisar Abbas*, Sarah Alhumoud*

College of Computer and Information Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, 11432, Saudi Arabia

* Corresponding Authors: Qaisar Abbas. Email: email; Sarah Alhumoud. Email: email

Computers, Materials & Continua 2025, 84(3), 5317-5335. https://doi.org/10.32604/cmc.2025.062478

Abstract

Dialectal Arabic text classification (DA-TC) enables sentiment analysis of recent Arabic social media content, but it poses many challenges owing to the rich morphology of the Arabic language and its wide range of dialectal variations. The availability of annotated datasets is limited, and preprocessing the noisy content is even more challenging, sometimes removing important sentiment cues from the input. To overcome these problems, this study investigates the applicability of transfer learning based on pre-trained transformer models to classify sentiment in Arabic texts with high accuracy. Specifically, it fine-tunes the CAMeLBERT model on the Multi-Domain Arabic Resources for Sentiment Analysis (MARSA) dataset, which contains more than 56,000 manually annotated tweets spanning the political, social, sports, and technology domains. The proposed method avoids extensive preprocessing and shows that raw data yields better results because it tends to retain more linguistic features. The fine-tuned CAMeLBERT model achieves state-of-the-art accuracy of 92%, precision of 91.7%, recall of 92.3%, and F1-score of 91.5%, outperforming standard machine learning models as well as ensemble-based and deep learning techniques. Performance comparisons against other pre-trained models, namely AraBERTv02-twitter and MARBERT, show that transformer-based architectures are consistently best suited for handling noisy Arabic texts. This work offers a strong remedy for the problems of Arabic sentiment analysis and provides recommendations for straightforward fine-tuning of pre-trained models to adapt to challenging linguistic features and domain-specific tasks.
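The fine-tuning setup summarized above can be illustrated with a short sketch. The snippet below is a minimal, hypothetical example using the Hugging Face transformers library and the publicly released CAMeL-Lab dialectal-Arabic CAMeLBERT checkpoint; the tweet strings, label scheme, and hyperparameters are placeholders for illustration only and do not reproduce the paper's exact configuration (the MARSA dataset itself is not included).

```python
# Minimal sketch: fine-tuning a CAMeLBERT checkpoint for sentiment
# classification on raw (minimally preprocessed) Arabic tweets.
# Assumptions: Hugging Face `transformers` is installed and the
# "CAMeL-Lab/bert-base-arabic-camelbert-da" checkpoint is used;
# the texts, labels, and hyperparameters below are placeholders.
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "CAMeL-Lab/bert-base-arabic-camelbert-da"  # assumed checkpoint

class TweetDataset(Dataset):
    """Wraps raw tweet strings and integer sentiment labels."""
    def __init__(self, texts, labels, tokenizer, max_len=128):
        # Tokenize the tweets as-is, without dialect-specific cleaning,
        # mirroring the finding that minimal preprocessing retains cues.
        self.enc = tokenizer(texts, truncation=True,
                             padding="max_length", max_length=max_len)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

# Placeholder examples standing in for the annotated MARSA tweets.
train_texts = ["خدمة ممتازة وسريعة", "تجربة سيئة جدا"]
train_labels = [1, 0]  # 1 = positive, 0 = negative (illustrative labels)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME,
                                                           num_labels=2)

args = TrainingArguments(output_dir="camelbert-sentiment",
                         num_train_epochs=3,
                         per_device_train_batch_size=16,
                         learning_rate=2e-5)

trainer = Trainer(model=model, args=args,
                  train_dataset=TweetDataset(train_texts, train_labels,
                                             tokenizer))
trainer.train()
```

After training, the same tokenizer and model can be applied to held-out tweets to obtain positive/negative predictions; in practice the hyperparameters and label set would follow the paper's experimental setup rather than the placeholder values shown here.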

Keywords

Artificial intelligence; deep learning; machine learning; BERT; CAMeLBERT; natural language processing; sentiment analysis; transformer

Cite This Article

APA Style
Aljomah, F., Aldhafeeri, L., Alfadel, M., Alshahrani, S., Abbas, Q. et al. (2025). Enhancing Arabic Sentiment Analysis with Pre-Trained CAMeLBERT: A Case Study on Noisy Texts. Computers, Materials & Continua, 84(3), 5317–5335. https://doi.org/10.32604/cmc.2025.062478
Vancouver Style
Aljomah F, Aldhafeeri L, Alfadel M, Alshahrani S, Abbas Q, Alhumoud S. Enhancing Arabic Sentiment Analysis with Pre-Trained CAMeLBERT: A Case Study on Noisy Texts. Comput Mater Contin. 2025;84(3):5317–5335. https://doi.org/10.32604/cmc.2025.062478
IEEE Style
F. Aljomah, L. Aldhafeeri, M. Alfadel, S. Alshahrani, Q. Abbas, and S. Alhumoud, “Enhancing Arabic Sentiment Analysis with Pre-Trained CAMeLBERT: A Case Study on Noisy Texts,” Comput. Mater. Contin., vol. 84, no. 3, pp. 5317–5335, 2025. https://doi.org/10.32604/cmc.2025.062478



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.