Open Access

ARTICLE

Enhancing Multi-Class Cyberbullying Classification with Hybrid Feature Extraction and Transformer-Based Models

Suliman Mohamed Fati1,*, Mohammed A. Mahdi2, Mohamed A.G. Hazber2, Shahanawaj Ahamad3, Sawsan A. Saad4, Mohammed Gamal Ragab5, Mohammed Al-Shalabi2

1 Information Systems Department, College of Computer and Information Sciences, Prince Sultan University, Riyadh, 11586, Saudi Arabia
2 Information and Computer Science Department, College of Computer Science and Engineering, University of Ha’il, Ha’il, 55476, Saudi Arabia
3 Software Engineering Department, College of Computer Science and Engineering, University of Ha’il, Ha’il, 55476, Saudi Arabia
4 Computer Engineering Department, College of Computer Science and Engineering, University of Ha’il, Ha’il, 55476, Saudi Arabia
5 Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar, 32610, Malaysia

* Corresponding Author: Suliman Mohamed Fati

(This article belongs to the Special Issue: Emerging Artificial Intelligence Technologies and Applications)

Computer Modeling in Engineering & Sciences 2025, 143(2), 2109-2131. https://doi.org/10.32604/cmes.2025.063092

Abstract

Cyberbullying on social media poses significant psychological risks, yet most detection systems oversimplify the task by treating it as binary classification, ignoring nuanced categories such as passive-aggressive remarks or indirect slurs. To address this gap, we propose a hybrid framework that combines Term Frequency-Inverse Document Frequency (TF-IDF), word-to-vector (Word2Vec), and Bidirectional Encoder Representations from Transformers (BERT)-based models for multi-class cyberbullying detection. Our approach integrates TF-IDF for lexical specificity and Word2Vec for semantic relationships, fused with BERT's contextual embeddings to capture syntactic and semantic complexities. We evaluate the framework on a publicly available dataset of 47,000 annotated social media posts across five cyberbullying categories: age, ethnicity, gender, religion, and indirect aggression. Among the BERT variants tested, BERT Base Uncased achieved the highest performance, with 93% accuracy (standard deviation ±1% across 5-fold cross-validation) and an average AUC of 0.96, outperforming standalone TF-IDF (78%) and Word2Vec (82%) models. Notably, it achieved near-perfect AUC scores (0.99) for age- and ethnicity-based bullying. A comparative analysis with state-of-the-art benchmarks, including Generative Pre-trained Transformer 2 (GPT-2) and Text-to-Text Transfer Transformer (T5) models, highlights BERT's superiority in handling ambiguous language. This work advances cyberbullying detection by demonstrating how hybrid feature extraction and transformer models improve multi-class classification, offering a scalable solution for moderating nuanced harmful content.
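
The idea of fusing lexical, semantic, and contextual features can be illustrated with a minimal sketch. The abstract does not say how the three views are combined, so plain concatenation is an assumption here, not the authors' confirmed method. The toy word-vector table stands in for a trained Word2Vec model, and the fixed `contextual_rows` values stand in for BERT sentence embeddings; all function names and data are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, rows), where rows[i] is the TF-IDF vector of docs[i]."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF, so a term present in every document keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        rows.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, rows


def mean_embedding(tokens, table, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [table[t] for t in tokens if t in table]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def fuse(lexical, semantic, contextual):
    """Concatenate the three feature views (assumed fusion strategy)."""
    return list(lexical) + list(semantic) + list(contextual)


# Hypothetical stand-in for a trained Word2Vec lookup table (2 dimensions).
TOY_WORD_VECTORS = {"old": [1.0, 0.0], "people": [0.0, 1.0], "hello": [0.5, 0.5]}

posts = ["old people", "hello people"]
vocab, lexical_rows = tfidf_vectors(posts)

# Hypothetical placeholders for BERT sentence embeddings (3 dimensions).
contextual_rows = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]

fused = [
    fuse(lex, mean_embedding(post.lower().split(), TOY_WORD_VECTORS, 2), ctx)
    for lex, post, ctx in zip(lexical_rows, posts, contextual_rows)
]
```

Each fused vector has one slot per vocabulary term, followed by the averaged word vector and the contextual vector. In a real pipeline the fused vectors would feed a classifier over the five categories, and the three blocks would likely need scaling so that no single view dominates.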

Keywords

Cyberbullying classification; multi-class classification; BERT models; machine learning; TF-IDF; Word2Vec; social media analysis; transformer models

Cite This Article

APA Style
Fati, S.M., Mahdi, M.A., Hazber, M.A., Ahamad, S., Saad, S.A. et al. (2025). Enhancing Multi-Class Cyberbullying Classification with Hybrid Feature Extraction and Transformer-Based Models. Computer Modeling in Engineering & Sciences, 143(2), 2109–2131. https://doi.org/10.32604/cmes.2025.063092
Vancouver Style
Fati SM, Mahdi MA, Hazber MA, Ahamad S, Saad SA, Ragab MG, et al. Enhancing Multi-Class Cyberbullying Classification with Hybrid Feature Extraction and Transformer-Based Models. Comput Model Eng Sci. 2025;143(2):2109–2131. https://doi.org/10.32604/cmes.2025.063092
IEEE Style
S. M. Fati et al., “Enhancing Multi-Class Cyberbullying Classification with Hybrid Feature Extraction and Transformer-Based Models,” Comput. Model. Eng. Sci., vol. 143, no. 2, pp. 2109–2131, 2025. https://doi.org/10.32604/cmes.2025.063092



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.