Open Access
ARTICLE
Chinese DeepSeek: Performance of Various Oversampling Techniques on Public Perceptions Using Natural Language Processing
1 Artificial Intelligence & Data Analytics Lab, CCIS, Prince Sultan University, Riyadh, 11586, Saudi Arabia
2 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, 11671, Saudi Arabia
* Corresponding Author: Amal Al-Rasheed. Email:
(This article belongs to the Special Issue: Advancements and Challenges in Artificial Intelligence, Data Analysis and Big Data)
Computers, Materials & Continua 2025, 84(2), 2717-2731. https://doi.org/10.32604/cmc.2025.065566
Received 16 March 2025; Accepted 28 April 2025; Issue published 03 July 2025
Abstract
DeepSeek, a Chinese open-source artificial intelligence (AI) model, has attracted considerable attention for its economical training and efficient inference. Trained with large-scale reinforcement learning, without supervised fine-tuning as a preliminary step, DeepSeek demonstrates remarkable reasoning capabilities across a wide range of tasks. It is a prominent AI-driven chatbot that assists individuals in learning and improves responses by generating insightful solutions to inquiries. Users hold divergent views about advanced models such as DeepSeek, posting about both their merits and shortcomings on several social media platforms. This research presents a new framework for predicting public sentiment to evaluate perceptions of DeepSeek. To convert the unstructured data into a suitable form, we first collect DeepSeek-related tweets from Twitter and apply various preprocessing steps. We then annotate the tweets using the Valence Aware Dictionary and sEntiment Reasoner (VADER) methodology and the lexicon-driven TextBlob. Next, we classify the sentiments obtained from the cleaned data with the proposed hybrid model, which combines long short-term memory (LSTM) and bidirectional gated recurrent unit (BiGRU) layers and is strengthened with multi-head attention, regularization, activation, and dropout units to enhance performance. Topic modeling based on KMeans clustering and Latent Dirichlet Allocation (LDA) was used to analyze public behavior concerning DeepSeek. The results show that 82.5% of the tweets are positive, 15.2% negative, and 2.3% neutral according to TextBlob, and 82.8% positive, 16.1% negative, and 1.2% neutral according to VADER. The slight difference indicates that the two analyses agree on the overall perception while handling language peculiarities somewhat differently. The results indicate that the proposed model surpasses previous state-of-the-art approaches.
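As a rough illustration of the annotation and classification steps described above, the following Python sketch labels tweet text with TextBlob and VADER and assembles a hybrid LSTM-BiGRU classifier with multi-head attention in Keras. The thresholds, layer sizes, and hyperparameters here are illustrative assumptions, not the configuration reported in the paper.

```python
# Illustrative sketch only: lexicon-based labeling (TextBlob, VADER) and a
# hybrid LSTM + BiGRU classifier with multi-head attention. All thresholds,
# sizes, and hyperparameters are assumptions for demonstration purposes.
from textblob import TextBlob
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from tensorflow.keras import layers, models, regularizers

def label_textblob(text: str) -> str:
    polarity = TextBlob(text).sentiment.polarity          # range [-1, 1]
    return "positive" if polarity > 0 else "negative" if polarity < 0 else "neutral"

_vader = SentimentIntensityAnalyzer()

def label_vader(text: str) -> str:
    compound = _vader.polarity_scores(text)["compound"]   # range [-1, 1]
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

# Hybrid LSTM + BiGRU classifier with multi-head attention, dropout, and an
# L2-regularized dense layer; outputs three sentiment classes.
VOCAB_SIZE, MAX_LEN, EMBED_DIM = 20_000, 60, 128           # assumed values

inputs = layers.Input(shape=(MAX_LEN,))
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)
x = layers.LSTM(64, return_sequences=True)(x)
x = layers.Bidirectional(layers.GRU(64, return_sequences=True))(x)
x = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dropout(0.3)(x)
x = layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4))(x)
outputs = layers.Dense(3, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

print(label_textblob("DeepSeek gives surprisingly good answers!"),
      label_vader("DeepSeek gives surprisingly good answers!"))
```

The lexicon labels produced this way (positive/negative/neutral) can serve as training targets for the neural classifier; the ±0.05 VADER cutoff is the commonly used default, not a value stated in the abstract.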
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.