Open Access

ARTICLE

Multi-Class Sentiment Analysis of Social Media Data with Machine Learning Algorithms

Galimkair Mutanov, Vladislav Karyukin*, Zhanl Mamykova

Al-Farabi Kazakh National University, Almaty, 050040, Kazakhstan

* Corresponding Author: Vladislav Karyukin.

Computers, Materials & Continua 2021, 69(1), 913-930. https://doi.org/10.32604/cmc.2021.017827

Abstract

The volume of social media data on the Internet is constantly growing, which has created a substantial research field for data analysts. News websites and social networks host a remarkably diverse range of articles, posts, and comments. Nevertheless, most researchers focus on Twitter posts, which have a specific format and length restriction, and the majority of these are written in English. As relatively few works have addressed sentiment analysis in the Russian and Kazakh languages, this article thoroughly analyzes news posts in the Kazakhstan media space. The collected datasets contain texts labeled with three sentiment classes: positive, negative, and neutral. The datasets are highly imbalanced, with a significant predominance of the positive class. To address this imbalance, three resampling techniques are applied: undersampling, oversampling, and the synthetic minority oversampling technique (SMOTE). Subsequently, the texts are vectorized with TF-IDF weighting and classified with seven machine learning (ML) algorithms: naïve Bayes, support vector machine, logistic regression, k-nearest neighbors, decision tree, random forest, and XGBoost. Experimental results reveal that oversampling and SMOTE combined with logistic regression, decision tree, and random forest achieve the best classification scores. These models are employed effectively in the developed social analytics platform.
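The resampling and vectorization steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy English sentences, and the particular TF-IDF variant (term frequency times log(N / df), without smoothing) are assumptions chosen for illustration, and only the Python standard library is used. SMOTE is omitted because it interpolates between neighboring minority-class vectors rather than copying samples.

```python
import math
import random
from collections import Counter


def _group_by_class(texts, labels):
    """Map each label to the list of texts carrying it."""
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    return by_class


def random_oversample(texts, labels, seed=0):
    """Duplicate random samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(texts, labels)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_texts.extend(items + extra)
        out_labels.extend([label] * target)
    return out_texts, out_labels


def random_undersample(texts, labels, seed=0):
    """Drop random samples until every class matches the smallest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(texts, labels)
    target = min(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        out_texts.extend(rng.sample(items, target))
        out_labels.extend([label] * target)
    return out_texts, out_labels


def tfidf(texts):
    """Return one {term: weight} dict per text, using tf * log(N / df)."""
    docs = [text.lower().split() for text in texts]
    # Document frequency: number of texts in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


# Hypothetical toy data with a dominant positive class.
texts = [
    "great news today",
    "good result for the city",
    "nice event this week",
    "great support for students",
    "bad accident on the road",
    "meeting held on monday",
]
labels = ["positive"] * 4 + ["negative", "neutral"]

over_texts, over_labels = random_oversample(texts, labels)
under_texts, under_labels = random_undersample(texts, labels)
vectors = tfidf(over_texts)
print(Counter(over_labels), Counter(under_labels), len(vectors))
```

In practice, libraries such as scikit-learn (`TfidfVectorizer`) and imbalanced-learn (`RandomOverSampler`, `RandomUnderSampler`, `SMOTE`) provide tested versions of these steps. Resampling should be applied only to the training split so that duplicated samples do not leak into evaluation data.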


This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.