Open Access
ARTICLE
Sentiment Analysis System in Big Data Environment
Wint Nyein Chan1, Thandar Thein2
1 University of Computer Studies, Yangon
2 University of Computer Studies, Maubin
E-mail: wintnyeinchan2012@gmail.com; thandartheinn@gmail.com
Computer Systems Science and Engineering 2018, 33(3), 187-202. https://doi.org/10.32604/csse.2018.33.187
Abstract
Big Data, a large volume of both structured and unstructured data, is now generated from Social Media. Social Media are powerful marketing tools,
and social big data can offer business insights. The major challenge facing social big data is finding efficient techniques to collect a large volume of
social data and extract insights from the huge amount of collected data. Sentiment Analysis of social big data can provide business insights by extracting
public opinions. Traditional analytic platforms need to be scaled up to analyze a large volume of social big data. Social data are by nature
short and generally not written with proper grammar, and hence it is difficult to achieve highly reliable results in Sentiment Analysis. Acquiring
effective training data is a challenge, although learning-based approaches are good for sentiment classification. Manual labeling of training data is
time- and labor-consuming. In this paper, a Sentiment Analysis system on a Big Data Analytics platform is proposed to provide valuable information by analyzing
large-scale social data in an efficient and timely manner, since it is implemented using the MapReduce framework and the Hadoop Distributed File System
(HDFS). The proposed Sentiment Analysis system consists of four modules: data collection, data cleaning and preprocessing, class labeling, and sentiment
classification. The system achieves high sentiment classification performance by combining the effortless setup of a lexicon-based classifier
with a learning-based classifier. Twitter stream data is used for system evaluation, as Twitter is a widespread Social Media platform and a good source of
information in the sense of snapshots of moods and feelings as well as up-to-date events. The evaluation results show that the system achieves a promising
accuracy of 84.2%. Moreover, the system can scale to analyze large-scale data: processing time decreases as more nodes are added
to the cluster.
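The hybrid idea in the abstract, where a lexicon-based classifier labels the data and a learning-based classifier is then trained on those labels, can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the toy lexicon, the sample tweets, the function names, and the choice of a multinomial Naive Bayes learner are all assumptions made for illustration, and the sketch runs in one process rather than on MapReduce and HDFS.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon; a real system would use a much larger resource.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "awful"}


def preprocess(text):
    """Lowercase, drop URLs and @mentions, and split into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def lexicon_label(tokens):
    """Return 'pos' or 'neg' from lexicon hit counts, or None on a tie."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score == 0:
        return None
    return "pos" if score > 0 else "neg"


def train_naive_bayes(labeled):
    """Collect per-class document and word counts from (tokens, label) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    for tokens, label in labeled:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return doc_counts, word_counts, vocab


def classify(tokens, model):
    """Pick the class with the highest log-posterior under add-one smoothing."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up sample data standing in for collected tweets.
tweets = [
    "I love this phone, the camera is great",
    "Terrible battery, I hate the charger",
    "Happy with the excellent screen",
    "Awful update, the camera is bad now",
    "The phone arrived today",  # no lexicon hits, so it is left unlabeled
]

# Class labeling: keep only tweets the lexicon can label.
labeled = []
for tweet in tweets:
    tokens = preprocess(tweet)
    label = lexicon_label(tokens)
    if label is not None:
        labeled.append((tokens, label))

# Sentiment classification: train on the automatically labeled tweets.
model = train_naive_bayes(labeled)
prediction = classify(preprocess("the battery is awful"), model)
```

In this sketch the lexicon replaces manual labeling, and the trained classifier can then use words that are not in the lexicon (such as "battery") because they co-occurred with lexicon words in the labeled tweets.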
Cite This Article
W. Nyein Chan and T. Thein, "Sentiment analysis system in big data environment,"
Computer Systems Science and Engineering, vol. 33, no. 3, pp. 187–202, 2018. https://doi.org/10.32604/csse.2018.33.187