Open Access



Social Media and Stock Market Prediction: A Big Data Approach

Mazhar Javed Awan1,2,*, Mohd Shafry Mohd Rahim2, Haitham Nobanee3,4,5, Ashna Munawar2, Awais Yasin6, Azlan Mohd Zain7

1 School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Johor, Malaysia
2 Department of Software Engineering, University of Management and Technology, Lahore, Pakistan
3 College of Business, Abu Dhabi University, Abu Dhabi, United Arab Emirates
4 Oxford Centre for Islamic Studies, The University of Oxford, Oxford, UK
5 The University of Liverpool Management School, The University of Liverpool, Liverpool, UK
6 Department of Computer Engineering, National University of Technology, Islamabad, Pakistan
7 School of Computing, UTM Big Data Centre, Universiti Teknologi Malaysia, Johor, Malaysia

* Corresponding Author: Mazhar Javed Awan.

(This article belongs to the Special Issue: Artificial Intelligence and Big Data in Entrepreneurship)

Computers, Materials & Continua 2021, 67(2), 2569-2583.


Big data is the collection of large datasets from traditional and digital sources to identify trends and patterns. The quantity and variety of computer data are growing exponentially for many reasons. For example, retailers are building vast databases of customer sales activity, organizations are working on logistics and financial services, and public social media users are sharing a vast quantity of sentiments related to sales prices and products. The challenges of big data include volume and variety in both structured and unstructured data. In this paper, we implemented several machine learning models through Spark MLlib using PySpark, which is scalable, fast, easily integrated with other tools, and performs better than traditional models. We studied the stocks of 10 top companies, using data that include historical stock prices, with MLlib models such as linear regression, generalized linear regression, random forest, and decision tree. We also implemented naive Bayes and logistic regression classification models. Experimental results suggest that linear regression, random forest, and generalized linear regression provide an accuracy of 80%–98%. The decision tree, by contrast, did not predict share price movements in the stock market well.


Cite This Article

APA Style
Awan, M.J., Rahim, M.S.M., Nobanee, H., Munawar, A., Yasin, A. et al. (2021). Social media and stock market prediction: A big data approach. Computers, Materials & Continua, 67(2), 2569-2583.
Vancouver Style
Awan MJ, Rahim MSM, Nobanee H, Munawar A, Yasin A, Zain AM. Social media and stock market prediction: A big data approach. Comput Mater Contin. 2021;67(2):2569-2583.
IEEE Style
M.J. Awan, M.S.M. Rahim, H. Nobanee, A. Munawar, A. Yasin, and A.M. Zain, "Social Media and Stock Market Prediction: A Big Data Approach," Comput. Mater. Contin., vol. 67, no. 2, pp. 2569-2583, 2021.


This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.