Open Access


Semantic Analysis of Urdu English Tweets Empowered by Machine Learning

Nadia Tabassum1, Tahir Alyas2, Muhammad Hamid3,*, Muhammad Saleem4, Saadia Malik5, Zain Ali2, Umer Farooq2

1 Department of Computer Science, Virtual University of Pakistan, Lahore, 54000, Pakistan
2 Department of Computer Science, Lahore Garrison University, Lahore, 54000, Pakistan
3 Department of Statistics and Computer Science, University of Veterinary and Animal Sciences, Lahore, 54000, Pakistan
4 Department of Industrial Engineering, Faculty of Engineering, Rabigh, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
5 Department of Information Systems, Faculty of Computing and Information Technology - Rabigh, King Abdulaziz University, Jeddah, 21589, Saudi Arabia

* Corresponding Author: Muhammad Hamid. Email:

Intelligent Automation & Soft Computing 2021, 30(1), 175-186.


Opinion mining and sentiment analysis have developed rapidly; the field aims to explore views expressed in text on social media sites through machine learning techniques for sentiment analysis, subjectivity analysis, and polarity calculation. Sentiment analysis is a natural language processing technique used to decide whether a piece of text is positive, negative, or neutral, and it is frequently performed on textual data to help organizations monitor brand and product sentiment in customer feedback and understand customer needs. In this paper, two strategies for sentiment analysis of Urdu and English tweets are proposed: word embedding and bag of words. Word embedding is a well-known set of techniques that capture word semantics based on the distributional hypothesis, which states that words used in the same contexts tend to have similar meanings. Bag of words is an approach used in natural language processing to extract information and features from written documents. For the bag of words, machine learning techniques such as naive Bayes, decision tree, k-nearest neighbor, and support vector machine are used to enhance accuracy. For word embedding, a neural network technique is proposed that combines a recurrent neural network (RNN) with long short-term memory (LSTM) for sentiment analysis of tweets. Datasets of Urdu and English tweets are used to classify tweets as negative or positive with machine learning techniques. The contribution of this paper is the implementation of a hybrid approach focused on a sentiment analyzer to overcome social network challenges, together with a comparative analysis of different machine learning algorithms. The results indicate an improvement when using the combination of RNN with LSTM, which achieved an accuracy of 87% on the Urdu dataset and 92% on the English dataset.
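The bag-of-words strategy summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy tweets, the whitespace tokenizer, and the function names `tokenize`, `train_naive_bayes`, and `predict` are all assumptions made for illustration, and only one of the listed classifiers (naive Bayes, here in its multinomial form with add-one smoothing) is shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization; real tweets (and Urdu text
    # in particular) would need proper normalization and tokenization.
    return text.lower().split()


def train_naive_bayes(texts, labels):
    """Build bag-of-words counts per class from labeled texts."""
    class_docs = Counter(labels)                      # documents per class
    word_counts = {label: Counter() for label in class_docs}
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))     # word order is discarded
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior under naive Bayes."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)   # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue                                   # skip unseen words
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the paper's datasets.
train_texts = [
    "i love this phone",
    "what a great day",
    "i hate this traffic",
    "this is a terrible service",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = train_naive_bayes(train_texts, train_labels)
```

Calling `predict(model, "love great day")` on this toy model selects the positive class, because each of those words was seen only in positive training tweets. The same count features could in principle be passed to the other classifiers named above, such as a decision tree, k-nearest neighbor, or a support vector machine.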


Cite This Article

N. Tabassum, T. Alyas, M. Hamid, M. Saleem, S. Malik et al., "Semantic analysis of Urdu English tweets empowered by machine learning," Intelligent Automation & Soft Computing, vol. 30, no. 1, pp. 175–186, 2021.


This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.