Open Access

ARTICLE


News Modeling and Retrieving Information: Data-Driven Approach

Elias Hossain1, Abdullah Alshahrani2, Wahidur Rahman3,*

1 Electrical & Computer Engineering, North South University, Dhaka, 1229, Bangladesh
2 Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, 21493, Saudi Arabia
3 Department of Computer Science and Engineering, Uttara University, Dhaka, 1230, Bangladesh

* Corresponding Author: Wahidur Rahman. Email: email

(This article belongs to the Special Issue: Data Analytics for Business Intelligence: Trends and Applications)

Intelligent Automation & Soft Computing 2023, 38(2), 109-123. https://doi.org/10.32604/iasc.2022.029511

Abstract

This paper develops Machine Learning algorithms that classify electronic news articles related to COVID-19 through information retrieval and topic modelling. The methodology is organized into three phases: the Text Classification Approach (TCA), the Proposed Algorithms Interpretation (PAI), and the Information Retrieval Approach (IRA). The TCA describes the text preprocessing pipeline, referred to as the clean corpus. Four feature extractors are examined: the pre-trained Global Vectors for Word Representation (GloVe) model, FastText, Term Frequency-Inverse Document Frequency (TF-IDF), and Bag-of-Words (BOW). The PAI presents the Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) models used to classify COVID-19 news. The IRA explains the mathematical interpretation of Latent Dirichlet Allocation (LDA), which is used for topic modelling in Information Retrieval (IR). An accuracy of 99% was obtained with K-fold cross-validation on Bi-LSTM with GloVe. A comparative analysis of Deep Learning and Machine Learning, based on feature extraction and computational complexity, has also been performed. Furthermore, several text analyses and the most influential aspects of each document have been explored. We also trained Bidirectional Encoder Representations from Transformers (BERT) as a Deep Learning model, but its results were not satisfactory. Nevertheless, the proposed system can be adapted to real-time classification of COVID-19 news.
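Two of the feature extractors named above, BOW and TF-IDF, can be illustrated with a minimal sketch. It assumes plain whitespace tokenization and the unsmoothed weighting idf = log(N / df); the study's actual preprocessing and weighting variant are not stated in this abstract, and the function names and toy headlines below are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real cleaning pipeline)."""
    return text.lower().split()

def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

def tf_idf(docs):
    """Weight relative term frequency by idf = log(N / df) (assumed, unsmoothed variant)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each vocabulary term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    weighted = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        weighted.append([
            (row[j] / total) * math.log(n_docs / df[j])
            for j in range(len(vocab))
        ])
    return vocab, weighted

# Hypothetical toy headlines, not data from the study.
docs = [
    "covid vaccine trial results",
    "covid lockdown extended",
    "football match results",
]
vocab, bow = bag_of_words(docs)
_, weights = tf_idf(docs)
```

In this toy corpus, "vaccine" appears in only one headline and therefore receives a larger TF-IDF weight in the first document than "covid", which appears in two. BOW, by contrast, gives both terms the same raw count of 1.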

Cite This Article

E. Hossain, A. Alshahrani and W. Rahman, "News modeling and retrieving information: data-driven approach," Intelligent Automation & Soft Computing, vol. 38, no. 2, pp. 109–123, 2023.



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.