Open Access

ARTICLE

Relational Turkish Text Classification Using Distant Supervised Entities and Relations

Halil Ibrahim Okur1,2,*, Kadir Tohma1, Ahmet Sertbas2
1 Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Iskenderun Technical University, Hatay, 31200, Turkey
2 Department of Computer Engineering, Faculty of Engineering, Istanbul University-Cerrahpasa, Istanbul, 34310, Turkey
* Corresponding Author: Halil Ibrahim Okur. Email: email

Computers, Materials & Continua https://doi.org/10.32604/cmc.2024.050585

Received 10 February 2024; Accepted 21 March 2024; Published online 24 April 2024

Abstract

Text classification, the automatic categorization of texts, is a foundational element of natural language processing (NLP) applications. This study investigates how text classification performance can be improved by integrating entity-relation information obtained from the Wikidata knowledge base with BERT-based pre-trained Named Entity Recognition (NER) models. Addressing a significant challenge in NLP, the research evaluates the potential of entity and relational information to extract deeper meaning from texts. The methodology comprises text preprocessing, entity detection, and the integration of relational information. Experiments on Turkish and English text datasets assess the performance of various classification algorithms, such as Support Vector Machine, Logistic Regression, Deep Neural Network, and Convolutional Neural Network. The results indicate that integrating entity-relation information can significantly enhance algorithm performance in text classification tasks and offers new perspectives for information extraction and semantic analysis in NLP applications. The contributions of this work are the use of distant supervised entity-relation information in Turkish text classification, the development of a Turkish relational text classification approach, and the creation of a relational database. By demonstrating potential performance improvements from integrating distant supervised entity-relation information into Turkish text classification, this research aims to support the effectiveness of text-based artificial intelligence (AI) tools. It also contributes to the development of multilingual text classification systems by adding deeper meaning to text content, providing a reference point for future NLP research.
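The pipeline outlined in the abstract (entity detection followed by the integration of relational information) can be illustrated with a minimal sketch. It is not the authors' implementation: the relation table `TOY_RELATIONS` is hypothetical toy data standing in for Wikidata lookups, `detect_entities` is a simple dictionary match standing in for a BERT-based NER model, and the `property=value` token format is an assumption made for illustration.

```python
from collections import Counter

# Hypothetical toy triples standing in for relations retrieved from Wikidata.
TOY_RELATIONS = {
    "Galatasaray": [("sport", "football"), ("country", "Turkey")],
    "Istanbul": [("country", "Turkey")],
}


def detect_entities(text):
    """Stand-in for a BERT-based NER model: plain lookup of known entity names."""
    tokens = text.replace(",", " ").replace(".", " ").split()
    return [tok for tok in tokens if tok in TOY_RELATIONS]


def augment_with_relations(text):
    """Append one 'property=value' token per relation of each detected entity."""
    extra = []
    for entity in detect_entities(text):
        for prop, value in TOY_RELATIONS[entity]:
            extra.append(f"{prop}={value}")
    return text + " " + " ".join(extra) if extra else text


def bag_of_words(text):
    """Very simple feature extraction: lowercase whitespace-token counts."""
    return Counter(text.lower().split())


sample = "Galatasaray won the match in Istanbul."
augmented = augment_with_relations(sample)
features = bag_of_words(augmented)
print(augmented)
```

The resulting feature counts could then be passed to any of the classifiers named in the abstract. Under this assumed scheme, the relation tokens give the classifier signals (here, a shared country) that the surface text alone does not contain.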

Keywords

Text classification; relation extraction; NER; distant supervision; deep learning; machine learning