
Open Access | ARTICLE

Enhancing Relational Triple Extraction in Specific Domains: Semantic Enhancement and Synergy of Large Language Models and Small Pre-Trained Language Models

Jiakai Li, Jianpeng Hu*, Geng Zhang
School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, China
* Corresponding Author: Jianpeng Hu

Computers, Materials & Continua https://doi.org/10.32604/cmc.2024.050005

Received 24 January 2024; Accepted 27 March 2024; Published online 26 April 2024

Abstract

In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of the semantic interaction between entities and relations, difficulty in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model's performance on domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module, which significantly enhances the semantic interaction between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model's adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, generating domain-specific datasets to alleviate the scarcity of domain data. Experiments on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several respects, with F1 scores exceeding those of the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
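The voting strategy described above can be pictured as follows: the fine-tuned SLM extracts triples with a confidence score, and samples it finds challenging (low confidence) are reevaluated with the help of an LLM. The sketch below is a minimal illustration of that idea; all function names, the `Triple` structure, and the confidence threshold are assumptions for exposition, not the authors' implementation.

```python
# Hypothetical sketch of an SLM/LLM voting strategy for triple extraction.
# Assumption: the SLM reports a confidence score per sample; low-confidence
# ("challenging") samples are cross-checked against LLM output.

from dataclasses import dataclass


@dataclass
class Triple:
    head: str
    relation: str
    tail: str


def vote(slm_triples, llm_triples, slm_conf, threshold=0.5):
    """Combine SLM and LLM extractions for one sample.

    If the SLM is confident, its triples are kept outright. Otherwise the
    sample is treated as challenging: keep only triples both models agree
    on, falling back to the LLM's reading when there is no overlap.
    """
    if slm_conf >= threshold:
        return slm_triples  # SLM trusted on easy samples
    agreed = [t for t in slm_triples if t in llm_triples]
    return agreed if agreed else llm_triples
```

In this sketch the LLM is consulted only for challenging samples, which keeps the (expensive) LLM calls proportional to how often the SLM hesitates rather than to the dataset size.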

Keywords

Relational triple extraction; semantic interaction; large language models; data augmentation; specific domains