Open Access

ARTICLE

Short Text Entity Disambiguation Algorithm Based on Multi-Word Vector Ensemble

Qin Zhang1, Xuyu Xiang1,*, Jiaohua Qin1, Yun Tan1, Qiang Liu1, Neal N. Xiong2

1 College of Computer Science and Information Technology, Central South University of Forestry and Technology, Changsha, 410004, China
2 Department of Mathematics and Computer Science, Northeastern State University, OK, 74464, USA

* Corresponding Author: Xuyu Xiang.

Intelligent Automation & Soft Computing 2021, 30(1), 227-241. https://doi.org/10.32604/iasc.2021.017648

Abstract

With the rapid development of network media, short text has become the main carrier of information dissemination because it spreads entity-related information quickly. However, the lack of context in short text easily leads to ambiguity, which greatly reduces the efficiency of obtaining information and seriously affects the user experience, especially in the financial field. This paper proposes an entity disambiguation algorithm based on multi-word vector ensemble and decision to eliminate entity ambiguity and purify text information during information processing. First, we integrate several unsupervised pre-trained word vector models as vector embeddings, chosen according to the characteristics of each model. Next, we use the classic architecture of long short-term memory (LSTM) combined with a convolutional neural network (CNN) to fine-tune pre-trained Chinese word vectors such as BERT, and we integrate the entity recognition outputs. We then build the knowledge base and introduce the focal loss function on top of a CNN-based binary classifier to improve entity disambiguation. Experimental results show that the algorithm outperforms traditional entity disambiguation algorithms based on a single word vector. The method accurately locates the entity to be disambiguated and achieves good disambiguation accuracy.
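The abstract names focal loss as the objective for the binary disambiguation classifier but gives no formula or hyperparameters. As a rough illustration only, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), in plain Python. The function names and the values `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration, not settings reported by the authors.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Plain binary cross-entropy for one example (p = predicted P(y=1))."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p   # probability assigned to the true class
    return -math.log(p_t)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss for one example.

    gamma and alpha are illustrative defaults (assumed, not from the paper).
    The factor (1 - p_t)**gamma shrinks the loss of well-classified examples,
    so hard or rare candidate entities dominate the training signal.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A confident correct prediction versus a confident wrong one.
    easy = binary_focal_loss(0.9, 1)
    hard = binary_focal_loss(0.1, 1)
    print(f"easy example loss: {easy:.6f}")
    print(f"hard example loss: {hard:.6f}")
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary cross-entropy, which makes the down-weighting effect of larger `gamma` easy to check in isolation.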

Cite This Article

Q. Zhang, X. Xiang, J. Qin, Y. Tan, Q. Liu et al., "Short text entity disambiguation algorithm based on multi-word vector ensemble," Intelligent Automation & Soft Computing, vol. 30, no. 1, pp. 227–241, 2021. https://doi.org/10.32604/iasc.2021.017648



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.