Paragraph Vector Representation Based on Word to Vector and CNN Learning

Zeyu Xiong; Qiangqiang Shen; Yijie Wang; Chenyang Zhu

doi:10.3970/cmc.2018.01762


ARTICLE

Zeyu Xiong¹,*, Qiangqiang Shen¹, Yijie Wang¹, Chenyang Zhu²

1 College of Computer, National University of Defense Technology, De Ya Road, Changsha 410073, China.
2 School of Computer Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Vancouver, Canada.

* Corresponding author: Zeyu Xiong.

Computers, Materials & Continua 2018, 55(2), 213-227. https://doi.org/10.3970/cmc.2018.01762

Abstract

Document processing in natural language includes tasks such as retrieval, sentiment analysis, and theme extraction. Classical methods for these tasks are based on probability models, semantic models, and machine learning networks. Probability models inherently lose semantic information, which reduces processing accuracy. Machine learning approaches may be supervised, unsupervised, or semi-supervised; semantic models and supervised learning require labeled corpora. A reliably labeled corpus has to be produced manually, which is costly and time-consuming because annotators must read and label every document. Recently, the continuous bag-of-words (CBOW) model has proved efficient for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. The model can easily be extended to learn paragraph vectors, but the resulting vectors are not precise. To address these problems, this paper develops a new model for learning paragraph vectors that combines the CBOW model with convolutional neural networks (CNNs) in a new deep learning model. Experimental results show that the paragraph vectors generated by the new model are better than those generated by the CBOW model in semantic relatedness and accuracy.
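
The abstract names the two building blocks but not how they are wired together. As a rough illustration only, the sketch below shows the generic form of each piece: a CBOW prediction step (average the context word vectors, then score every vocabulary word with a softmax) and a 1-D convolution with max-over-time pooling that turns a sequence of word vectors into one fixed-length vector. All names (`cbow_predict`, `conv_max_pool`, `W_in`, `W_out`), the toy sizes, and the random untrained weights are assumptions made for this example; this is not the authors' architecture.

```python
import math
import random


def cbow_predict(context_ids, W_in, W_out):
    """Return a probability distribution over the vocabulary for the center word."""
    dim = len(W_in[0])
    # Average the input embeddings of the context words.
    h = [sum(W_in[i][d] for i in context_ids) / len(context_ids) for d in range(dim)]
    # Score every vocabulary word against the averaged context vector.
    scores = [sum(h[d] * w[d] for d in range(dim)) for w in W_out]
    # Numerically stable softmax.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def conv_max_pool(word_vectors, filters, width):
    """1-D convolution over consecutive word vectors, then max-over-time pooling.

    Assumes len(word_vectors) >= width. Each filter is a flat list of
    width * dim weights and yields one feature of the output vector.
    """
    features = []
    for f in filters:
        best = -math.inf
        for start in range(len(word_vectors) - width + 1):
            # Concatenate the word vectors inside the current window.
            window = [x for vec in word_vectors[start:start + width] for x in vec]
            # Dot product with the filter followed by a ReLU.
            act = max(0.0, sum(a * b for a, b in zip(window, f)))
            best = max(best, act)
        features.append(best)
    return features


# Toy setup (hypothetical sizes, untrained random weights).
rng = random.Random(0)
vocab_size, dim = 6, 4
W_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(vocab_size)]
W_out = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(vocab_size)]

# CBOW step: predict the center word from four context word ids.
probs = cbow_predict([0, 1, 3, 4], W_in, W_out)

# CNN step: pool a five-word "paragraph" into a 3-dimensional vector.
paragraph = [W_in[i] for i in [0, 1, 2, 3, 4]]
filters = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)] for _ in range(3)]
paragraph_vector = conv_max_pool(paragraph, filters, width=2)
```

In this sketch the paragraph vector has one entry per filter, so its length does not depend on the paragraph length; how the paper actually trains and combines the two components is not stated in the abstract.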

Keywords

Distributed word vector, distributed paragraph vector, CNNs, CBOW, deep learning.

Cite This Article

APA Style

Xiong, Z., Shen, Q., Wang, Y., & Zhu, C. (2018). Paragraph vector representation based on word to vector and CNN learning. Computers, Materials & Continua, 55(2), 213-227. https://doi.org/10.3970/cmc.2018.01762

Vancouver Style

Xiong Z, Shen Q, Wang Y, Zhu C. Paragraph vector representation based on word to vector and CNN learning. Comput Mater Contin. 2018;55(2):213-227. https://doi.org/10.3970/cmc.2018.01762

IEEE Style

Z. Xiong, Q. Shen, Y. Wang, and C. Zhu, "Paragraph Vector Representation Based on Word to Vector and CNN Learning," Comput. Mater. Contin., vol. 55, no. 2, pp. 213-227, 2018. https://doi.org/10.3970/cmc.2018.01762

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
