Search Results (4)
  • Open Access

    ARTICLE

    An Optimized Chinese Filtering Model Using Value Scale Extended Text Vector

    Siyu Lu1, Ligao Cai1, Zhixin Liu2, Shan Liu1, Bo Yang1, Lirong Yin3, Mingzhe Liu4, Wenfeng Zheng1,*

    Computer Systems Science and Engineering, Vol.47, No.2, pp. 1881-1899, 2023, DOI:10.32604/csse.2023.034853

    Abstract: With the development of Internet technology, the explosive growth of Internet information has made it difficult to filter out useful information. Finding a high-accuracy text classification model has become a critical problem for text filtering, especially for Chinese texts. This paper uses manually calibrated comment data from the Douban movie website. First, a text filtering model based on the BP neural network was built; second, based on the Term Frequency-Inverse Document Frequency (TF-IDF) vector space model and the doc2vec method, the text word frequency vector and the text semantic vector were obtained…

  • Open Access

    ARTICLE

    Deep Learning Algorithm for Detection of Protein Remote Homology

    Fahriye Gemci1,*, Turgay Ibrikci2, Ulus Cevik3

    Computer Systems Science and Engineering, Vol.46, No.3, pp. 3703-3713, 2023, DOI:10.32604/csse.2023.032706

    Abstract: The study aims to use computer algorithms to detect remote homologous proteins, which is a significant problem in the bioinformatics field. In this experimental study, structural classification of proteins (SCOP) 1.53, the SCOP benchmark, and a newly created SCOP protein database from the structural classification of proteins—extended (SCOPe) 2.07 were used to detect remote homologous proteins. The n-gram method and then Term Frequency-Inverse Document Frequency (TF-IDF) weighting were applied to extract features from the protein sequences taken from these databases. Next, a smoothing process was applied to the obtained features to avoid misclassification. Finally, the proteins with…

  • Open Access

    ARTICLE

    Chinese News Text Classification Based on Convolutional Neural Network

    Hanxu Wang, Xin Li*

    Journal on Big Data, Vol.4, No.1, pp. 41-60, 2022, DOI:10.32604/jbd.2022.027717

    Abstract: With the explosive growth of Internet text information, the task of text classification has become more important. As a part of text classification, Chinese news text classification also plays an important role. In public security work, public opinion news classification is an important topic. Effective and accurate classification of public opinion news is a necessary prerequisite for relevant departments to grasp the state of public opinion and respond to its trends in time. This paper introduces a combined convolutional neural network text classification model based on word2vec and improved TF-IDF: first, the word vector is trained through the word2vec model, then…

  • Open Access

    ARTICLE

    News Text Topic Clustering Optimized Method Based on TF-IDF Algorithm on Spark

    Zhuo Zhou1, Jiaohua Qin1,*, Xuyu Xiang1, Yun Tan1, Qiang Liu1, Neal N. Xiong2

    CMC-Computers, Materials & Continua, Vol.62, No.1, pp. 217-231, 2020, DOI:10.32604/cmc.2020.06431

    Abstract: Because text topic clustering is slow on a stand-alone architecture in a big data setting, this paper takes news text as the research object and proposes an LDA text topic clustering algorithm based on the Spark big data platform. Since the TF-IDF (term frequency-inverse document frequency) algorithm under Spark maps words irreversibly, the mapped word indexes cannot be traced back to the original words. In this paper, an optimized method for TF-IDF under Spark is proposed to ensure that the text words can be restored. Firstly, the text feature is extracted by the TF-IDF algorithm combined with CountVectorizer…
