Search Results (15)
  • Open Access

    ARTICLE

    Benchmarking Performance of Document Level Classification and Topic Modeling

    Muhammad Shahid Bhatti1,*, Azmat Ullah1, Rohaya Latip2, Abid Sohail1, Anum Riaz1, Rohail Hassan3

    CMC-Computers, Materials & Continua, Vol.71, No.1, pp. 125-141, 2022, DOI:10.32604/cmc.2022.020083

    Abstract Text classification of low resource language is always a trivial and challenging problem. This paper discusses the process of Urdu news classification and Urdu documents similarity. Urdu is one of the most famous spoken languages in Asia. The implementation of computational methodologies for text classification has increased over time. However, Urdu language has not much experimented with research, it does not have readily available datasets, which turn out to be the primary reason behind limited research and applying the latest methodologies to the Urdu. To overcome these obstacles, a medium-sized dataset having six categories is collected from authentic Pakistani news…

  • Open Access

    ARTICLE

    Educational Videos Subtitles’ Summarization Using Latent Dirichlet Allocation and Length Enhancement

    Sarah S. Alrumiah*, Amal A. Al-Shargabi

    CMC-Computers, Materials & Continua, Vol.70, No.3, pp. 6205-6221, 2022, DOI:10.32604/cmc.2022.021780

    Abstract Nowadays, people use online resources such as educational videos and courses. However, such videos and courses are mostly long and thus, summarizing them will be valuable. The video contents (visual, audio, and subtitles) could be analyzed to generate textual summaries, i.e., notes. Videos’ subtitles contain significant information. Therefore, summarizing subtitles is effective to concentrate on the necessary details. Most of the existing studies used Term Frequency–Inverse Document Frequency (TF-IDF) and Latent Semantic Analysis (LSA) models to create lectures’ summaries. This study takes another approach and applies Latent Dirichlet Allocation (LDA), which proved its effectiveness in document summarization. Specifically, the proposed…

  • Open Access

    ARTICLE

    Mining Syndrome Differentiating Principles from Traditional Chinese Medicine Clinical Data

    Jialin Ma1,*, Zhaojun Wang2, Hai Guo3, Qian Xie1,4, Tao Wang4, Bolun Chen5

    Computer Systems Science and Engineering, Vol.40, No.3, pp. 979-993, 2022, DOI:10.32604/csse.2022.016759

    Abstract Syndrome differentiation-based treatment is one of the key characteristics of Traditional Chinese Medicine (TCM). The process of syndrome differentiation is difficult and challenging due to its complexity, diversity and vagueness. Analyzing syndrome principles from historical records of TCM using data mining (DM) technology has been of high interest in recent years. Nevertheless, in most relevant studies, existing DM algorithms have been simply developed for TCM mining, while the combination of TCM theories or its characteristics with DM algorithms has rarely been reported. This paper presents a novel Symptom-Syndrome Topic Model (SSTM), which is a supervised probabilistic topic model with three-tier…

  • Open Access

    ARTICLE

    What is Discussed about COVID-19: A Multi-Modal Framework for Analyzing Microblogs from Sina Weibo without Human Labeling

    Hengyang Lu1, *, Yutong Lou2, Bin Jin2, Ming Xu2

    CMC-Computers, Materials & Continua, Vol.64, No.3, pp. 1453-1471, 2020, DOI:10.32604/cmc.2020.011270

    Abstract Starting from late 2019, the new coronavirus disease (COVID-19) has become a global crisis. With the development of online social media, people prefer to express their opinions and discuss the latest news online. We have witnessed the positive influence of online social media, which helped citizens and governments track the development of this pandemic in time. It is necessary to apply artificial intelligence (AI) techniques to online social media and automatically discover and track public opinions posted online. In this paper, we take Sina Weibo, the most widely used online social media in China, for analysis and experiments. We collect…

  • Open Access

    ARTICLE

    A Phrase Topic Model Based on Distributed Representation

    Jialin Ma1, *, Jieyi Cheng1, Lin Zhang1, Lei Zhou1, Bolun Chen1, 2

    CMC-Computers, Materials & Continua, Vol.64, No.1, pp. 455-469, 2020, DOI:10.32604/cmc.2020.09780

    Abstract Traditional topic models have been widely used for analyzing semantic topics from electronic documents. However, the obvious defects of topic words acquired by them are poor in readability and consistency. Only the domain experts are possible to guess their meaning. In fact, phrases are the main unit for people to express semantics. This paper presents a Distributed Representation-Phrase Latent Dirichlet Allocation (DRPhrase LDA) which is a phrase topic model. Specifically, we reasonably enhance the semantic information of phrases via distributed representation in this model. The experimental results show the topics quality acquired by our model is more readable and consistent…
