
Search Results (132)
  • Open Access

    ARTICLE

    Research on Multimodal Brain Tumor Segmentation Algorithm Based on Feature Decoupling and Information Bottleneck Theory

    Xuemei Yang1, Yuting Zhou2, Shiqi Liu1, Junping Yin2,3,*

    CMC-Computers, Materials & Continua, Vol.82, No.2, pp. 3281-3307, 2025, DOI:10.32604/cmc.2024.057991 - 17 February 2025

    Abstract Aiming at the problems of information loss and the relationship between features and target tasks in multimodal medical image segmentation, a multimodal medical image segmentation algorithm based on feature decoupling and information bottleneck theory is proposed in this paper. Based on the reversible network, the bottom-up learning method for different modal information is constructed, which enhances the features’ expression ability and the network’s learning ability. The feature fusion module is designed to balance multi-directional information flow. To retain the information relevant to the target task to the maximum extent and suppress the information irrelevant to…

  • Open Access

    ARTICLE

    Text-Image Feature Fine-Grained Learning for Joint Multimodal Aspect-Based Sentiment Analysis

    Tianzhi Zhang1, Gang Zhou1,*, Shuang Zhang2, Shunhang Li1, Yepeng Sun1, Qiankun Pi1, Shuo Liu3

    CMC-Computers, Materials & Continua, Vol.82, No.1, pp. 279-305, 2025, DOI:10.32604/cmc.2024.055943 - 03 January 2025

    Abstract Joint Multimodal Aspect-based Sentiment Analysis (JMASA) is a significant task in the research of multimodal fine-grained sentiment analysis, which combines two subtasks: Multimodal Aspect Term Extraction (MATE) and Multimodal Aspect-oriented Sentiment Classification (MASC). Currently, most existing models for JMASA only perform text and image feature encoding from a basic level, but often neglect the in-depth analysis of unimodal intrinsic features, which may lead to the low accuracy of aspect term extraction and the poor ability of sentiment prediction due to the insufficient learning of intra-modal features. Given this problem, we propose a Text-Image Feature Fine-grained…

  • Open Access

    ARTICLE

    Adjusted Reasoning Module for Deep Visual Question Answering Using Vision Transformer

    Christine Dewi1,3, Hanna Prillysca Chernovita2, Stephen Abednego Philemon1, Christian Adi Ananta1, Abbott Po Shun Chen4,*

    CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 4195-4216, 2024, DOI:10.32604/cmc.2024.057453 - 19 December 2024

    Abstract Visual Question Answering (VQA) is an interdisciplinary artificial intelligence (AI) activity that integrates computer vision and natural language processing. Its purpose is to empower machines to respond to questions by utilizing visual information. A VQA system typically takes an image and a natural language query as input and produces a textual answer as output. One major obstacle in VQA is identifying a successful method to extract and merge textual and visual data. We examine “Fusion” Models that use information from both the text encoder and picture encoder to efficiently perform the visual question-answering challenge. For…

  • Open Access

    ARTICLE

    MDD: A Unified Multimodal Deep Learning Approach for Depression Diagnosis Based on Text and Audio Speech

    Farah Mohammad1,2,*, Khulood Mohammed Al Mansoor3

    CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 4125-4147, 2024, DOI:10.32604/cmc.2024.056666 - 19 December 2024

    Abstract Depression is a prevalent mental health issue affecting individuals of all age groups globally. Similar to other mental health disorders, diagnosing depression presents significant challenges for medical practitioners and clinical experts, primarily due to societal stigma and a lack of awareness and acceptance. Although medical interventions such as therapies, medications, and brain stimulation therapy provide hope for treatment, there is still a gap in the efficient detection of depression. Traditional methods, like in-person therapies, are both time-consuming and labor-intensive, emphasizing the necessity for technological assistance, especially through Artificial Intelligence. Alternative to this, in most cases…

  • Open Access

    ARTICLE

    Image Captioning Using Multimodal Deep Learning Approach

    Rihem Farkh1,*, Ghislain Oudinet1, Yasser Foued2

    CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 3951-3968, 2024, DOI:10.32604/cmc.2024.053245 - 19 December 2024

    Abstract The process of generating descriptive captions for images has witnessed significant advancements in recent years, owing to the progress in deep learning techniques. Despite these advancements, the task of thoroughly grasping image content and producing coherent, contextually relevant captions continues to pose a substantial challenge. In this paper, we introduce a novel multimodal method for image captioning by integrating three powerful deep learning architectures: YOLOv8 (You Only Look Once) for robust object detection, EfficientNetB7 for efficient feature extraction, and Transformers for effective sequence modeling. Our proposed model combines the strengths of YOLOv8 in detecting objects,…

  • Open Access

    ARTICLE

    Evaluation of a Multimodal Strategy for Providing Information on and Collecting Advance Directives in a Comprehensive Cancer Center (Centre de Lutte Contre le Cancer): Description of the Study Protocol

    Léonor Fasse1,2,*, François Blot2

    Psycho-Oncologie, Vol.18, No.4, pp. 367-375, 2024, DOI:10.32604/po.2024.049544 - 04 December 2024

    Abstract Background: Informing patients of the possibility of drafting Advance Directives (ADs) is a necessity and represents a major medical, ethical, and legal issue. The difficulties are numerous, whether organizational or cultural, and this also holds true in oncology, where ADs (and, more broadly, advance care discussions) are of crucial importance. As an eminently sensitive subject, broaching ADs (and thus a reflection on the end of life) requires acculturation work that is both societal and medical-caregiving. An institutional approach was therefore developed, with the objective of deploying information tools, training professionals,…

  • Open Access

    ARTICLE

    A Recurrent Neural Network for Multimodal Anomaly Detection by Using Spatio-Temporal Audio-Visual Data

    Sameema Tariq1, Ata-Ur-Rehman2,3, Maria Abubakar2, Waseem Iqbal4, Hatoon S. Alsagri5, Yousef A. Alduraywish5, Haya Abdullah A. Alhakbani5,*

    CMC-Computers, Materials & Continua, Vol.81, No.2, pp. 2493-2515, 2024, DOI:10.32604/cmc.2024.055787 - 18 November 2024

    Abstract In video surveillance, anomaly detection requires training machine learning models on spatio-temporal video sequences. However, sometimes the video-only data is not sufficient to accurately detect all the abnormal activities. Therefore, we propose a novel audio-visual spatio-temporal autoencoder specifically designed to detect anomalies for video surveillance by utilizing audio data along with video data. This paper presents a competitive approach to a multi-modal recurrent neural network for anomaly detection that combines separate spatial and temporal autoencoders to leverage both spatial and temporal features in audio-visual data. The proposed model is trained to produce low reconstruction error…

  • Open Access

    ARTICLE

    Efficient User Identity Linkage Based on Aligned Multimodal Features and Temporal Correlation

    Jiaqi Gao1, Kangfeng Zheng1,*, Xiujuan Wang2, Chunhua Wu1, Bin Wu2

    CMC-Computers, Materials & Continua, Vol.81, No.1, pp. 251-270, 2024, DOI:10.32604/cmc.2024.055560 - 15 October 2024

    Abstract User identity linkage (UIL) refers to identifying user accounts belonging to the same identity across different social media platforms. Most current research is based on text analysis, which fails to fully exploit the rich image resources generated by users; existing attempts that touch on the multimodal domain still face the challenge of semantic differences between text and images. Given this, we investigate the UIL task across different social media platforms based on multimodal user-generated contents (UGCs). We innovatively introduce the efficient user identity linkage via aligned multi-modal features and temporal correlation…

  • Open Access

    REVIEW

    Evolution and Prospects of Foundation Models: From Large Language Models to Large Multimodal Models

    Zheyi Chen1,#, Liuchang Xu1,#, Hongting Zheng1, Luyao Chen1, Amr Tolba2,3, Liang Zhao4, Keping Yu5,*, Hailin Feng1,*

    CMC-Computers, Materials & Continua, Vol.80, No.2, pp. 1753-1808, 2024, DOI:10.32604/cmc.2024.052618 - 15 August 2024

    Abstract Since the 1950s, when the Turing Test was introduced, there has been notable progress in machine language intelligence. Language modeling, crucial for AI development, has evolved from statistical to neural models over the last two decades. Recently, transformer-based Pre-trained Language Models (PLM) have excelled in Natural Language Processing (NLP) tasks by leveraging large-scale training corpora. Increasing the scale of these models enhances performance significantly, introducing abilities like context learning that smaller models lack. The advancement in Large Language Models, exemplified by the development of ChatGPT, has made significant impacts both academically and industrially, capturing widespread…

  • Open Access

    ARTICLE

    GAN-DIRNet: A Novel Deformable Image Registration Approach for Multimodal Histological Images

    Haiyue Li1, Jing Xie2, Jing Ke3, Ye Yuan1, Xiaoyong Pan1, Hongyi Xin4, Hongbin Shen1,*

    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 487-506, 2024, DOI:10.32604/cmc.2024.049640 - 18 July 2024

    Abstract Multi-modal histological image registration tasks pose significant challenges due to tissue staining operations causing partial loss and folding of tissue. Convolutional neural network (CNN) and generative adversarial network (GAN) are pivotal in medical image registration. However, existing methods often struggle with severe interference and deformation, as seen in histological images of conditions like Cushing’s disease. We argue that the failure of current approaches lies in underutilizing the feature extraction capability of the discriminator in GAN. In this study, we propose a novel multi-modal registration approach GAN-DIRNet based on GAN for deformable histological image registration. To…

Displaying results 51-60 of 132 (page 6).