
Search Results (132)
  • Open Access

    ARTICLE

    Research on Multimodal Brain Tumor Segmentation Algorithm Based on Feature Decoupling and Information Bottleneck Theory

    Xuemei Yang1, Yuting Zhou2, Shiqi Liu1, Junping Yin2,3,*

    CMC-Computers, Materials & Continua, Vol.82, No.2, pp. 3281-3307, 2025, DOI:10.32604/cmc.2024.057991 - 17 February 2025

    Abstract Aiming at the problems of information loss and the relationship between features and target tasks in multimodal medical image segmentation, a multimodal medical image segmentation algorithm based on feature decoupling and information bottleneck theory is proposed in this paper. Based on the reversible network, the bottom-up learning method for different modal information is constructed, which enhances the features’ expression ability and the network’s learning ability. The feature fusion module is designed to balance multi-directional information flow. To retain the information relevant to the target task to the maximum extent and suppress the information irrelevant to…

  • Open Access

    ARTICLE

    Text-Image Feature Fine-Grained Learning for Joint Multimodal Aspect-Based Sentiment Analysis

    Tianzhi Zhang1, Gang Zhou1,*, Shuang Zhang2, Shunhang Li1, Yepeng Sun1, Qiankun Pi1, Shuo Liu3

    CMC-Computers, Materials & Continua, Vol.82, No.1, pp. 279-305, 2025, DOI:10.32604/cmc.2024.055943 - 03 January 2025

    Abstract Joint Multimodal Aspect-based Sentiment Analysis (JMASA) is a significant task in the research of multimodal fine-grained sentiment analysis, which combines two subtasks: Multimodal Aspect Term Extraction (MATE) and Multimodal Aspect-oriented Sentiment Classification (MASC). Currently, most existing models for JMASA only perform text and image feature encoding from a basic level, but often neglect the in-depth analysis of unimodal intrinsic features, which may lead to the low accuracy of aspect term extraction and the poor ability of sentiment prediction due to the insufficient learning of intra-modal features. Given this problem, we propose a Text-Image Feature Fine-grained…

  • Open Access

    ARTICLE

    Adjusted Reasoning Module for Deep Visual Question Answering Using Vision Transformer

    Christine Dewi1,3, Hanna Prillysca Chernovita2, Stephen Abednego Philemon1, Christian Adi Ananta1, Abbott Po Shun Chen4,*

    CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 4195-4216, 2024, DOI:10.32604/cmc.2024.057453 - 19 December 2024

    Abstract Visual Question Answering (VQA) is an interdisciplinary artificial intelligence (AI) activity that integrates computer vision and natural language processing. Its purpose is to empower machines to respond to questions by utilizing visual information. A VQA system typically takes an image and a natural language query as input and produces a textual answer as output. One major obstacle in VQA is identifying a successful method to extract and merge textual and visual data. We examine “Fusion” Models that use information from both the text encoder and picture encoder to efficiently perform the visual question-answering challenge. For…

  • Open Access

    ARTICLE

    MDD: A Unified Multimodal Deep Learning Approach for Depression Diagnosis Based on Text and Audio Speech

    Farah Mohammad1,2,*, Khulood Mohammed Al Mansoor3

    CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 4125-4147, 2024, DOI:10.32604/cmc.2024.056666 - 19 December 2024

    Abstract Depression is a prevalent mental health issue affecting individuals of all age groups globally. Similar to other mental health disorders, diagnosing depression presents significant challenges for medical practitioners and clinical experts, primarily due to societal stigma and a lack of awareness and acceptance. Although medical interventions such as therapies, medications, and brain stimulation therapy provide hope for treatment, there is still a gap in the efficient detection of depression. Traditional methods, like in-person therapies, are both time-consuming and labor-intensive, emphasizing the necessity for technological assistance, especially through Artificial Intelligence. Alternative to this, in most cases…

  • Open Access

    ARTICLE

    Image Captioning Using Multimodal Deep Learning Approach

    Rihem Farkh1,*, Ghislain Oudinet1, Yasser Foued2

    CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 3951-3968, 2024, DOI:10.32604/cmc.2024.053245 - 19 December 2024

    Abstract The process of generating descriptive captions for images has witnessed significant advancements in recent years, owing to the progress in deep learning techniques. Despite these advancements, the task of thoroughly grasping image content and producing coherent, contextually relevant captions continues to pose a substantial challenge. In this paper, we introduce a novel multimodal method for image captioning by integrating three powerful deep learning architectures: YOLOv8 (You Only Look Once) for robust object detection, EfficientNetB7 for efficient feature extraction, and Transformers for effective sequence modeling. Our proposed model combines the strengths of YOLOv8 in detecting objects,…

  • Open Access

    ARTICLE

    Evaluation of a Multimodal Strategy for Providing Information on and Collecting Advance Directives in a Comprehensive Cancer Center (Centre de Lutte Contre le Cancer): Description of the Study Protocol

    Léonor Fasse1,2,*, François Blot2

    Psycho-Oncologie, Vol.18, No.4, pp. 367-375, 2024, DOI:10.32604/po.2024.049544 - 04 December 2024

    Abstract Background: Informing patients of the possibility of drafting Advance Directives (ADs) is a necessity and represents a major medical, ethical, and legal issue. The difficulties are numerous, whether organizational or cultural, and this also holds true in oncology, where ADs (and, more broadly, advance care discussions) are of crucial importance. As an eminently sensitive subject, broaching ADs (and thus a reflection on the end of life) requires acculturation work that is both societal and medical-caregiving. An institutional approach was therefore developed, with the objective of deploying information tools, training professionals,…

  • Open Access

    ARTICLE

    A Recurrent Neural Network for Multimodal Anomaly Detection by Using Spatio-Temporal Audio-Visual Data

    Sameema Tariq1, Ata-Ur-Rehman2,3, Maria Abubakar2, Waseem Iqbal4, Hatoon S. Alsagri5, Yousef A. Alduraywish5, Haya Abdullah A. Alhakbani5,*

    CMC-Computers, Materials & Continua, Vol.81, No.2, pp. 2493-2515, 2024, DOI:10.32604/cmc.2024.055787 - 18 November 2024

    Abstract In video surveillance, anomaly detection requires training machine learning models on spatio-temporal video sequences. However, sometimes the video-only data is not sufficient to accurately detect all the abnormal activities. Therefore, we propose a novel audio-visual spatio-temporal autoencoder specifically designed to detect anomalies for video surveillance by utilizing audio data along with video data. This paper presents a competitive approach to a multi-modal recurrent neural network for anomaly detection that combines separate spatial and temporal autoencoders to leverage both spatial and temporal features in audio-visual data. The proposed model is trained to produce low reconstruction error…

  • Open Access

    ARTICLE

    Efficient User Identity Linkage Based on Aligned Multimodal Features and Temporal Correlation

    Jiaqi Gao1, Kangfeng Zheng1,*, Xiujuan Wang2, Chunhua Wu1, Bin Wu2

    CMC-Computers, Materials & Continua, Vol.81, No.1, pp. 251-270, 2024, DOI:10.32604/cmc.2024.055560 - 15 October 2024

    Abstract User identity linkage (UIL) refers to identifying user accounts belonging to the same identity across different social media platforms. Most current research is based on text analysis, which fails to fully exploit the rich image resources generated by users; existing attempts that touch on the multimodal domain still face the challenge of semantic differences between text and images. Given this, we investigate the UIL task across different social media platforms based on multimodal user-generated contents (UGCs). We innovatively introduce the efficient user identity linkage via aligned multi-modal features and temporal correlation…

  • Open Access

    REVIEW

    Evolution and Prospects of Foundation Models: From Large Language Models to Large Multimodal Models

    Zheyi Chen1,#, Liuchang Xu1,#, Hongting Zheng1, Luyao Chen1, Amr Tolba2,3, Liang Zhao4, Keping Yu5,*, Hailin Feng1,*

    CMC-Computers, Materials & Continua, Vol.80, No.2, pp. 1753-1808, 2024, DOI:10.32604/cmc.2024.052618 - 15 August 2024

    Abstract Since the 1950s, when the Turing Test was introduced, there has been notable progress in machine language intelligence. Language modeling, crucial for AI development, has evolved from statistical to neural models over the last two decades. Recently, transformer-based Pre-trained Language Models (PLM) have excelled in Natural Language Processing (NLP) tasks by leveraging large-scale training corpora. Increasing the scale of these models enhances performance significantly, introducing abilities like context learning that smaller models lack. The advancement in Large Language Models, exemplified by the development of ChatGPT, has made significant impacts both academically and industrially, capturing widespread…

  • Open Access

    ARTICLE

    GAN-DIRNet: A Novel Deformable Image Registration Approach for Multimodal Histological Images

    Haiyue Li1, Jing Xie2, Jing Ke3, Ye Yuan1, Xiaoyong Pan1, Hongyi Xin4, Hongbin Shen1,*

    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 487-506, 2024, DOI:10.32604/cmc.2024.049640 - 18 July 2024

    Abstract Multi-modal histological image registration tasks pose significant challenges due to tissue staining operations causing partial loss and folding of tissue. Convolutional neural network (CNN) and generative adversarial network (GAN) are pivotal in medical image registration. However, existing methods often struggle with severe interference and deformation, as seen in histological images of conditions like Cushing’s disease. We argue that the failure of current approaches lies in underutilizing the feature extraction capability of the discriminator in GAN. In this study, we propose a novel multi-modal registration approach GAN-DIRNet based on GAN for deformable histological image registration. To…

Displaying results 51-60 of 132 (page 6).