Search Results (23)
  • Open Access


    A Video Captioning Method by Semantic Topic-Guided Generation

    Ou Ye, Xinli Wei, Zhenhua Yu*, Yan Fu, Ying Yang

    CMC-Computers, Materials & Continua, Vol.78, No.1, pp. 1071-1093, 2024, DOI:10.32604/cmc.2023.046418

    Abstract In video captioning methods based on an encoder-decoder, an encoder extracts limited visual features and a decoder generates a natural sentence describing the video content. However, such methods depend on a single video input source and few visual labels, and semantic alignment between video contents and the generated sentences remains a problem, making them unsuitable for accurately comprehending and describing video contents. To address this issue, this paper proposes a video captioning method by semantic topic-guided generation. First, a 3D convolutional neural network is utilized to extract the…

  • Open Access


    A Method of Integrating Length Constraints into Encoder-Decoder Transformer for Abstractive Text Summarization

    Ngoc-Khuong Nguyen1,2, Dac-Nhuong Le1, Viet-Ha Nguyen2, Anh-Cuong Le3,*

    Intelligent Automation & Soft Computing, Vol.38, No.1, pp. 1-18, 2023, DOI:10.32604/iasc.2023.037083

    Abstract Text summarization aims to generate a concise version of an original text. The longer the summary, the more detail it retains from the original, and the appropriate length depends on the intended use. Generating summaries of a desired length is therefore a vital task for putting the research into practice. To solve this problem, this paper proposes a new method that integrates the desired length of the summarized text into the encoder-decoder model for abstractive text summarization. This length parameter is integrated into the encoding phase at each self-attention step and…
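As a rough illustration of injecting a desired length into the encoder, the sketch below encodes a target summary length as a sinusoidal vector, in the spirit of transformer position encodings, that could be added to every token representation before each self-attention step. This is a hypothetical scheme, not the paper's exact formulation; all names are our own.

```python
import math

def length_features(num_tokens, desired_length, dim=4):
    """Toy sketch: one length feature vector per source token.

    Hypothetical scheme (not the paper's exact method): encode the desired
    summary length sinusoidally, as in standard transformer position
    encodings, so the same length signal can be added to each token
    representation before every self-attention step.
    """
    feats = []
    for _ in range(num_tokens):
        vec = []
        for i in range(dim // 2):
            angle = desired_length / (10000 ** (2 * i / dim))
            vec.extend([math.sin(angle), math.cos(angle)])
        feats.append(vec)
    return feats

# Every source token receives the same desired-length signal.
feats = length_features(num_tokens=3, desired_length=50)
```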

  • Open Access


    Optimizing Fully Convolutional Encoder-Decoder Network for Segmentation of Diabetic Eye Disease

    Abdul Qadir Khan1, Guangmin Sun1,*, Yu Li1, Anas Bilal2, Malik Abdul Manan1

    CMC-Computers, Materials & Continua, Vol.77, No.2, pp. 2481-2504, 2023, DOI:10.32604/cmc.2023.043239

    Abstract In the emerging field of image segmentation, Fully Convolutional Networks (FCNs) have recently become prominent. However, their effectiveness is intimately linked with the correct selection and fine-tuning of hyperparameters, which can often be a cumbersome manual task. The main aim of this study is to propose a more efficient, less labour-intensive approach to hyperparameter optimization in FCNs for segmenting fundus images. To this end, our research introduces a hyperparameter-optimized Fully Convolutional Encoder-Decoder Network (FCEDN). The optimization is handled by a novel Genetic Grey Wolf Optimization (G-GWO) algorithm. This algorithm employs the Genetic Algorithm (GA) to generate a diverse set of…
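The Grey Wolf Optimization core of G-GWO can be sketched in a few lines. The toy below is plain GWO minimizing a test function; it omits the paper's genetic-algorithm seeding and all FCEDN specifics, and every function and parameter name here is our own illustration.

```python
import random

def gwo_minimize(f, dim, n_wolves=8, iters=30, lo=-5.0, hi=5.0, seed=0):
    """Toy Grey Wolf Optimizer sketch (plain GWO, without the paper's
    GA seeding step) for tuning real-valued hyperparameters by
    minimizing an objective f over a box [lo, hi]^dim."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)]
              for _ in range(n_wolves)]
    for t in range(iters):
        wolves.sort(key=f)                       # alpha, beta, delta lead
        alpha, beta, delta = wolves[0], wolves[1], wolves[2]
        a = 2 - 2 * t / iters                    # exploration decays to 0
        for i in range(3, n_wolves):             # move the pack toward leaders
            new = []
            for d in range(dim):
                x = 0.0
                for leader in (alpha, beta, delta):
                    A = a * (2 * rng.random() - 1)
                    C = 2 * rng.random()
                    x += leader[d] - A * abs(C * leader[d] - wolves[i][d])
                new.append(min(hi, max(lo, x / 3)))
            wolves[i] = new
    return min(wolves, key=f)

# Example: minimize the sphere function over two "hyperparameters".
best = gwo_minimize(lambda x: sum(v * v for v in x), dim=2)
```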

  • Open Access


    Traffic Scene Captioning with Multi-Stage Feature Enhancement

    Dehai Zhang*, Yu Ma, Qing Liu, Haoxing Wang, Anquan Ren, Jiashu Liang

    CMC-Computers, Materials & Continua, Vol.76, No.3, pp. 2901-2920, 2023, DOI:10.32604/cmc.2023.038264

    Abstract Traffic scene captioning technology automatically generates one or more sentences to describe the content of traffic scenes by analyzing the content of the input traffic scene images, ensuring road safety while providing an important decision-making function for sustainable transportation. To provide a comprehensive and reasonable description of complex traffic scenes, a traffic scene semantic captioning model with multi-stage feature enhancement is proposed in this paper. In general, the model follows an encoder-decoder structure. First, multi-level granularity visual features are used for feature enhancement during the encoding process, which enables the model to learn more detailed content in the…

  • Open Access


    A Sentence Retrieval Generation Network Guided Video Captioning

    Ou Ye1,2, Mimi Wang1, Zhenhua Yu1,*, Yan Fu1, Shun Yi1, Jun Deng2

    CMC-Computers, Materials & Continua, Vol.75, No.3, pp. 5675-5696, 2023, DOI:10.32604/cmc.2023.037503

    Abstract Currently, video captioning models based on an encoder-decoder mainly rely on a single video input source. The content of video captions is limited, since few studies have employed external corpus information to guide caption generation, which is not conducive to accurate description and understanding of video content. To address this issue, a novel video captioning method guided by a sentence retrieval generation network (ED-SRG) is proposed in this paper. First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct…

  • Open Access


    A Novel Detection Method for Pavement Crack with Encoder-Decoder Architecture

    Yalong Yang1,2,3, Wenjing Xu1,2,3, Yinfeng Zhu4, Liangliang Su1,2,3,*, Gongquan Zhang1,2,3

    CMES-Computer Modeling in Engineering & Sciences, Vol.137, No.1, pp. 761-773, 2023, DOI:10.32604/cmes.2023.027010

    Abstract Intelligent crack detection is of great significance to road safety, so deep learning has gradually attracted attention in the field of crack image detection. The nonlinear structure, low contrast, and discontinuity of cracks pose great challenges to existing deep learning-based crack detection methods. Therefore, an end-to-end deep convolutional neural network (AttentionCrack) is proposed for automatic crack detection, overcoming inaccurate boundary localization between crack and non-crack pixels. The AttentionCrack network is built on a U-Net-based encoder-decoder architecture, and an attention mechanism is incorporated into the multi-scale convolutional features to enhance…
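One common way to realize such an attention mechanism on a U-Net skip connection is an additive attention gate. The scalar toy below illustrates that general construction under our own assumptions; AttentionCrack's actual design may differ.

```python
import math

def attention_gate(skip_features, gating_features):
    """Scalar toy of an additive attention gate on a U-Net skip
    connection (a common construction, not necessarily AttentionCrack's):
    each skip feature is rescaled by a sigmoid gate in (0, 1) computed
    from the decoder's gating signal, suppressing irrelevant responses."""
    gated = []
    for s, g in zip(skip_features, gating_features):
        alpha = 1.0 / (1.0 + math.exp(-(s + g)))  # sigmoid gate in (0, 1)
        gated.append(s * alpha)
    return gated

# Strongly gated features pass through almost unchanged; weakly gated
# features are attenuated.
skip = [2.0, -1.0, 0.5]
gate = [3.0, -2.0, 0.0]
out = attention_gate(skip, gate)
```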

  • Open Access


    Semantic Segmentation by Using Down-Sampling and Subpixel Convolution: DSSC-UNet

    Young-Man Kwon, Sunghoon Bae, Dong-Keun Chung, Myung-Jae Lim*

    CMC-Computers, Materials & Continua, Vol.75, No.1, pp. 683-696, 2023, DOI:10.32604/cmc.2023.033370

    Abstract Recently, semantic segmentation has been widely applied to image processing, scene understanding, and many other tasks. In deep learning-based semantic segmentation in particular, the U-Net, with its convolutional encoder-decoder architecture, is a representative model originally proposed for image segmentation in the biomedical field. It uses a max pooling operation to reduce image size and improve noise robustness. However, beyond reducing model complexity, max pooling has the disadvantage of discarding some image information during down-sampling. So, this paper uses the two diagonal elements of each down-sampling window instead. We think that the down-sampling feature maps…
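A minimal sketch of the contrast between max pooling and diagonal down-sampling, under our own reading of the abstract (the authors' exact window layout may differ):

```python
def max_pool_2x2(img):
    """Standard 2x2 max pooling: each window keeps only its maximum."""
    h, w = len(img), len(img[0])
    return [[max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
             for c in range(0, w, 2)] for r in range(0, h, 2)]

def diagonal_downsample(img):
    """Toy sketch of the diagonal down-sampling idea (our interpretation,
    not the authors' code): each 2x2 window contributes its two
    main-diagonal elements to two half-size feature maps, so less
    information is discarded than with max pooling's single maximum."""
    h, w = len(img), len(img[0])
    d1 = [[img[r][c] for c in range(0, w, 2)] for r in range(0, h, 2)]
    d2 = [[img[r + 1][c + 1] for c in range(0, w, 2)] for r in range(0, h, 2)]
    return d1, d2

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
d1, d2 = diagonal_downsample(img)
pooled = max_pool_2x2(img)
```

Max pooling yields one half-size map per input map, while the diagonal scheme yields two, retaining twice as many samples per window.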

  • Open Access


    An Improved Encoder-Decoder CNN with Region-Based Filtering for Vibrant Colorization

    Mrityunjoy Gain1, Md Arifur Rahman1, Rameswar Debnath1, Mrim M. Alnfiai2, Abdullah Sheikh3, Mehedi Masud3, Anupam Kumar Bairagi1,*

    Computer Systems Science and Engineering, Vol.46, No.1, pp. 1059-1077, 2023, DOI:10.32604/csse.2023.034809

    Abstract Colorization is the practice of adding appropriate chromatic values to monochrome photographs or videos. A real-valued luminance image can be mapped to a three-dimensional color image. However, colorization is a severely ill-defined problem that does not have a single solution. In this paper, an encoder-decoder Convolutional Neural Network (CNN) model is used for colorizing gray images, where the encoder is a Densely Connected Convolutional Network (DenseNet) and the decoder is a conventional CNN. The DenseNet extracts image features from gray images, and the conventional CNN outputs the a*b* color channels. Due to the large number of desaturated color components compared to saturated…

  • Open Access


    An End-to-End Transformer-Based Automatic Speech Recognition for Qur’an Reciters

    Mohammed Hadwan1,2,*, Hamzah A. Alsayadi3,4, Salah AL-Hagree5

    CMC-Computers, Materials & Continua, Vol.74, No.2, pp. 3471-3487, 2023, DOI:10.32604/cmc.2023.033457

    Abstract The attention-based encoder-decoder technique, known as the transformer, is used to enhance the performance of end-to-end automatic speech recognition (ASR). This research focuses on applying end-to-end transformer-based ASR models to the Arabic language, as the research community pays little attention to it. The Muslims' Holy Qur'an is written in Arabic diacritized text. In this paper, an end-to-end transformer model for building robust Qur'an verse recognition is proposed. The acoustic model was built as a transformer-based deep learning model using the PyTorch framework. A multi-head attention mechanism is utilized to represent the encoder and decoder in the acoustic…
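The multi-head attention mentioned above can be sketched in plain Python. This toy splits each vector into per-head slices, runs dot-product attention per head, and concatenates the results; it is illustrative only, not the paper's PyTorch model.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def multi_head_attention(query, keys, values, n_heads=2):
    """Toy multi-head attention sketch (our illustration, not the paper's
    model): slice query/key/value vectors into n_heads pieces, run scaled
    dot-product attention per head, and concatenate the head outputs."""
    d = len(query) // n_heads
    out = []
    for h in range(n_heads):
        q = query[h * d:(h + 1) * d]
        ks = [k[h * d:(h + 1) * d] for k in keys]
        vs = [v[h * d:(h + 1) * d] for v in values]
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in ks]
        w = softmax(scores)
        out.extend(sum(wi * v[i] for wi, v in zip(w, vs)) for i in range(d))
    return out

q = [1.0, 0.0, 0.0, 1.0]
ks = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
vs = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
ctx = multi_head_attention(q, ks, vs)
```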

  • Open Access


    A Dual Attention Encoder-Decoder Text Summarization Model

    Nada Ali Hakami1, Hanan Ahmed Hosni Mahmoud2,*

    CMC-Computers, Materials & Continua, Vol.74, No.2, pp. 3697-3710, 2023, DOI:10.32604/cmc.2023.031525

    Abstract A worthy text summary should represent the fundamental content of the document. Recent studies on computerized text summarization have tried to offer solutions to this challenging problem. Attention models are employed extensively in the text summarization process. Classical attention techniques acquire context data in the decoding phase. Nevertheless, without real and efficient feature extraction, the produced summary may diverge from the core topic. In this article, we present an encoder-decoder attention system employing a dual attention mechanism, in which the attention algorithm gathers the main data from the encoder side. In the dual attention model, the system…
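A scalar toy of a dual attention step, under one illustrative reading (gather a context from the encoder states, then attend again with a context-refined query); the paper's exact model may differ, and all names are our own.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(query, keys, values):
    """Plain scaled dot-product attention over encoder states."""
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(len(query))
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def dual_attention(query, enc_states):
    """Toy dual attention sketch (our illustrative reading, not the
    paper's model): a first pass gathers a global context from the
    encoder states, then a second pass attends again using a query
    refined by that context."""
    ctx1 = attend(query, enc_states, enc_states)
    refined = [q + c for q, c in zip(query, ctx1)]
    return attend(refined, enc_states, enc_states)

enc = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
ctx = dual_attention([1.0, 0.0], enc)
```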

Displaying results 1-10 of 23 (page 1).