Search Results (284)
  • Open Access


    An improved CRNN for Vietnamese Identity Card Information Recognition

    Trinh Tan Dat1, Le Tran Anh Dang1,2, Nguyen Nhat Truong1,2, Pham Cung Le Thien Vu1, Vu Ngoc Thanh Sang1, Pham Thi Vuong1, Pham The Bao1,*

    Computer Systems Science and Engineering, Vol.40, No.2, pp. 539-555, 2022, DOI:10.32604/csse.2022.019064

    Abstract This paper proposes an enhancement of an automatic text recognition system for extracting information from the front side of the Vietnamese citizen identity (CID) card. First, we apply Mask-RCNN to segment and align the CID card from the background. Next, we present two approaches to detect the CID card’s text lines using traditional image processing techniques compared to the EAST detector. Finally, we introduce a new end-to-end Convolutional Recurrent Neural Network (CRNN) model based on a combination of Connectionist Temporal Classification (CTC) and an attention mechanism for Vietnamese text recognition by jointly training the CTC and attention objective functions. The…

  • Open Access


    A Position-Aware Transformer for Image Captioning

    Zelin Deng1,*, Bo Zhou1, Pei He2, Jianfeng Huang3, Osama Alfarraj4, Amr Tolba4,5

    CMC-Computers, Materials & Continua, Vol.70, No.1, pp. 2065-2081, 2022, DOI:10.32604/cmc.2022.019328

    Abstract Image captioning aims to generate a corresponding description of an image. In recent years, neural encoder-decoder models have been the dominant approaches, in which the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) are used to translate an image into a natural language description. Among these approaches, visual attention mechanisms are widely used to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. However, most conventional visual attention mechanisms are based on high-level image features, ignoring the effects of other image features, and giving insufficient consideration to the relative positions between image features.…

  • Open Access


    Attention-Based and Time Series Models for Short-Term Forecasting of COVID-19 Spread

    Jurgita Markevičiūtė1,*, Jolita Bernatavičienė2, Rūta Levulienė1, Viktor Medvedev2, Povilas Treigys2, Julius Venskus2

    CMC-Computers, Materials & Continua, Vol.70, No.1, pp. 695-714, 2022, DOI:10.32604/cmc.2022.018735

    Abstract The growing number of COVID-19 cases puts pressure on healthcare services and public institutions worldwide. The pandemic has brought much uncertainty to the global economy and the situation in general. Forecasting methods and modeling techniques are important tools for governments to manage critical situations caused by pandemics, which have a negative impact on public health. The main purpose of this study is to obtain short-term forecasts of disease epidemiology that could be useful for policymakers and public institutions to make necessary short-term decisions. To evaluate the effectiveness of the proposed attention-based method combining certain data mining algorithms and the classical ARIMA…

  • Open Access


    A Multi-Feature Learning Model with Enhanced Local Attention for Vehicle Re-Identification

    Wei Sun1,2,*, Xuan Chen3, Xiaorui Zhang1,3, Guangzhao Dai2, Pengshuai Chang2, Xiaozheng He4

    CMC-Computers, Materials & Continua, Vol.69, No.3, pp. 3549-3561, 2021, DOI:10.32604/cmc.2021.021627

    Abstract Vehicle re-identification (ReID) aims to retrieve the target vehicle in an extensive image gallery through its appearances from various views in the cross-camera scenario. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt the joint learning of global and local features. However, they directly use the extracted global features, resulting in insufficient feature expression. Moreover, local features are primarily obtained through advanced annotation and complex attention mechanisms, which require additional costs. To solve these issues, a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA) is proposed in this paper.…

  • Open Access


    Visual Saliency Prediction Using Attention-based Cross-modal Integration Network in RGB-D Images

    Xinyue Zhang1, Ting Jin1,*, Mingjie Han1, Jingsheng Lei2, Zhichao Cao3

    Intelligent Automation & Soft Computing, Vol.30, No.2, pp. 439-452, 2021, DOI:10.32604/iasc.2021.018643

    Abstract Saliency prediction has recently gained a great deal of attention owing to the rapid development of deep neural networks in computer vision tasks. However, there are still dilemmas that need to be addressed. In this paper, we design a visual saliency prediction model using attention-based cross-modal integration strategies in RGB-D images. Unlike other symmetric feature extraction networks, we exploit asymmetric networks to effectively extract depth features as the complementary information of RGB information. Then we propose attention modules to integrate cross-modal feature information and emphasize the feature representation of salient regions, while neglecting the surrounding unimportant pixels, so…

  • Open Access


    Person Re-Identification Based on Joint Loss and Multiple Attention Mechanism

    Yong Li, Xipeng Wang*

    Intelligent Automation & Soft Computing, Vol.30, No.2, pp. 563-573, 2021, DOI:10.32604/iasc.2021.017926

    Abstract Person re-identification (ReID) is the use of computer vision and machine learning techniques to determine whether the pedestrians in two images from different cameras are the same person. It can also be regarded as a matching retrieval task for person targets in different scenes. The research focuses on how to obtain effective person features from images with occlusion, angle change, and target attitude change. Based on the present difficulties and challenges in ReID, the paper proposes a ReID method based on joint loss and a multi-attention network. It improves the person re-identification algorithm based on global characteristics, introduces spatial attention…

  • Open Access


    AttEF: Convolutional LSTM Encoder-Forecaster with Attention Module for Precipitation Nowcasting

    Wei Fang1,2,*, Lin Pang1, Weinan Yi1, Victor S. Sheng3

    Intelligent Automation & Soft Computing, Vol.30, No.2, pp. 453-466, 2021, DOI:10.32604/iasc.2021.016589

    Abstract Precipitation nowcasting has become an essential technology underlying various public services ranging from weather advisories to citywide rainfall alerts. The main challenge facing many algorithms is the high non-linearity and temporal-spatial complexity of the radar image. Convolutional Long Short-Term Memory (ConvLSTM) is appropriate for modeling spatiotemporal variations as it integrates the convolution operator into recurrent state transition functions. However, the technical characteristic of encoding the input sequence into a fixed-size vector cannot guarantee that ConvLSTM maintains adequate sequence representations in the information flow, which affects the performance of the task. In this paper, we propose Attention ConvLSTM Encoder-Forecaster (AttEF) which allows…

  • Open Access


    Global and Graph Encoded Local Discriminative Region Representation for Scene Recognition

    Chaowei Lin1,#, Feifei Lee1,#,*, Jiawei Cai1, Hanqing Chen1, Qiu Chen2,*

    CMES-Computer Modeling in Engineering & Sciences, Vol.128, No.3, pp. 985-1006, 2021, DOI:10.32604/cmes.2021.014522

    Abstract Scene recognition is a fundamental task in computer vision, which generally includes three vital stages, namely feature extraction, feature transformation and classification. Early research mainly focuses on feature extraction, but with the rise of Convolutional Neural Networks (CNNs), more and more feature transformation methods are proposed based on CNN features. In this work, a novel feature transformation algorithm called Graph Encoded Local Discriminative Region Representation (GEDRR) is proposed to find discriminative local representations for scene images and explore the relationship between the discriminative regions. In addition, we propose a method using the multi-head attention module to enhance and fuse convolutional…

  • Open Access


    Adaptive Multi-Scale HyperNet with Bi-Direction Residual Attention Module for Scene Text Detection

    Junjie Qu, Jin Liu*, Chao Yu

    Journal of Information Hiding and Privacy Protection, Vol.3, No.2, pp. 83-89, 2021, DOI:10.32604/jihpp.2021.017181

    Abstract Scene text detection is an important step in a scene text reading system. Existing text detection methods still have two problems: (1) the small receptive field of the convolutional layer is not sufficiently sensitive to the target area in the image; (2) the deep receptive field of the convolutional layer loses a lot of spatial feature information. Therefore, detecting scene text remains a challenging issue. In this work, we design an effective text detector named Adaptive Multi-Scale HyperNet (AMSHN) to improve text detection performance. Specifically, AMSHN enhances the sensitivity of target semantics in…

  • Open Access


    A Knowledge-Enhanced Dialogue Model Based on Multi-Hop Information with Graph Attention

    Zhongqin Bi1, Shiyang Wang1, Yan Chen2,*, Yongbin Li1, Jung Yoon Kim3,*

    CMES-Computer Modeling in Engineering & Sciences, Vol.128, No.2, pp. 403-426, 2021, DOI:10.32604/cmes.2021.016729

    Abstract With the continuous improvement of the e-commerce ecosystem and the rapid growth of e-commerce data, consumers ask hundreds of millions of questions every day. In order to improve the timeliness of customer service responses, many systems have begun to use customer service robots to respond to consumer questions, but the current customer service robots tend to respond to specific questions. For many questions that lack background knowledge, they can generate only responses that are biased towards generality and repetitiveness. To better promote the understanding of dialogue and generate more meaningful responses, this paper…

Displaying results 231-240 of 284 (page 24).