Open Access iconOpen Access

ARTICLE

crossmark

Text Extraction with Optimal Bi-LSTM

Bahera H. Nayef1,*, Siti Norul Huda Sheikh Abdullah2, Rossilawati Sulaiman2, Ashwaq Mukred Saeed3

1 Computer Techniques Engineering Department, Ibn Khaldun University College, Baghdad, 10011, Iraq
2 Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, 43600, Malaysia
3 School of Electrical Engineering and Artificial Intelligence, Xiamen University Malaysia, Sepang, 43900, Malaysia

* Corresponding Authors: Bahera H. Nayef. Email: email,email

Computers, Materials & Continua 2023, 76(3), 3549-3567. https://doi.org/10.32604/cmc.2023.039528

Abstract

Text extraction from images using the traditional techniques of image collecting, and pattern recognition using machine learning consume time due to the amount of extracted features from the images. Deep Neural Networks introduce effective solutions to extract text features from images using a few techniques and the ability to train large datasets of images with significant results. This study proposes using Dual Maxpooling and concatenating convolution Neural Networks (CNN) layers with the activation functions Relu and the Optimized Leaky Relu (OLRelu). The proposed method works by dividing the word image into slices that contain characters. Then pass them to deep learning layers to extract feature maps and reform the predicted words. Bidirectional Short Memory (BiLSTM) layers extract more compelling features and link the time sequence from forward and backward directions during the training phase. The Connectionist Temporal Classification (CTC) function calcifies the training and validation loss rates. In addition to decoding the extracted feature to reform characters again and linking them according to their time sequence. The proposed model performance is evaluated using training and validation loss errors on the Mjsynth and Integrated Argument Mining Tasks (IAM) datasets. The result of IAM was 2.09% for the average loss errors with the proposed dual Maxpooling and OLRelu. In the Mjsynth dataset, the best validation loss rate shrunk to 2.2% by applying concatenating CNN layers, and Relu.

Keywords


Cite This Article

B. H. Nayef, S. N. Huda Sheikh Abdullah, R. Sulaiman and A. M. Saeed, "Text extraction with optimal bi-lstm," Computers, Materials & Continua, vol. 76, no.3, pp. 3549–3567, 2023.



cc This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • 224

    View

  • 132

    Download

  • 0

    Like

Share Link