Open Access



Low Complexity Encoder with Multilabel Classification and Image Captioning Model

Mahmoud Ragab1,2,3,*, Abdullah Addas4

1 Information Technology Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
2 Centre of Artificial Intelligence for Precision Medicines, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
3 Mathematics Department, Faculty of Science, Al-Azhar University, Naser City, 11884, Cairo, Egypt
4 Landscape Architecture Department, Faculty of Architecture & Planning, King Abdulaziz University, Jeddah, 21589, Saudi Arabia

* Corresponding Author: Mahmoud Ragab. Email: email

Computers, Materials & Continua 2022, 72(3), 4323-4337.


The rapid growth of multimedia-on-demand traffic in the form of audio, video, and images has shifted the vision of the Internet of Things (IoT) from scalar data to the Internet of Multimedia Things (IoMT). Because Unmanned Aerial Vehicles (UAVs) generate massive quantities of multimedia data, they have become part of the IoMT and are commonly employed in diverse application areas, especially for capturing remote sensing (RS) images. At the same time, interpreting the captured RS images is a crucial issue, which can be addressed by multi-label classification and computational linguistics based image captioning techniques. To this end, this paper presents an efficient low complexity encoding technique with multi-label classification and image captioning for UAV based RS images. First, the presented model uses a low complexity encoder, called LCE-BWT, that combines the Neighborhood Correlation Sequence (NCS) with the Burrows-Wheeler transform (BWT) to encode the RS images captured by the UAV. Applying NCS greatly reduces the computational complexity and requires fewer resources for image transmission. Second, a deep learning (DL) based shallow convolutional neural network for RS image classification (SCNN-RSIC) technique is presented to determine the multiple class labels of the RS image, which constitutes the novelty of the work. Finally, the computational linguistics based Bidirectional Encoder Representations from Transformers (BERT) technique is applied for image captioning to provide a proficient textual description of the RS image. The performance of the presented technique is tested using the UCM dataset. The simulation results indicate that the presented model achieves effective compression performance, reconstructed image quality, classification results, and image captioning outcomes.
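
For readers unfamiliar with the Burrows-Wheeler transform named in the abstract, a minimal sketch of the standard textbook transform and its inverse may help. This is not the authors' LCE-BWT encoder: the NCS stage is not specified in the abstract and is omitted here, and the function names `bwt` and `inverse_bwt` are hypothetical. The sketch assumes the image has already been flattened to a byte string.

```python
from typing import Tuple


def bwt(data: bytes) -> Tuple[bytes, int]:
    """Naive Burrows-Wheeler transform (illustrative, not the paper's encoder).

    Returns the last column of the sorted rotation matrix and the row index
    at which the original input appears.
    """
    n = len(data)
    if n == 0:
        return b"", 0
    # Sort the start offsets of all cyclic rotations of the input.
    # This is quadratic in memory/time and only suitable for small demos.
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    # The last character of the rotation starting at i is data[i - 1].
    last = bytes(data[(i - 1) % n] for i in order)
    return last, order.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Invert the transform using only the last column and the primary index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by byte value maps each row of the first
    # column to the row whose last character it corresponds to.
    nxt = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = nxt[primary]
    for _ in range(n):
        out.append(last[row])
        row = nxt[row]
    return bytes(out)


if __name__ == "__main__":
    encoded, idx = bwt(b"banana")
    print(encoded, idx)                # b'nnbaaa' 3
    print(inverse_bwt(encoded, idx))   # b'banana'
```

The transform itself does not compress anything; it groups equal symbols together so that a subsequent entropy or run-length coder can work more effectively, which is why it is usually paired with another stage.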


Cite This Article

M. Ragab and A. Addas, "Low complexity encoder with multilabel classification and image captioning model," Computers, Materials & Continua, vol. 72, no. 3, pp. 4323–4337, 2022.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.