Open Access
ARTICLE
CloudViT: A Lightweight Ground-Based Cloud Image Classification Model with the Ability to Capture Global Features
1 National Key Laboratory of Intelligent Spatial Information, Beijing, 100029, China
2 School of Artificial Intelligence, Neijiang Normal University, Neijiang, 641100, China
3 CMA Cloud-Precipitation Physics and Weather Modification Key Laboratory, Beijing, 100081, China
4 Gansu Weather Modification Office, Lanzhou, 730020, China
5 School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China
6 Department of Epidemiology and Biostatistics, School of Public Health, University at Albany, State University of New York, New York, NY 12144, USA
* Corresponding Authors: Dequan Li; Jinrong Hu
Computers, Materials & Continua 2025, 83(3), 5729-5746. https://doi.org/10.32604/cmc.2025.061402
Received 23 November 2024; Accepted 20 March 2025; Issue published 19 May 2025
Abstract
Accurate cloud classification plays a crucial role in aviation safety, climate monitoring, and localized weather forecasting. Current research focuses on machine learning techniques, particularly deep learning-based models, for cloud type identification. However, traditional approaches such as convolutional neural networks (CNNs) have difficulty capturing global contextual information. In addition, they are computationally expensive, which restricts their usability in resource-limited environments. To tackle these issues, we present the Cloud Vision Transformer (CloudViT), a lightweight model that integrates CNNs with Transformers. This integration enables an effective balance between local and global feature extraction. Specifically, CloudViT comprises two innovative modules: a Feature Extraction module (E_Module) and a Downsampling module (D_Module). These modules significantly reduce the number of model parameters and the computational complexity while maintaining translation invariance and enhancing contextual comprehension. Overall, CloudViT has 0.93 × 10^6 parameters, more than ten times fewer than the state-of-the-art (SOTA) model CloudNet. Comprehensive evaluations on the HBMCD and SWIMCAT datasets show that CloudViT achieves classification accuracies of 98.45% and 100%, respectively. Moreover, the efficiency and scalability of CloudViT make it a strong candidate for deployment in mobile cloud observation systems, enabling real-time cloud image classification. The proposed hybrid architecture offers a promising approach for advancing ground-based cloud image classification, with significant potential both for optimizing performance and for facilitating practical deployment.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.