Search Results (8)
  • Open Access

    ARTICLE

    Time and Space Efficient Multi-Model Convolution Vision Transformer for Tomato Disease Detection from Leaf Images with Varied Backgrounds

    Ankita Gangwar1, Vijaypal Singh Dhaka1, Geeta Rani2,*, Shrey Khandelwal1, Ester Zumpano3,4, Eugenio Vocaturo3,4

    CMC-Computers, Materials & Continua, Vol.79, No.1, pp. 117-142, 2024, DOI:10.32604/cmc.2024.048119

    Abstract A consumption of 46.9 million tons of processed tomatoes was reported in 2022, which is merely 20% of total consumption. An increase of 3.3% in consumption is predicted from 2024 to 2032. Tomatoes are also rich in iron, potassium, the antioxidant lycopene, and vitamins A, C, and K, which are important for preventing cancer and maintaining blood pressure and glucose levels. Thus, tomatoes are globally important due to their widespread usage and nutritional value. To meet the high demand for tomatoes, it is essential to investigate the causes of crop loss and minimize them. Diseases are one of the major causes…

  • Open Access

    ARTICLE

    TEAM: Transformer Encoder Attention Module for Video Classification

    Hae Sung Park1, Yong Suk Choi2,*

    Computer Systems Science and Engineering, Vol.48, No.2, pp. 451-477, 2024, DOI:10.32604/csse.2023.043245

    Abstract Much like humans focus solely on object movement to understand actions, directing a deep learning model’s attention to the core contexts within videos is crucial for improving video comprehension. In a recent study, Video Masked Auto-Encoder (VideoMAE) employs a pre-training approach with a high ratio of tube masking and reconstruction, effectively mitigating spatial bias due to temporal redundancy in full video frames. This steers the model’s focus toward detailed temporal contexts. However, as VideoMAE still relies on full video frames during the action recognition stage, it may exhibit a progressive shift in attention towards spatial contexts, deteriorating its ability…
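    The high-ratio tube masking credited above with mitigating temporal redundancy can be pictured with a short sketch. This is a minimal illustration, not the paper's code: the 16-frame clip, 14×14 patch grid, and 90% mask ratio are assumptions based on common VideoMAE configurations, and all names are ours.

```python
import numpy as np

def tube_mask(num_frames: int, h_patches: int, w_patches: int,
              mask_ratio: float = 0.9, seed=None) -> np.ndarray:
    """Illustrative tube mask: one random spatial mask shared by all frames.

    Because the same patches are hidden in every frame ("tubes"), the model
    cannot reconstruct a patch by copying it from a neighboring frame, which
    counters the temporal redundancy the abstract mentions.
    """
    rng = np.random.default_rng(seed)
    num_patches = h_patches * w_patches
    num_masked = int(round(mask_ratio * num_patches))
    spatial_mask = np.zeros(num_patches, dtype=bool)
    spatial_mask[rng.choice(num_patches, num_masked, replace=False)] = True
    # Repeat the same spatial mask across the temporal axis -> "tubes".
    return np.tile(spatial_mask, (num_frames, 1))  # shape: (T, H*W)

mask = tube_mask(num_frames=16, h_patches=14, w_patches=14, mask_ratio=0.9)
print(mask.shape, mask.mean())  # (16, 196), ~90% of patches masked
```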

  • Open Access

    ARTICLE

    SwinVid: Enhancing Video Object Detection Using Swin Transformer

    Abdelrahman Maharek1,2,*, Amr Abozeid2,3, Rasha Orban1, Kamal ElDahshan2

    Computer Systems Science and Engineering, Vol.48, No.2, pp. 305-320, 2024, DOI:10.32604/csse.2024.039436

    Abstract What causes object detection in video to be less accurate than it is in still images? Some video frames are degraded in appearance by fast movement, out-of-focus camera shots, and changes in posture. These factors have made video object detection (VID) a growing area of research in recent years. Video object detection can be used for various healthcare applications, such as detecting and tracking tumors in medical imaging, monitoring the movement of patients in hospitals and long-term care facilities, and analyzing videos of surgeries to improve technique and training. Additionally, it can be used in telemedicine to help diagnose…

  • Open Access

    ARTICLE

    Mapping of Land Use and Land Cover (LULC) Using EuroSAT and Transfer Learning

    Suman Kunwar1,*, Jannatul Ferdush2

    Revue Internationale de Géomatique, Vol.33, pp. 1-13, 2024, DOI:10.32604/rig.2023.047627

    Abstract As the global population continues to expand, the demand for natural resources increases. Unfortunately, human activities account for 23% of greenhouse gas emissions. On a positive note, remote sensing technologies have emerged as a valuable tool in managing our environment. These technologies allow us to monitor land use, plan urban areas, and drive advancements in areas such as agriculture, climate change mitigation, disaster recovery, and environmental monitoring. Recent advances in Artificial Intelligence (AI), computer vision, and earth observation data have enabled unprecedented accuracy in land use mapping. By using transfer learning and fine-tuning with red-green-blue (RGB) bands, we achieved an…
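    The transfer-learning-plus-fine-tuning recipe the abstract describes can be sketched as follows. This is a minimal sketch under stated assumptions: a torchvision ResNet-50 pretrained on ImageNet and the RGB variant of EuroSAT with its 10 land-use/land-cover classes; the paper may use different backbones and hyperparameters.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # EuroSAT (RGB) has 10 land-use/land-cover classes

# Start from ImageNet-pretrained weights, then swap in a new classifier head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Fine-tune all weights with a small learning rate; alternatively, freeze
# the backbone first and train only the new head.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One supervised fine-tuning step on a batch of RGB tiles."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```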

  • Open Access

    ARTICLE

    Single Image Desnow Based on Vision Transformer and Conditional Generative Adversarial Network for Internet of Vehicles

    Bingcai Wei, Di Wang, Zhuang Wang, Liye Zhang*

    CMES-Computer Modeling in Engineering & Sciences, Vol.137, No.2, pp. 1975-1988, 2023, DOI:10.32604/cmes.2023.027727

    Abstract With the increasing popularity of artificial intelligence applications, machine learning is also playing an increasingly important role in the Internet of Things (IoT) and the Internet of Vehicles (IoV). As an essential part of the IoV, smart transportation relies heavily on information obtained from images. However, inclement weather, such as snowy weather, negatively impacts this process and can hinder the regular operation of imaging equipment and the acquisition of conventional image information. Moreover, snow can cause intelligent transportation systems to misjudge road conditions, adversely affecting the entire Internet of Vehicles system.…
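    The title pairs a vision-transformer generator with a conditional GAN. The sketch below shows only a generic conditional-GAN training step for image restoration (snowy image as the condition, clean image as the target); the stand-in convolutional generator and discriminator, the pix2pix-style L1 weight, and all names are our assumptions, not the paper's architecture or losses.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Stand-in restorer; the paper's ViT-based generator is not reproduced."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(64, 3, 3, padding=1))
    def forward(self, x):  # snowy image in, desnowed estimate out
        return torch.sigmoid(self.net(x))

class Discriminator(nn.Module):
    """Conditional: judges (snowy input, candidate output) pairs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(6, 64, 4, stride=2, padding=1),
                                 nn.LeakyReLU(0.2),
                                 nn.Conv2d(64, 1, 4, stride=2, padding=1))
    def forward(self, snowy, candidate):
        return self.net(torch.cat([snowy, candidate], dim=1))  # patch logits

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

def train_step(snowy, clean):
    # Discriminator: real pair (snowy, clean) vs fake pair (snowy, G(snowy)).
    fake = G(snowy).detach()
    d_real, d_fake = D(snowy, clean), D(snowy, fake)
    loss_d = (bce(d_real, torch.ones_like(d_real)) +
              bce(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: fool D while staying close to the clean target (L1 term).
    fake = G(snowy)
    d_out = D(snowy, fake)
    loss_g = bce(d_out, torch.ones_like(d_out)) + 100.0 * l1(fake, clean)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```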

  • Open Access

    ARTICLE

    ViT2CMH: Vision Transformer Cross-Modal Hashing for Fine-Grained Vision-Text Retrieval

    Mingyong Li, Qiqi Li, Zheng Jiang, Yan Ma*

    Computer Systems Science and Engineering, Vol.46, No.2, pp. 1401-1414, 2023, DOI:10.32604/csse.2023.034757

    Abstract In recent years, the development of deep learning has further improved hash retrieval technology. Most existing hashing methods currently use Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to process image and text information, respectively. This makes images or texts subject to local constraints, and inherent label matching cannot capture fine-grained information, often leading to suboptimal results. Driven by the development of the transformer model, we propose a framework called ViT2CMH, based mainly on the Vision Transformer rather than CNNs or RNNs, to handle deep cross-modal hashing tasks. Specifically, we use a BERT network to extract text…
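    The pairing the abstract describes, a Vision Transformer for images and a BERT network for text feeding a shared hash space, can be sketched schematically. The projection heads, tanh relaxation, and pretrained checkpoints below are illustrative assumptions, not the paper's exact design or training losses.

```python
import torch
import torch.nn as nn
from transformers import ViTModel, BertModel

class CrossModalHasher(nn.Module):
    """Illustrative: map images (ViT) and texts (BERT) into a shared
    K-bit Hamming space; heads and losses here are placeholders."""
    def __init__(self, hash_bits: int = 64):
        super().__init__()
        self.vit = ViTModel.from_pretrained("google/vit-base-patch16-224")
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.img_head = nn.Linear(self.vit.config.hidden_size, hash_bits)
        self.txt_head = nn.Linear(self.bert.config.hidden_size, hash_bits)

    def image_code(self, pixel_values):
        # [CLS] token embedding as the global image feature.
        feats = self.vit(pixel_values=pixel_values).last_hidden_state[:, 0]
        return torch.tanh(self.img_head(feats))  # relaxed codes in (-1, 1)

    def text_code(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return torch.tanh(self.txt_head(out.last_hidden_state[:, 0]))

# At retrieval time, binarize with sign() and rank by Hamming distance.
```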

  • Open Access

    ARTICLE

    Explainable Anomaly Detection Using Vision Transformer Based SVDD

    Ji-Won Baek1, Kyungyong Chung2,*

    CMC-Computers, Materials & Continua, Vol.74, No.3, pp. 6573-6586, 2023, DOI:10.32604/cmc.2023.035246

    Abstract Explainable AI extracts a variety of patterns from data during the learning process and draws out hidden information through the discovery of semantic relationships. This makes it possible to offer an explainable basis for decision-making on inference results. Through the causality of risk factors that have an ambiguous association in big medical data, it is possible to increase the transparency and reliability of explainable decision-making that helps diagnose disease status. In addition, the technique makes it possible to accurately predict disease risk for anomaly detection. A vision transformer for anomaly detection from image data performs classification through a Multi-Layer Perceptron (MLP). Unfortunately, in MLP, a vector…
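    The SVDD component named in the title can be sketched with a minimal Deep-SVDD-style objective: pull embeddings of normal samples toward a fixed center and score anomalies by their distance from it. The encoder stand-in and all names below are our assumptions; the paper places a vision transformer in that role.

```python
import torch
import torch.nn as nn

class SVDDHead(nn.Module):
    """Deep SVDD-style head: train the encoder so normal samples cluster
    around a fixed center c; anomaly score = squared distance to c."""
    def __init__(self, encoder: nn.Module, embed_dim: int):
        super().__init__()
        self.encoder = encoder
        # The center is fixed, not a trained parameter, which avoids the
        # trivial collapse solution. Deep SVDD typically initializes it to
        # the mean of initial embeddings; random init is used here for brevity.
        self.register_buffer("center", torch.randn(embed_dim))

    def score(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)
        return ((z - self.center) ** 2).sum(dim=1)  # higher = more anomalous

    def loss(self, x_normal: torch.Tensor) -> torch.Tensor:
        return self.score(x_normal).mean()

# Any feature extractor can serve as the encoder; a simple one is used here.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 128))
model = SVDDHead(encoder, embed_dim=128)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
```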

  • Open Access

    ARTICLE

    Efficient Image Captioning Based on Vision Transformer Models

    Samar Elbedwehy1,*, T. Medhat2, Taher Hamza3, Mohammed F. Alrahmawy3

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 1483-1500, 2022, DOI:10.32604/cmc.2022.029313

    Abstract Image captioning is an emerging field in machine learning. It refers to the ability to automatically generate a syntactically and semantically meaningful sentence that describes the content of an image. Image captioning requires a complex machine learning process, as it involves two sub-models: a vision sub-model for extracting object features and a language sub-model that uses the extracted features to generate meaningful captions. Attention-based vision transformer models have recently had a great impact on the vision field. In this paper, we studied the effect of using vision transformers on the image captioning process by evaluating the use of four different…
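    The two-sub-model pipeline the abstract describes (a vision encoder feeding a language decoder) can be sketched as follows. The specific transformer variants the paper evaluates are not reproduced here; the patch size, layer counts, and all names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    """Schematic vision-encoder + language-decoder captioner."""
    def __init__(self, vocab_size: int, d_model: int = 512):
        super().__init__()
        # Vision sub-model: a stand-in that emits a sequence of patch features.
        self.patch_embed = nn.Conv2d(3, d_model, kernel_size=16, stride=16)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True), 2)
        # Language sub-model: decodes caption tokens, attending to the patches.
        self.token_embed = nn.Embedding(vocab_size, d_model)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True), 2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, images: torch.Tensor, tokens: torch.Tensor):
        p = self.patch_embed(images)                    # (B, D, H/16, W/16)
        p = p.flatten(2).transpose(1, 2)                # (B, N, D) patch seq
        memory = self.encoder(p)
        tgt = self.token_embed(tokens)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        out = self.decoder(tgt, memory, tgt_mask=mask)  # causal decoding
        return self.lm_head(out)                        # next-token logits

model = CaptionModel(vocab_size=10000)
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 10000, (2, 12)))
```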
