Search Results (52)
  • Open Access

    ARTICLE

    Robust Human Pose Estimation and Action Recognition Utilizing Feature Extraction

    Sheng Luo1, Rashid Abbasi1,*, Hao Wang2, Jinghua Xu3, Dongyang Lyu4, Aaron Zhang1, Farhan Amin5,*, Isabel de la Torre6, Gerardo Mendez Mezquita7, Henry Fabian Gongora7

    CMES-Computer Modeling in Engineering & Sciences, Vol.146, No.3, 2026, DOI:10.32604/cmes.2026.075080 - 30 March 2026

    Abstract Human pose estimation is crucial across diverse applications, from healthcare to human–computer interaction. Integrating inertial measurement units (IMUs) with monocular vision methods holds great potential for leveraging complementary modalities; however, existing approaches are often limited by IMU drift, noise, and underutilization of visual information. To address these limitations, we propose a novel dual-stream feature extraction framework that effectively combines temporal IMU data and single-view image features for improved pose estimation. Short-term dependencies in IMU sequences are captured with convolutional layers, while a Transformer-based architecture models long-range temporal dynamics. To mitigate IMU drift and inter-sensor inconsistencies, More >

  • Open Access

    ARTICLE

    YOLO-Drive: Robust Driver Distraction Recognition under Fine-Grained and Overlapping Behaviors

    Zhichao Yu1, Jiahui Yu1, Simon James Fong1,*, Yaoyang Wu1,2

    CMC-Computers, Materials & Continua, Vol.87, No.2, 2026, DOI:10.32604/cmc.2025.074899 - 12 March 2026

    Abstract Accurately recognizing driver distraction is critical for preventing traffic accidents, yet current detection models face two persistent challenges. First, distractions are often fine-grained, involving subtle cues such as brief eye closures or partial yawns, which are easily missed by conventional detectors. Second, in real-world scenarios, drivers frequently exhibit overlapping behaviors, such as simultaneously holding a cup, closing their eyes, and yawning, leading to multiple detection boxes and degraded model performance. Existing approaches fail to robustly address these complexities, resulting in limited reliability in safety-critical applications. To overcome these limitations, we propose YOLO-Drive, a… More >

  • Open Access

    ARTICLE

    Intelligent Human Interaction Recognition with Multi-Modal Feature Extraction and Bidirectional LSTM

    Muhammad Hamdan Azhar1,2,#, Yanfeng Wu1,#, Nouf Abdullah Almujally3, Shuaa S. Alharbi4, Asaad Algarni5, Ahmad Jalal2,6, Hui Liu1,7,8,*

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.071988 - 10 February 2026

    Abstract Recognizing human interactions in RGB videos is a critical task in computer vision, with applications in video surveillance. Existing deep learning-based architectures have achieved strong results but are computationally intensive, sensitive to changes in video resolution, and often fail in crowded scenes. We propose a novel hybrid system that is computationally efficient, robust to degraded video quality, and able to filter out irrelevant individuals, making it suitable for real-life use. The system leverages multi-modal handcrafted features for interaction representation and a deep learning classifier for capturing complex dependencies. Using Mask R-CNN and YOLO11-Pose, we extract grayscale… More >

  • Open Access

    ARTICLE

    Action Recognition via Shallow CNNs on Intelligently Selected Motion Data

    Jalees Ur Rahman1, Muhammad Hanif1, Usman Haider2,*, Saeed Mian Qaisar3,*, Sarra Ayouni4

    CMC-Computers, Materials & Continua, Vol.86, No.3, 2026, DOI:10.32604/cmc.2025.071251 - 12 January 2026

    Abstract Deep neural networks have achieved excellent classification results on several computer vision benchmarks. This has led to the popularity of machine learning as a service, where trained algorithms are hosted on the cloud and inference can be obtained on real-world data. In most applications, it is important to compress the vision data due to the enormous bandwidth and memory requirements. Video codecs exploit spatial and temporal correlations to achieve high compression ratios, but they are computationally expensive. This work computes the motion fields between consecutive frames to facilitate the efficient classification of videos. However, contrary… More >

  • Open Access

    ARTICLE

    Enhancing Classroom Behavior Recognition with Lightweight Multi-Scale Feature Fusion

    Chuanchuan Wang1,2, Ahmad Sufril Azlan Mohamed2,*, Xiao Yang2, Hao Zhang2, Xiang Li1, Mohd Halim Bin Mohd Noor2

    CMC-Computers, Materials & Continua, Vol.85, No.1, pp. 855-874, 2025, DOI:10.32604/cmc.2025.066343 - 29 August 2025

    Abstract Classroom behavior recognition is a hot research topic that plays a vital role in assessing and improving the quality of classroom teaching. However, existing classroom behavior recognition methods struggle to achieve high recognition accuracy on datasets with problems such as blurred scenes and inconsistent objects. To address this challenge, we propose an effective, lightweight object detector called the RFNet model (YOLO-FR). Specifically, for efficient multi-scale feature extraction, a feature pyramid shared convolution (FPSC) module was designed to improve feature extraction performance by leveraging convolutional… More >

  • Open Access

    ARTICLE

    A YOLOv11-Based Deep Learning Framework for Multi-Class Human Action Recognition

    Nayeemul Islam Nayeem1, Shirin Mahbuba1, Sanjida Islam Disha1, Md Rifat Hossain Buiyan1, Shakila Rahman1,*, M. Abdullah-Al-Wadud2, Jia Uddin3,*

    CMC-Computers, Materials & Continua, Vol.85, No.1, pp. 1541-1557, 2025, DOI:10.32604/cmc.2025.065061 - 29 August 2025

    Abstract Human activity recognition is a significant area of research in artificial intelligence for surveillance, healthcare, sports, and human-computer interaction applications. The article benchmarks the performance of a You Only Look Once version 11-based (YOLOv11-based) architecture for multi-class human activity recognition. The dataset consists of 14,186 images across 19 activity classes, from dynamic activities such as running and swimming to static activities such as sitting and sleeping. Preprocessing included resizing all images to 512 × 512 pixels, annotating them… More >

  • Open Access

    ARTICLE

    A Novel Attention-Based Parallel Blocks Deep Architecture for Human Action Recognition

    Yasir Khan Jadoon1, Yasir Noman Khalid1, Muhammad Attique Khan2, Jungpil Shin3,*, Fatimah Alhayan4, Hee-Chan Cho5, Byoungchol Chang6,*

    CMES-Computer Modeling in Engineering & Sciences, Vol.144, No.1, pp. 1143-1164, 2025, DOI:10.32604/cmes.2025.066984 - 31 July 2025

    Abstract Real-time surveillance depends on recognizing the variety of actions performed by humans. Human Action Recognition (HAR) is a technique that recognizes human actions from a video stream. The wide variation in human actions makes them difficult to recognize with considerable accuracy. This paper presents a novel deep neural network architecture called Attention RB-Net for HAR using video frames. The proposed deep architecture is based on the unique structuring of residual blocks with several filter sizes. Features are extracted from each… More >

  • Open Access

    ARTICLE

    ARNet: Integrating Spatial and Temporal Deep Learning for Robust Action Recognition in Videos

    Hussain Dawood1, Marriam Nawaz2, Tahira Nazir3, Ali Javed2, Abdul Khader Jilani Saudagar4,*, Hatoon S. AlSagri4

    CMES-Computer Modeling in Engineering & Sciences, Vol.144, No.1, pp. 429-459, 2025, DOI:10.32604/cmes.2025.066415 - 31 July 2025

    Abstract Reliable human action recognition (HAR) in video sequences is critical for a wide range of applications, such as security surveillance, healthcare monitoring, and human-computer interaction. Several automated systems have been designed for this purpose; however, existing methods, such as two-stream networks or 3D convolutional neural networks (CNNs), often struggle to effectively integrate spatial and temporal information from input samples, which limits their accuracy in discriminating numerous human actions. Therefore, this study introduces a novel deep-learning framework called ARNet, designed for robust HAR. ARNet consists of two main modules, namely, a refined InceptionResNet-V2-based CNN and… More >

  • Open Access

    ARTICLE

    Prediction of Assembly Intent for Human-Robot Collaboration Based on Video Analytics and Hidden Markov Model

    Jing Qu1, Yanmei Li1,2, Changrong Liu1, Wen Wang1, Weiping Fu1,3,*

    CMC-Computers, Materials & Continua, Vol.84, No.2, pp. 3787-3810, 2025, DOI:10.32604/cmc.2025.065895 - 03 July 2025

    Abstract Despite the gradual transformation of traditional manufacturing by the Human-Robot Collaboration Assembly (HRCA), challenges remain in the robot’s ability to understand and predict human assembly intentions. This study aims to enhance the robot’s comprehension and prediction capabilities of operator assembly intentions by capturing and analyzing operator behavior and movements. We propose a video feature extraction method based on the Temporal Shift Module Network (TSM-ResNet50) to extract spatiotemporal features from assembly videos and differentiate various assembly actions using feature differences between video frames. Furthermore, we construct an action recognition and segmentation model based on the Refined-Multi-Scale… More >

  • Open Access

    ARTICLE

    Video Action Recognition Method Based on Personalized Federated Learning and Spatiotemporal Features

    Rongsen Wu1, Jie Xu1, Yuhang Zhang1, Changming Zhao2,*, Yiweng Xie3, Zelei Wu1, Yunji Li2, Jinhong Guo4, Shiyang Tang5,6

    CMC-Computers, Materials & Continua, Vol.83, No.3, pp. 4961-4978, 2025, DOI:10.32604/cmc.2025.061396 - 19 May 2025

    Abstract With the rapid development of artificial intelligence and Internet of Things technologies, video action recognition technology is widely applied in various scenarios, such as personal life and industrial production. However, while enjoying the convenience brought by this technology, it is crucial to effectively protect the privacy of users’ video data. Therefore, this paper proposes a video action recognition method based on personalized federated learning and spatiotemporal features. Under the framework of federated learning, a video action recognition method leveraging spatiotemporal features is designed. For the local spatiotemporal features of the video, a new differential information… More >

Displaying results 1-10 of 52 on page 1.