Search Results (164)
  • Open Access

    ARTICLE

    Machine-Learning Based Packet Switching Method for Providing Stable High-Quality Video Streaming in Multi-Stream Transmission

    Yumin Jo1, Jongho Paik2,*

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 4153-4176, 2024, DOI:10.32604/cmc.2024.047046

    Abstract Broadcasting gateway equipment generally switches to a spare input stream only when a failure occurs in the main input stream. However, when the transmission environment is unstable, frequent switching can shorten equipment lifespan and cause service interruption, delay, and stoppage. Therefore, we apply a machine learning (ML) method that automatically judges and classifies network-related service anomalies and switches among multiple input signals without dropping or altering them, by predicting or quickly determining the time of error occurrence for smooth stream switching when there are problems such as…

  • Open Access

    ARTICLE

    A Hybrid Machine Learning Approach for Improvised QoE in Video Services over 5G Wireless Networks

    K. B. Ajeyprasaath, P. Vetrivelan*

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3195-3213, 2024, DOI:10.32604/cmc.2023.046911

    Abstract Video streaming applications have grown considerably in recent years, making them one of the most significant contributors to global internet traffic. According to recent studies, the telecommunications industry loses millions of dollars due to poor video Quality of Experience (QoE) for users. Among the standard proposals for assessing the quality of video streaming over internet service providers (ISPs) is the Mean Opinion Score (MOS). However, determining QoE accurately via MOS is subjective and laborious, and it varies from user to user. A fully automated data analytics framework is required to reduce the inter-operator variability characteristic…

  • Open Access

    ARTICLE

    Video Summarization Approach Based on Binary Robust Invariant Scalable Keypoints and Bisecting K-Means

    Sameh Zarif1,2,*, Eman Morad1, Khalid Amin1, Abdullah Alharbi3, Wail S. Elkilani4, Shouze Tang5

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3565-3583, 2024, DOI:10.32604/cmc.2024.046185

    Abstract Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for users to obtain information from long video series. The process of producing an abstract of an entire video that includes its most representative frames is known as static video summarization. This method enables rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and the bisecting K-means clustering algorithm. The proposed method effectively recognizes relevant frames using BRISK by extracting keypoints and the…

  • Open Access

    REVIEW

    Trends in Event Understanding and Caption Generation/Reconstruction in Dense Video: A Review

    Ekanayake Mudiyanselage Chulabhaya Lankanatha Ekanayake1,2, Abubakar Sulaiman Gezawa3,*, Yunqi Lei1

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 2941-2965, 2024, DOI:10.32604/cmc.2024.046155

    Abstract Video description generates natural language sentences that describe the subject, verb, and objects of the targeted video. Video description has been used to help visually impaired people understand video content, and it also plays an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and event overlapping. Deep learning is reshaping computer vision (CV) and natural language processing (NLP) technologies. There are hundreds of deep learning models, datasets, and evaluations that can address the gaps in current research. This article…

  • Open Access

    ARTICLE

    TEAM: Transformer Encoder Attention Module for Video Classification

    Hae Sung Park1, Yong Suk Choi2,*

    Computer Systems Science and Engineering, Vol.48, No.2, pp. 451-477, 2024, DOI:10.32604/csse.2023.043245

    Abstract Much like humans focus solely on object movement to understand actions, directing a deep learning model’s attention to the core contexts within videos is crucial for improving video comprehension. In a recent study, the Video Masked Auto-Encoder (VideoMAE) employed a pre-training approach with a high ratio of tube masking and reconstruction, effectively mitigating the spatial bias caused by temporal redundancy in full video frames. This steers the model’s focus toward detailed temporal contexts. However, because VideoMAE still relies on full video frames during the action recognition stage, it may exhibit a progressive shift of attention toward spatial contexts, deteriorating its ability…

  • Open Access

    ARTICLE

    SwinVid: Enhancing Video Object Detection Using Swin Transformer

    Abdelrahman Maharek1,2,*, Amr Abozeid2,3, Rasha Orban1, Kamal ElDahshan2

    Computer Systems Science and Engineering, Vol.48, No.2, pp. 305-320, 2024, DOI:10.32604/csse.2024.039436

    Abstract What causes object detection in video to be less accurate than in still images? Some video frames are degraded in appearance by fast movement, out-of-focus camera shots, and changes in posture. These challenges have made video object detection (VID) a growing area of research in recent years. Video object detection can be used in various healthcare applications, such as detecting and tracking tumors in medical imaging, monitoring the movement of patients in hospitals and long-term care facilities, and analyzing videos of surgeries to improve technique and training. Additionally, it can be used in telemedicine to help diagnose…

  • Open Access

    ARTICLE

    Generative Multi-Modal Mutual Enhancement Video Semantic Communications

    Yuanle Chen1, Haobo Wang1, Chunyu Liu1, Linyi Wang2, Jiaxin Liu1, Wei Wu1,*

    CMES-Computer Modeling in Engineering & Sciences, Vol.139, No.3, pp. 2985-3009, 2024, DOI:10.32604/cmes.2023.046837

    Abstract Recently, there have been significant advancements in the study of semantic communication in single-modal scenarios. However, the ability to process information in multi-modal environments remains limited. Inspired by the research and applications of natural language processing across different modalities, our goal is to accurately extract frame-level semantic information from videos and ultimately transmit high-quality videos. Specifically, we propose a deep learning-based Multi-Modal Mutual Enhancement Video Semantic Communication system, called M3E-VSC. Built upon a Vector Quantized Generative Adversarial Network (VQGAN), our system aims to leverage mutual enhancement among different modalities by using text as the main carrier of transmission. With it,…

  • Open Access

    ARTICLE

    CVTD: A Robust Car-Mounted Video Text Detector

    Di Zhou1, Jianxun Zhang1,*, Chao Li2, Yifan Guo1, Bowen Li1

    CMC-Computers, Materials & Continua, Vol.78, No.2, pp. 1821-1842, 2024, DOI:10.32604/cmc.2023.047236

    Abstract Text perception is crucial for understanding the semantics of outdoor scenes, making it a key requirement for building intelligent systems for driver assistance or autonomous driving. Text information in car-mounted videos can assist drivers in making decisions. However, car-mounted video text images pose challenges such as complex backgrounds, small fonts, and the need for real-time detection. We propose a robust Car-mounted Video Text Detector (CVTD), a lightweight text detection model that uses ResNet18 for feature extraction and can detect text of arbitrary shape. Our model efficiently extracts global text positions through the Coordinate Attention Threshold Activation (CATA) and…

  • Open Access

    ARTICLE

    A Video Captioning Method by Semantic Topic-Guided Generation

    Ou Ye, Xinli Wei, Zhenhua Yu*, Yan Fu, Ying Yang

    CMC-Computers, Materials & Continua, Vol.78, No.1, pp. 1071-1093, 2024, DOI:10.32604/cmc.2023.046418

    Abstract In video captioning methods based on an encoder-decoder architecture, limited visual features are extracted by an encoder, and a natural-language sentence describing the video content is generated by a decoder. However, such methods depend on a single video input source and few visual labels, and they suffer from poor semantic alignment between video contents and the generated sentences, making them unsuitable for accurately comprehending and describing video content. To address this issue, this paper proposes a video captioning method with semantic topic-guided generation. First, a 3D convolutional neural network is utilized to extract the…

  • Open Access

    ARTICLE

    Improving Video Watermarking through Galois Field GF(24) Multiplication Tables with Diverse Irreducible Polynomials and Adaptive Techniques

    Yasmin Alaa Hassan1,*, Abdul Monem S. Rahma2

    CMC-Computers, Materials & Continua, Vol.78, No.1, pp. 1423-1442, 2024, DOI:10.32604/cmc.2023.046149

    Abstract Video watermarking plays a crucial role in protecting intellectual property rights and ensuring content authenticity. This study delves into the integration of Galois Field (GF) multiplication tables, especially GF(24), and their interaction with distinct irreducible polynomials. The primary aim is to enhance watermarking techniques for achieving imperceptibility, robustness, and efficient execution time. The research employs scene selection and adaptive thresholding techniques to streamline the watermarking process. Scene selection is used strategically to embed watermarks in the most vital frames of the video, while adaptive thresholding methods ensure that the watermarking process adheres to imperceptibility criteria, maintaining the video’s visual quality.…

Displaying results 1-10 of 164 (page 1).