Action Recognition and Multimodal Human Behavior Understanding

Submission Deadline: 31 August 2026

Guest Editors

Prof. Elakkiya Rajasekar

Email: elakkiya@dubai.bits-pilani.ac.in

Affiliation: Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai, United Arab Emirates

Research Interests: computer vision, action recognition, video understanding, multimodal learning, sign language recognition/translation, temporal transformers, self-supervised learning

Summary

Action recognition and human behavior understanding have become central to modern intelligent systems, driven by the widespread availability of video sensors and the emergence of powerful deep learning models. Recent progress in temporal transformers, self-supervised representation learning, large-scale video pretraining, and multimodal fusion has substantially improved recognition of complex activities in unconstrained environments. However, real-world deployment still faces key challenges, including fine-grained action discrimination, cross-domain generalization, robustness under occlusion and viewpoint changes, learning with limited labels, privacy constraints, and efficient inference on edge devices.

This Special Issue aims to collect high-quality research that advances the theory and practice of action recognition and multimodal human behavior understanding. The scope includes novel architectures and learning paradigms for video and temporal modeling, multimodal methods combining video with audio, text, depth, or skeleton streams, and application-driven solutions in smart environments, human–computer interaction, assistive technologies, and security. We particularly encourage submissions addressing generalization, low-resource learning, interpretability, fairness, privacy-preserving pipelines, and compute-efficient deployment.

Suggested themes include (but are not limited to): video transformers and temporal modeling; self-/semi-supervised video learning; multimodal fusion for behavior analysis; skeleton/pose-based action understanding; egocentric and first-person activity recognition; fine-grained and long-term activity modeling; robustness and domain adaptation; privacy-aware action recognition; real-time and edge-efficient inference.

Keywords

action recognition, video understanding, multimodal learning, temporal modeling, video transformers, self-supervised learning, skeleton-based recognition, egocentric vision, human behavior analysis, efficient / edge inference
