A Multi-Modal Deep Learning Approach for Emotion Recognition

H. Shahzad; Sohail Bhatti; Arfan Jaffar; Muhammad Rashid

doi:10.32604/iasc.2023.032525

Open Access

ARTICLE

H. M. Shahzad^1,3, Sohail Masood Bhatti^1,3,*, Arfan Jaffar^1,3, Muhammad Rashid^2

1 The Superior University, Lahore, Pakistan
2 National University of Technology, Islamabad, Pakistan
3 Intelligent Data Visual Computing Research (IDVCR), Lahore, Pakistan

* Corresponding Author: Sohail Masood Bhatti. Email: email

Intelligent Automation & Soft Computing 2023, 36(2), 1561-1570. https://doi.org/10.32604/iasc.2023.032525

Received 20 May 2022; Accepted 24 June 2022; Issue published 05 January 2023

Abstract

In recent years, facial expression recognition (FER) under masks has become an active research topic. Wearing a mask for protection against COVID-19 has become compulsory, and because the mask hides facial expressions, FER on masked faces is a difficult task. Prevailing unimodal techniques for facial expression recognition do not perform well on masked faces; a multimodal technique, however, can produce better results. We propose a multimodal deep learning methodology for expression recognition on masked faces that uses both facial and vocal expressions. The multimodal network is trained on facial and vocal data drawn from standard datasets: M-LFW for masked faces, and CREMA-D and TESS for vocal expressions. The vocal expressions are audio while the face data are images, so the data are heterogeneous. To make the data homogeneous, the voice data are converted into images by computing spectrograms. A spectrogram captures important features of the voice and represents the audio as an image. The resulting dataset is then passed to the multimodal neural network for training. The experimental results demonstrate that the proposed multimodal algorithm outperforms unimodal methods and other state-of-the-art deep neural network models.
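The audio-to-image step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the spectrogram settings, so everything specific here is an assumption made for illustration: the function name `spectrogram_image`, the frame length, the hop size, the Hann window, the 80 dB dynamic range, and the synthetic 1 kHz tone that stands in for a CREMA-D or TESS recording. It is not the authors' pipeline.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128, dynamic_range_db=80.0):
    """Turn a 1-D audio signal into an 8-bit log-magnitude spectrogram.

    Returns a uint8 array of shape (frequency_bins, time_frames).
    All parameter defaults are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Slice the signal into overlapping frames and apply a Hann window.
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Magnitude of the real FFT of each frame: (n_frames, frame_len // 2 + 1).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))

    # Convert to decibels and clip everything far below the peak.
    log_mag = 20.0 * np.log10(magnitude + 1e-10)
    log_mag = np.maximum(log_mag, log_mag.max() - dynamic_range_db)

    # Rescale to 0..255 so the result can be stored as a grayscale image.
    span = log_mag.max() - log_mag.min()
    scaled = (log_mag - log_mag.min()) / (span + 1e-12)
    return np.round(scaled * 255).astype(np.uint8).T


# Synthetic stand-in for a voice clip: one second of a 1 kHz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
image = spectrogram_image(tone)
```

The returned array has one row per frequency bin and one column per time frame, so it can be resized and fed to an image network in the same way as a face image, which is the homogenization the abstract refers to.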

Keywords

Deep learning; facial expression recognition; multi-model neural network; speech emotion recognition; spectrogram; covid-19

Cite This Article

APA Style

Shahzad, H.M., Bhatti, S.M., Jaffar, A., Rashid, M. (2023). A multi-modal deep learning approach for emotion recognition. Intelligent Automation & Soft Computing, 36(2), 1561-1570. https://doi.org/10.32604/iasc.2023.032525

Vancouver Style

Shahzad HM, Bhatti SM, Jaffar A, Rashid M. A multi-modal deep learning approach for emotion recognition. Intell Automat Soft Comput. 2023;36(2):1561-1570. https://doi.org/10.32604/iasc.2023.032525

IEEE Style

H.M. Shahzad, S.M. Bhatti, A. Jaffar, and M. Rashid, "A Multi-Modal Deep Learning Approach for Emotion Recognition," Intell. Automat. Soft Comput., vol. 36, no. 2, pp. 1561-1570, 2023. https://doi.org/10.32604/iasc.2023.032525

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
