Open Access
ARTICLE
Enhancing User Experience in AI-Powered Human-Computer Communication with Vocal Emotions Identification Using a Novel Deep Learning Method
1 Department of Computer Engineering, College of Computer and Information Sciences, Majmaah University, Al-Majmaah, 11952, Saudi Arabia
2 Department of Information Technology, College of Computer and Information Sciences, Majmaah University, Al-Majmaah, 11952, Saudi Arabia
3 Department of Computer Science, College of Engineering and Information Technology, Onaizah Colleges, Qassim, 51911, Saudi Arabia
* Corresponding Author: Arshiya Sajid Ansari. Email:
(This article belongs to the Special Issue: Artificial Intelligence Algorithms and Applications)
Computers, Materials & Continua 2025, 82(2), 2909-2929. https://doi.org/10.32604/cmc.2024.059382
Received 06 October 2024; Accepted 06 December 2024; Issue published 17 February 2025
Abstract
Voice, motion, and mimicry are naturalistic control modalities that have replaced text- or display-driven control in human-computer communication (HCC). The voice in particular carries a great deal of information, revealing details about the speaker’s goals and desires as well as their internal state. Certain vocal characteristics reveal the speaker’s mood, intention, and motivation, while word analysis helps the speaker’s demands be understood. Vocal emotion recognition has therefore become an essential component of modern HCC systems. Integrating findings from the various disciplines involved in identifying vocal emotions remains challenging. Many sound analysis techniques were developed in the past; with the advancement of artificial intelligence (AI), and especially deep learning (DL) technology, research incorporating real data has become increasingly common. Thus, this research presents a novel selfish herd optimization-tuned long short-term memory (SHO-LSTM) strategy to identify vocal emotions in human communication. The proposed SHO-LSTM technique is trained on the public RAVDESS dataset. Wiener filter (WF) and Mel-frequency cepstral coefficient (MFCC) techniques are used to remove noise from and extract features from the data, respectively. LSTM and SHO are applied to the extracted data, with SHO optimizing the LSTM network’s parameters for effective emotion recognition. The proposed framework was implemented and tested in Python. In the evaluation phase, numerous metrics are used to assess the proposed model’s detection capability, such as F1-score (95%), precision (95%), recall (96%), and accuracy (97%). The SHO-LSTM’s outcomes are contrasted with those of previously conducted research; based on these comparative assessments, the proposed approach outperforms current approaches in vocal emotion recognition.
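As an illustration of the pipeline the abstract outlines, the following is a minimal Python sketch: it Wiener-filters a waveform, extracts MFCC features, and defines an LSTM classifier. The file paths, frame length, layer sizes, and learning rate are illustrative assumptions, and the paper's SHO tuning step is represented here only by the hyperparameters it would search over; this is a sketch, not the authors' implementation.

```python
# Minimal sketch of the WF -> MFCC -> LSTM pipeline described in the abstract.
# Assumptions: 16 kHz audio, 40 MFCCs, 200 frames, and fixed hyperparameters
# standing in for the values SHO would tune.
import numpy as np
import librosa
from scipy.signal import wiener
import tensorflow as tf

N_MFCC = 40       # MFCC coefficients per frame (assumed)
N_CLASSES = 8     # RAVDESS defines 8 emotion categories
MAX_FRAMES = 200  # pad/truncate length for batching (assumed)

def extract_features(path, sr=16000):
    """Wiener-filter the waveform for noise removal, then compute MFCCs."""
    y, _ = librosa.load(path, sr=sr)
    y = wiener(y).astype(np.float32)                         # noise suppression
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC).T  # (T, N_MFCC)
    if mfcc.shape[0] < MAX_FRAMES:                           # pad short clips
        mfcc = np.pad(mfcc, ((0, MAX_FRAMES - mfcc.shape[0]), (0, 0)))
    return mfcc[:MAX_FRAMES]

def build_lstm(units=128, lr=1e-3):
    """LSTM classifier; in the paper, `units` and `lr` would be tuned by SHO."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(MAX_FRAMES, N_MFCC)),
        tf.keras.layers.LSTM(units),
        tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

After training such a model, `sklearn.metrics.classification_report` can compute the per-class precision, recall, and F1-score figures of the kind the abstract reports.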
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.