Open Access



Dynamic Audio-Visual Biometric Fusion for Person Recognition

Najlaa Hindi Alsaedi*, Emad Sami Jaha

Department of Computer Science, Faculty of Computer Science and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia

* Corresponding Author: Najlaa Hindi Alsaedi.

Computers, Materials & Continua 2022, 71(1), 1283-1311.


Biometric recognition is the process of recognizing a person’s identity using physiological or behavioral modalities such as face, voice, fingerprint, and gait. These modalities are used either separately, as in unimodal systems, or in combinations of two or more, as in multimodal systems. Multimodal systems can usually improve recognition performance over unimodal systems by integrating the biometric data of multiple modalities at different fusion levels. Even so, in real-life applications factors such as occlusion, face pose variation, and noise in voice data degrade the performance of multimodal systems. In this paper, we propose two algorithms that apply dynamic fusion at the feature level based on the data quality of the multimodal biometrics. The proposed algorithms attempt to minimize the negative influence of confusing and low-quality features, by either excluding them or reducing their weight, to achieve better recognition performance. The dynamic fusion was implemented using face and voice biometrics: face features were extracted using principal component analysis (PCA) and Gabor filters separately, while voice features were extracted using Mel-Frequency Cepstral Coefficients (MFCCs). Face image quality is assessed mainly by the presence of occlusion, whereas voice data quality is assessed mainly by the signal-to-noise ratio (SNR), which reflects the presence of noise. To evaluate the proposed algorithms, several experiments were conducted using two combinations of three databases: the AR database and the extended Yale Face Database B for face images, and the VOiCES database for voice data. The results show that both proposed dynamic fusion algorithms attain improved performance in identification and verification over not only the standard unimodal algorithms but also multimodal algorithms that use standard fusion methods.
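To make the idea of quality-driven, feature-level fusion concrete, here is a minimal Python sketch. It is not the authors' algorithm; it only illustrates the general pattern the abstract describes: estimate a quality score per modality, turn it into a weight (zero means exclusion), and scale each feature vector before concatenation. All function names, the thresholds, and the use of an occlusion fraction as the face quality score are assumptions made for illustration.

```python
import numpy as np


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from a signal segment and a noise-only segment."""
    p_signal = np.mean(np.square(signal))
    p_noise = np.mean(np.square(noise))
    return 10.0 * np.log10(p_signal / p_noise)


def quality_weight(score, low, high):
    """Map a raw quality score linearly to [0, 1].

    Scores at or below `low` give 0 (the modality is excluded);
    scores at or above `high` give 1 (full weight).
    The thresholds are hypothetical tuning parameters.
    """
    return float(np.clip((score - low) / (high - low), 0.0, 1.0))


def dynamic_fuse(face_feat, voice_feat, w_face, w_voice):
    """Concatenate L2-normalized feature vectors, each scaled by its quality weight."""
    def unit(v):
        v = np.asarray(v, dtype=float)
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v

    return np.concatenate([w_face * unit(face_feat), w_voice * unit(voice_feat)])


if __name__ == "__main__":
    # Hypothetical inputs: stand-ins for PCA or Gabor face features and MFCC voice features.
    face_feat = np.array([3.0, 4.0])
    voice_feat = np.array([0.0, 5.0])

    # Assumed face quality: 1 minus the occluded fraction of the image.
    w_face = quality_weight(1.0 - 0.2, low=0.5, high=1.0)

    # Assumed voice quality: SNR in dB, mapped onto a 0 to 30 dB range.
    speech = 10.0 * np.ones(100)
    background = np.ones(100)
    w_voice = quality_weight(snr_db(speech, background), low=0.0, high=30.0)

    print(dynamic_fuse(face_feat, voice_feat, w_face, w_voice))
```

Setting a weight to zero reproduces the "exclusion" option, while intermediate values correspond to "weight reduction". A real system would feed the fused vector to a classifier or matcher, which is outside the scope of this sketch.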


Cite This Article

APA Style
Alsaedi, N.H., Jaha, E.S. (2022). Dynamic audio-visual biometric fusion for person recognition. Computers, Materials & Continua, 71(1), 1283-1311.
Vancouver Style
Alsaedi NH, Jaha ES. Dynamic audio-visual biometric fusion for person recognition. Comput Mater Contin. 2022;71(1):1283-1311.
IEEE Style
N.H. Alsaedi and E.S. Jaha, "Dynamic Audio-Visual Biometric Fusion for Person Recognition," Comput. Mater. Contin., vol. 71, no. 1, pp. 1283-1311, 2022.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.