Multi-View Multi-Modal Head-Gaze Estimation for Advanced Indoor User Interaction

Jung-Hwa Kim; Jin-Woo Jeong

doi:10.32604/cmc.2022.021107

Open Access

ARTICLE

Jung-Hwa Kim¹, Jin-Woo Jeong²,*

1 Department of Computer Engineering, Kumoh National Institute of Technology, Gumi, 39177, Korea
2 Department of Data Science, Seoul National University of Science and Technology, Seoul, 01811, Korea

* Corresponding Author: Jin-Woo Jeong. Email: email

Computers, Materials & Continua 2022, 70(3), 5107-5132. https://doi.org/10.32604/cmc.2022.021107

Received 23 June 2021; Accepted 07 August 2021; Issue published 11 October 2021

Abstract

Gaze estimation is one of the most promising technologies for supporting indoor monitoring and interaction systems. However, previous gaze estimation techniques generally work only in a controlled laboratory environment because they require a number of high-resolution eye images. This makes them unsuitable for welfare and healthcare facilities, which present the following challenges: 1) users’ continuous movements, 2) varying lighting conditions, and 3) a limited amount of available data. To address these issues, we introduce a multi-view multi-modal head-gaze estimation system that translates the user’s head orientation into the gaze direction. The proposed system captures the user with multiple cameras using depth and infrared modalities to train gaze estimators that are more robust under the aforementioned conditions. To this end, we implemented a deep learning pipeline that can handle different types and combinations of data. The proposed system was evaluated using data collected from 10 volunteer participants to analyze how the use of single or multiple cameras and modalities affects the performance of head-gaze estimators. Through various experiments, we found that 1) the infrared modality provides more useful features than the depth modality, 2) multi-view multi-modal approaches provide better accuracy than single-view single-modal approaches, and 3) the proposed estimators are efficient enough at inference time to be used in real-time applications.
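
As an illustration of what "translating head orientation into the gaze direction" can mean in practice, the following minimal sketch converts a head pose given as yaw and pitch angles into a unit 3D direction vector and measures the angular error between two such directions. It is not the authors' pipeline: the function names, the yaw/pitch parameterization, and the axis convention (zero yaw and pitch looking along +z) are assumptions made only for this example.

```python
import math


def head_angles_to_direction(yaw_deg, pitch_deg):
    """Convert head yaw/pitch (degrees) into a unit 3D direction vector.

    Assumed convention (hypothetical, not taken from the paper):
    yaw = 0 and pitch = 0 looks along +z, positive yaw turns toward +x,
    positive pitch turns toward +y.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    x = math.cos(pitch) * math.sin(yaw)
    y = math.sin(pitch)
    z = math.cos(pitch) * math.cos(yaw)
    return (x, y, z)


def angular_error_deg(a, b):
    """Angle in degrees between two non-zero 3D direction vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    predicted = head_angles_to_direction(10.0, -5.0)
    target = head_angles_to_direction(12.0, -4.0)
    print(f"angular error: {angular_error_deg(predicted, target):.2f} degrees")
```

An angular error of this kind is a common way to compare a predicted direction with a reference direction; whether the paper reports exactly this metric is not stated in the abstract.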

Keywords

Human-computer interaction; deep learning; head-gaze estimation; indoor monitoring

Cite This Article

APA Style

Kim, J., & Jeong, J. (2022). Multi-View Multi-Modal Head-Gaze Estimation for Advanced Indoor User Interaction. Computers, Materials & Continua, 70(3), 5107–5132. https://doi.org/10.32604/cmc.2022.021107

Vancouver Style

Kim J, Jeong J. Multi-View Multi-Modal Head-Gaze Estimation for Advanced Indoor User Interaction. Comput Mater Contin. 2022;70(3):5107–5132. https://doi.org/10.32604/cmc.2022.021107

IEEE Style

J. Kim and J. Jeong, “Multi-View Multi-Modal Head-Gaze Estimation for Advanced Indoor User Interaction,” Comput. Mater. Contin., vol. 70, no. 3, pp. 5107–5132, 2022. https://doi.org/10.32604/cmc.2022.021107

Copyright © 2022 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
