A Novel Stereo Camera Fusion Scheme for Generating and Tracking Real-Time 3-D Patient-Specific Head/Face Kinematics and Facial Muscle Movements


Bibliographic Details
Published in: IEEE Sensors Journal, Vol. 23, no. 9, pp. 9889-9897
Main Authors: Nguyen, Tan-Nhu; Ballit, Abbass; Dao, Tien-Tuan
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2023
ISSN: 1530-437X; 1558-1748
DOI: 10.1109/JSEN.2023.3259473

Summary: Recovery and rehabilitation of facial mimics need enhanced decision support with multimodal biofeedback from 3-D real-time biomechanical head animation. Kinect V2.0 can detect and track 3-D high-definition (HD) face features (FFs), but its end of production makes deploying the developed solutions difficult. Deep neural network (DNN)-based methods have been employed, but the detected features were in 2-D or not accurate in 3-D. Thus, we developed a novel stereo-fusion scheme for enhancing the accuracy of 3-D features and generating biomechanical heads. Four stereo cameras were employed for detecting 2-D FFs with DNN-based models. Stereo-triangulated 3-D FFs were fused using the Kalman filter. A head, skull, and muscle network were generated from the fused FFs. We validated the method with 1000 virtual subjects and five computed tomography (CT)-based subjects. The in silico trial errors (mean ± SD) were 2.27 ± 0.29, 3.15 ± 0.23, 1.72 ± 0.13, and 3.08 ± 0.39 mm for the facial head, facial skull, muscle insertion point, and muscle attachment point regions, respectively. The experimental errors were 1.8384 ± 0.1451, 2.6937 ± 0.0575, 1.8271 ± 0.1242, and 3.1428 ± 0.2407 mm. These errors were comparable with those obtained using the Kinect V2.0 sensor and smaller than those using monovision-based 3-D feature detectors. This study has four contributions: 1) a stereo-fusion scheme for reconstructing 3-D FFs from 2-D FFs; 2) improved accuracy for 3-D DNN-based FF detection; 3) biomechanical head generation from stereo-fusion cameras; and 4) a full validation procedure for 3-D FF detection. The method will be validated with facial palsy patients. Soft-tissue deformation will be integrated with mixed reality technology toward the next generation of face decision support systems.
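The core pipeline described in the abstract — triangulating 3-D points from 2-D detections in calibrated camera pairs, then fusing the per-pair estimates with a Kalman filter — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the projection matrices, noise variance, and the simple constant-state Kalman update are all assumptions chosen for clarity.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3-D point from two views.
    P1, P2: 3x4 camera projection matrices; x1, x2: 2-D pixel coordinates."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector with the
    # smallest singular value of A.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # homogeneous -> Euclidean

def kalman_fuse(estimates, meas_var=1.0):
    """Fuse several noisy 3-D estimates of the same (static) feature with a
    constant-state Kalman filter; with equal measurement variances this
    reduces to a recursive average of the estimates."""
    x = np.asarray(estimates[0], dtype=float)
    P = meas_var  # scalar per-axis covariance (isotropic assumption)
    for z in estimates[1:]:
        K = P / (P + meas_var)        # Kalman gain
        x = x + K * (np.asarray(z) - x)  # measurement update
        P = (1.0 - K) * P
    return x

# Illustrative two-camera setup (hypothetical intrinsics and baseline).
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.], [0.]])])

X_true = np.array([0.1, -0.05, 2.0])
for P in (P1, P2):
    pass  # projections computed below
x1_h = P1 @ np.append(X_true, 1.0)
x2_h = P2 @ np.append(X_true, 1.0)
x1, x2 = x1_h[:2] / x1_h[2], x2_h[:2] / x2_h[2]

X_est = triangulate_dlt(P1, P2, x1, x2)
```

With four cameras, the paper's scheme would triangulate a 3-D candidate from each usable camera pair and pass the candidates through the filter; here `kalman_fuse` stands in for that step.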