A Novel Stereo Camera Fusion Scheme for Generating and Tracking Real-Time 3-D Patient-Specific Head/Face Kinematics and Facial Muscle Movements


Bibliographic Details
Published in: IEEE Sensors Journal, Vol. 23, no. 9, pp. 9889-9897
Main Authors: Nguyen, Tan-Nhu; Ballit, Abbass; Dao, Tien-Tuan
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2023
ISSN: 1530-437X; 1558-1748
DOI: 10.1109/JSEN.2023.3259473

Summary: Recovery and rehabilitation of facial mimics need enhanced decision support with multimodal biofeedback from 3-D real-time biomechanical head animation. Kinect V2.0 can detect and track 3-D high-definition (HD) face features (FFs), but its end of production makes deploying the developed solutions difficult. Deep neural network (DNN)-based methods have been employed, but the detected features were in 2-D or not accurate in 3-D. Thus, we developed a novel stereo-fusion scheme for enhancing the accuracy of 3-D features and generating biomechanical heads. Four stereo cameras were employed for detecting 2-D FFs with DNN-based models. Stereo-triangulated 3-D FFs were fused using the Kalman filter. A head, skull, and muscle network were generated from the fused FFs. We validated the method with 1000 virtual subjects and five computed tomography (CT)-based subjects. The in silico trial errors (mean ± SD) were 2.27 ± 0.29, 3.15 ± 0.23, 1.72 ± 0.13, and 3.08 ± 0.39 mm for the facial head, facial skull, muscle insertion point, and muscle attachment point regions, respectively. The experimental errors were 1.8384 ± 0.1451, 2.6937 ± 0.0575, 1.8271 ± 0.1242, and 3.1428 ± 0.2407 mm. These errors were comparable with those obtained using the Kinect V2.0 sensor and smaller than those using monovision-based 3-D feature detectors. This study has four contributions: 1) a stereo-fusion scheme for reconstructing 3-D FFs from 2-D FFs; 2) improved accuracy for 3-D DNN-based FF detection; 3) biomechanical head generation from stereo-fusion cameras; and 4) a full validation procedure for 3-D FF detection. The method will be validated with facial palsy patients. Soft-tissue deformation will be integrated with mixed reality technology toward the next generation of face decision support systems.
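The core pipeline described in the abstract — triangulating 3-D points from 2-D detections in calibrated camera pairs, then fusing the per-pair estimates with a Kalman filter — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the projection matrices, noise variance, and the simple constant-state Kalman update are all assumptions chosen for clarity.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3-D point from two views.
    P1, P2: 3x4 camera projection matrices; x1, x2: 2-D pixel coordinates."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector with the
    # smallest singular value of A.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # homogeneous -> Euclidean

def kalman_fuse(estimates, meas_var=1.0):
    """Fuse several noisy 3-D estimates of the same (static) feature with a
    constant-state Kalman filter; with equal measurement variances this
    reduces to a recursive average of the estimates."""
    x = np.asarray(estimates[0], dtype=float)
    P = meas_var  # scalar per-axis covariance (isotropic assumption)
    for z in estimates[1:]:
        K = P / (P + meas_var)        # Kalman gain
        x = x + K * (np.asarray(z) - x)  # measurement update
        P = (1.0 - K) * P
    return x

# Illustrative two-camera setup (hypothetical intrinsics and baseline).
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.], [0.]])])

X_true = np.array([0.1, -0.05, 2.0])
for P in (P1, P2):
    pass  # projections computed below
x1_h = P1 @ np.append(X_true, 1.0)
x2_h = P2 @ np.append(X_true, 1.0)
x1, x2 = x1_h[:2] / x1_h[2], x2_h[:2] / x2_h[2]

X_est = triangulate_dlt(P1, P2, x1, x2)
```

With four cameras, the paper's scheme would triangulate a 3-D candidate from each usable camera pair and pass the candidates through the filter; here `kalman_fuse` stands in for that step.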