A Novel Stereo Camera Fusion Scheme for Generating and Tracking Real-Time 3-D Patient-Specific Head/Face Kinematics and Facial Muscle Movements
| Published in | IEEE Sensors Journal Vol. 23; no. 9; pp. 9889-9897 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2023 |
| Subjects | |
| Online Access | Get full text |
| ISSN | 1530-437X; 1558-1748 |
| DOI | 10.1109/JSEN.2023.3259473 |
| Summary | Recovery and rehabilitation of facial mimics need enhanced decision support with multimodal biofeedback from 3-D real-time biomechanical head animation. Kinect V2.0 can detect and track 3-D high-definition (HD) face features (FFs), but its discontinuation complicates the deployment of solutions built on it. Deep neural network (DNN)-based methods have been employed, but the detected features were either 2-D or inaccurate in 3-D. We therefore developed a novel stereo-fusion scheme for enhancing the accuracy of 3-D features and generating biomechanical heads. Four stereo cameras detected 2-D FFs using DNN-based models. Stereo-triangulated 3-D FFs were fused using a Kalman filter. A head, skull, and muscle network were generated from the fused FFs. We validated the method with 1000 virtual subjects and five computed tomography (CT)-based subjects. The in silico trial errors (mean ± SD) were 2.27 ± 0.29, 3.15 ± 0.23, 1.72 ± 0.13, and 3.08 ± 0.39 mm for the facial head, facial skull, muscle insertion point, and muscle attachment point regions, respectively. The corresponding experimental errors were 1.8384 ± 0.1451, 2.6937 ± 0.0575, 1.8271 ± 0.1242, and 3.1428 ± 0.2407 mm. These errors were comparable with those of the Kinect V2.0 sensor and smaller than those of monovision-based 3-D feature detectors. This study makes four contributions: 1) a stereo-fusion scheme for reconstructing 3-D FFs from 2-D FFs; 2) an accuracy enhancement for 3-D DNN-based FF detection; 3) biomechanical head generation from stereo-fusion cameras; and 4) a full validation procedure for 3-D FF detection. The method will next be validated with facial palsy patients. Soft-tissue deformation will be integrated with mixed-reality technology toward the next generation of facial decision support systems. |
|---|---|
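The pipeline the summary describes, triangulating each 2-D feature from a stereo pair and then fusing the per-pair 3-D estimates with a Kalman-style update, can be sketched as below. This is a minimal illustration, not the authors' implementation: the camera intrinsics, baseline, and noise variances are assumed values, and the fusion step is the static (inverse-variance) special case of a Kalman update.

```python
import numpy as np

def triangulate_dlt(P1, P2, u1, u2):
    """Linear (DLT) triangulation of one 2-D feature correspondence.

    P1, P2: 3x4 camera projection matrices; u1, u2: pixel coordinates
    of the same face feature in each view. Returns the 3-D point.
    """
    A = np.vstack([
        u1[0] * P1[2] - P1[0],
        u1[1] * P1[2] - P1[1],
        u2[0] * P2[2] - P2[0],
        u2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

def kalman_fuse(points, variances):
    """Static Kalman (inverse-variance weighted) fusion of several
    independent 3-D estimates of the same landmark."""
    w = 1.0 / np.asarray(variances, dtype=float)
    w /= w.sum()
    return (w[:, None] * np.asarray(points, dtype=float)).sum(axis=0)

# Demo: two synthetic pinhole cameras with a 10 cm horizontal baseline
# (intrinsics and geometry are illustrative assumptions, not the paper's rig).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 2.0])                # landmark 2 m from the rig
u1 = P1 @ np.append(X_true, 1.0); u1 = u1[:2] / u1[2]
u2 = P2 @ np.append(X_true, 1.0); u2 = u2[:2] / u2[2]

X_est = triangulate_dlt(P1, P2, u1, u2)            # one stereo pair's estimate
X_fused = kalman_fuse([X_est, X_est], [1.0, 1.0])  # fuse estimates across pairs
```

With four cameras (two stereo pairs), each pair yields one triangulated estimate per feature, and `kalman_fuse` combines them into the single 3-D feature used to drive the biomechanical head model.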
| ISSN: | 1530-437X 1558-1748  | 
| DOI: | 10.1109/JSEN.2023.3259473 |