Real-time expression transfer for facial reenactment

Bibliographic Details
Published in ACM transactions on graphics Vol. 34; no. 6; pp. 1-14
Main Authors Thies, Justus; Zollhöfer, Michael; Nießner, Matthias; Valgaerts, Levi; Stamminger, Marc; Theobalt, Christian
Format Journal Article
Language English
Published New York, NY, USA ACM 01.11.2015
ISSN 0730-0301, 1557-7368
DOI 10.1145/2816795.2818056

Summary: We present a method for the real-time transfer of facial expressions from an actor in a source video to an actor in a target video, thus enabling the ad-hoc control of the facial expressions of the target actor. The novelty of our approach lies in the transfer and photorealistic re-rendering of facial deformations and detail into the target video in a way that the newly-synthesized expressions are virtually indistinguishable from a real video. To achieve this, we accurately capture the facial performances of the source and target subjects in real-time using a commodity RGB-D sensor. For each frame, we jointly fit a parametric model for identity, expression, and skin reflectance to the input color and depth data, and also reconstruct the scene lighting. For expression transfer, we compute the difference between the source and target expressions in parameter space, and modify the target parameters to match the source expressions. A major challenge is the convincing re-rendering of the synthesized target face into the corresponding video stream. This requires a careful consideration of the lighting and shading design, which both must correspond to the real-world environment. We demonstrate our method in a live setup, where we modify a video conference feed such that the facial expressions of a different person (e.g., translator) are matched in real-time.
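
The expression-transfer step named in the summary (take the difference between source and target expressions in parameter space, then move the target parameters toward the source) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear face model, the function names `transfer_expression` and `synthesize_face`, and the `strength` parameter are all assumptions made for illustration, and the per-frame fitting, reflectance, and lighting estimation described in the summary are left out entirely.

```python
import numpy as np


def transfer_expression(source_expr, target_expr, strength=1.0):
    """Shift target expression coefficients toward the source ones.

    Hypothetical helper: `strength` is an illustrative knob, where
    1.0 reproduces the source expression and 0.0 leaves the target as is.
    """
    source_expr = np.asarray(source_expr, dtype=float)
    target_expr = np.asarray(target_expr, dtype=float)
    delta = source_expr - target_expr  # difference in parameter space
    return target_expr + strength * delta


def synthesize_face(mean_shape, identity_basis, expr_basis, identity, expr):
    """Assumed linear face model: mean + identity offsets + expression offsets."""
    mean_shape = np.asarray(mean_shape, dtype=float)
    identity = np.asarray(identity, dtype=float)
    expr = np.asarray(expr, dtype=float)
    return mean_shape + identity_basis @ identity + expr_basis @ expr
```

In this sketch the target keeps its own identity coefficients and only the expression coefficients change, so the synthesized geometry still belongs to the target actor while following the source actor's expression.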