OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis
| Published in | 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG) Vol. 1; pp. 1 - 5 |
|---|---|
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE 01.05.2015 |
| DOI | 10.1109/FG.2015.7163155 |
| Summary: | Visual speech constitutes a large part of our non-rigid facial motion and contains important information that allows machines to interact with human users, for instance through automatic visual speech recognition (VSR) and speaker verification. One of the major obstacles to research on non-rigid mouth motion analysis is the absence of suitable databases. Those available for public research either lack a sufficient number of speakers or utterances or contain constrained viewpoints, which limits their representativeness and usefulness. This paper introduces a newly collected multi-view audiovisual database for non-rigid mouth motion analysis. It includes more than 50 speakers uttering three types of utterances and, more importantly, thousands of videos simultaneously recorded by six cameras from five different views spanning the range between the frontal and profile views. Moreover, a simple VSR system has been developed and tested on the database to provide baseline performance. |
|---|---|