VISA: A Supervised Approach to Indexing Video Lectures with Semantic Annotations

Bibliographic Details
Published in: 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), Vol. 1, pp. 226-235
Main Authors: Cagliero, Luca; Canale, Lorenzo; Farinetti, Laura
Format: Conference Proceeding
Language: English
Published: IEEE, 01.07.2019
ISBN: 9781728126074, 172812607X
DOI: 10.1109/COMPSAC.2019.00041

Summary: Many universities adopt educational systems in which the teacher's lecture is video recorded and the recording is made available to students with minimal post-processing effort. These cost-effective solutions suffer from the limited amount of annotations associated with the video content, which strongly limits the usability of the service when students need to retrieve specific portions of a video, e.g., to revise unclear aspects covered in past lectures. This paper presents, as a real case study, the system developed and implemented in our university for video lecture annotation and indexing. The original video recordings, which last around 1.5 hours, are first partitioned into smaller segments and then annotated by mapping their content to the entities in a multilingual knowledge base. For this purpose, the proposed approach analyzes both the transcription of the teacher's speech and the text appearing in the video (e.g., the slide content, the notes written on the whiteboard) by means of an ad hoc Named Entity Recognition and Disambiguation (NERD) step. NERD relies on a supervised classification approach tailored to the domain under analysis. More specifically, to identify the most salient entities of the knowledge base matching the video content, it considers not only text similarity measures but also the semantic pertinence of the candidate entities to the main subject of the video lectures. The performance of the proposed system was validated on a ground truth against the techniques available in the general entity annotation system GERBIL. The preliminary results demonstrate the effectiveness of the proposed approach.
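
The idea of weighing text similarity against semantic pertinence to the lecture subject can be illustrated with a small sketch. This is not the paper's method: the paper describes a supervised classifier, whereas the code below uses a fixed, hand-chosen weighted sum purely for illustration. The function names, the weights, the Jaccard-based pertinence measure, and the toy candidate data are all assumptions introduced here.

```python
from difflib import SequenceMatcher


def text_similarity(mention: str, label: str) -> float:
    """Character-level similarity in [0, 1] between a mention and an entity label."""
    return SequenceMatcher(None, mention.lower(), label.lower()).ratio()


def pertinence(entity_terms: set, subject_terms: set) -> float:
    """Jaccard overlap between an entity's descriptive terms and the course subject terms."""
    union = entity_terms | subject_terms
    return len(entity_terms & subject_terms) / len(union) if union else 0.0


def rank_candidates(mention, candidates, subject_terms, w_text=0.5, w_topic=0.5):
    """Rank candidate entities by a weighted sum of the two signals (hypothetical weights)."""
    scored = [
        (w_text * text_similarity(mention, label)
         + w_topic * pertinence(terms, subject_terms), label)
        for label, terms in candidates.items()
    ]
    return [label for _, label in sorted(scored, reverse=True)]


# Toy data (invented for illustration): two knowledge-base candidates for "tree".
candidates = {
    "Tree (data structure)": {"computer", "science", "graph", "node"},
    "Tree (plant)": {"botany", "plant", "wood"},
}
subject_terms = {"computer", "science", "algorithm", "graph"}

ranking = rank_candidates("tree", candidates, subject_terms)
print(ranking)
```

In this toy case the plant entity has the closer label by string similarity alone, but the data-structure entity ranks first once pertinence to a computer science lecture is taken into account.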