Acquiring 3D Spatial Information on People Wearing Masks from Stereo Images with Different Color Spaces

Bibliographic Details
Published in	디지털콘텐츠학회논문지 (Journal of Digital Contents Society), Vol. 23, no. 12, pp. 2527-2536
Main Author	박태정(Taejung Park)
Format	Journal Article
Language	Korean
Published	한국디지털콘텐츠학회 01.12.2022
Subjects	Computer science; Kinect; Gaze estimation; Artificial intelligence datasets; AI-Hub; Stereo camera
ISSN	1598-2009 2287-738X
DOI	10.9728/dcs.2022.23.12.2527

Summary:	This paper discusses a method for extracting three-dimensional (3D) information about the eyes and facial points of subjects from a large-scale dataset of people who wear masks while gazing at visual information presented on a screen. The images were captured with a stereo camera system that acquires RGB and infrared (IR) images simultaneously but provides no separate, reliable depth information. Because the two cameras use different color spaces, widely used image processing algorithms cannot be applied directly, and mask wearing introduces further limitations. To address this, an algorithm that detects facial landmarks was applied to identify corresponding points between the two images. The paper also proposes an algorithm for approximating the 3D position of each landmark. About 250,000 data pairs were processed with CPU-based parallel processing, and 3D geometric information was extracted from 93.95% of the data, a level sufficient to train an AI-based gaze tracking algorithm.
KCI Citation Count:	2
Bibliography:	http://dx.doi.org/10.9728/dcs.2022.23.12.2527