How Many Ground Truths Should We Insert? Having Good Quality of Labeling Tasks in Crowdsourcing

Bibliographic Details
Published in: Proceedings - International Computer Software & Applications Conference, Vol. 2, pp. 796-805
Main Authors: Kubota, Takuya; Aritsugi, Masayoshi
Format: Conference Proceeding / Journal Article
Language: English, Japanese
Published: IEEE, 01.07.2015
ISSN: 0730-3157
DOI: 10.1109/COMPSAC.2015.117


More Information
Summary: Obtaining many labels of good quality by crowdsourcing has attracted considerable interest recently. Ground truths can help achieve this, but prior work does not adequately address how many ground truths should be used. This paper presents a method for determining the number of ground truths. The number is determined by iteratively calculating the expected quality of labels when a ground truth is inserted into the labeling tasks and comparing it with the limit of label quality that can be expected from crowdsourcing alone. Our method can be applied to general EM algorithm-based approaches for estimating consensus labels of good quality. We compare our method with an EM algorithm-based approach, which our method adopts in the discussions of this paper, in terms of both the efficiency of collecting labels from the crowd and the quality of the consensus labels obtained from the collected ones.
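
The abstract only sketches the stopping rule, so the following is a minimal sketch of that loop under assumptions of my own: the names (expected_quality, choose_num_ground_truths, majority_vote_accuracy), the majority-vote quality model, and the way inserted ground truths are assumed to raise the calibrated worker accuracy are illustrative placeholders, not the paper's actual formulation.

import math

def majority_vote_accuracy(n_labels, worker_accuracy):
    # Probability that a simple majority of n_labels independent labels,
    # each correct with probability worker_accuracy, equals the true label.
    # (Toy stand-in for the paper's label-quality estimate; n_labels is odd.)
    return sum(
        math.comb(n_labels, c)
        * worker_accuracy**c
        * (1 - worker_accuracy) ** (n_labels - c)
        for c in range(n_labels // 2 + 1, n_labels + 1)
    )

def expected_quality(k, base_accuracy=0.65, gain_per_gt=0.01, n_labels=5):
    # Hypothetical model: each inserted ground truth slightly improves the
    # worker-accuracy estimate used by the consensus step, up to a ceiling.
    calibrated = min(base_accuracy + gain_per_gt * k, 0.95)
    return majority_vote_accuracy(n_labels, calibrated)

def choose_num_ground_truths(expected_quality_fn, quality_limit, max_k=100):
    # Iterate over candidate numbers of ground truths and return the smallest
    # one whose expected label quality reaches the limit expected from
    # crowdsourcing alone (the comparison described in the abstract).
    for k in range(max_k + 1):
        if expected_quality_fn(k) >= quality_limit:
            return k
    return max_k

if __name__ == "__main__":
    # Assumed limit: the quality reachable with well-estimated workers only.
    limit = majority_vote_accuracy(n_labels=5, worker_accuracy=0.80)
    print(choose_num_ground_truths(expected_quality, limit))

In the paper itself, the quality estimate and its limit would come from the adopted EM algorithm-based consensus model rather than this majority-vote toy.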