Lesion-Harvester: Iteratively Mining Unlabeled Lesions and Hard-Negative Examples at Scale

Bibliographic Details
Published in	IEEE Transactions on Medical Imaging, Vol. 40, no. 1, pp. 59-70
Main Authors	Cai, Jinzheng, Harrison, Adam P., Zheng, Youjing, Yan, Ke, Huo, Yuankai, Xiao, Jing, Yang, Lin, Lu, Le
Format	Journal Article
Language	English
Published	United States: IEEE, 01.01.2021; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; Annotations; Biomedical imaging; Computed tomography; Detectors; hard negative mining; Harvesters; Harvesting; Image acquisition; Learning algorithms; lesion detection; Lesion harvesting; Lesions; Machine Learning; Medical imaging; Proposals; pseudo 3D IoU; Three-dimensional displays; Training
ISSN	0278-0062, 1558-254X
DOI	10.1109/TMI.2020.3022034

More Information
Summary:	The acquisition of large-scale medical image data, necessary for training machine learning algorithms, is hampered by associated expert-driven annotation costs. Mining hospital archives can address this problem, but labels are often incomplete or noisy, e.g., 50% of the lesions in DeepLesion are left unlabeled. Thus, effective label harvesting methods are critical. This is the goal of our work, where we introduce Lesion-Harvester, a powerful system to harvest missing annotations from lesion datasets at high precision. Accepting the need for some degree of expert labor, we use a small fully-labeled image subset to intelligently mine annotations from the remainder. To do this, we chain together a highly sensitive lesion proposal generator (LPG) and a very selective lesion proposal classifier (LPC). Using a new hard negative suppression loss, the resulting harvested and hard-negative proposals are then employed to iteratively finetune our LPG. While our framework is generic, we optimize our performance by proposing a new 3D contextual LPG and by using a global-local multi-view LPC. Experiments on DeepLesion demonstrate that Lesion-Harvester can discover an additional 9,805 lesions at a precision of 90%. We publicly release the harvested lesions, along with a new test set of completely annotated DeepLesion volumes. We also present a pseudo 3D IoU evaluation metric that corresponds much better to the real 3D IoU than current DeepLesion evaluation metrics. To quantify the downstream benefits of Lesion-Harvester, we show that augmenting the DeepLesion annotations with our harvested lesions allows state-of-the-art detectors to boost their average precision by 7 to 10%.
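
Illustration:	The Summary describes chaining a highly sensitive proposal generator (LPG) with a very selective proposal classifier (LPC), then reusing both the harvested and the hard-negative proposals. The sketch below shows one way such a two-stage filter could be written. It is not the authors' implementation: the `Proposal` fields, the `harvest` function, all three thresholds, and the rule that a confidently rejected proposal counts as a hard negative are assumptions made only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Proposal:
    """A candidate lesion with two scores (all fields are hypothetical)."""
    lesion_id: str
    lpg_score: float  # score from the sensitive proposal generator
    lpc_score: float  # score from the selective proposal classifier


def harvest(
    proposals: List[Proposal],
    lpg_min: float = 0.1,   # assumed value: low bar, favors sensitivity
    lpc_pos: float = 0.95,  # assumed value: high bar, favors precision
    lpc_neg: float = 0.05,  # assumed value: confident rejection
) -> Tuple[List[Proposal], List[Proposal]]:
    """Split proposals into harvested positives and hard-negative candidates."""
    harvested: List[Proposal] = []
    hard_negatives: List[Proposal] = []
    for p in proposals:
        if p.lpg_score < lpg_min:
            continue  # the generator did not propose it strongly enough
        if p.lpc_score >= lpc_pos:
            harvested.append(p)       # accepted by the selective classifier
        elif p.lpc_score <= lpc_neg:
            hard_negatives.append(p)  # proposed, but confidently rejected
        # anything in between is left unlabeled in this sketch
    return harvested, hard_negatives


# Made-up scores, purely to exercise the function.
proposals = [
    Proposal("a", 0.80, 0.99),
    Proposal("b", 0.60, 0.02),
    Proposal("c", 0.30, 0.50),
    Proposal("d", 0.05, 0.99),
]
harvested, hard_negatives = harvest(proposals)
```

In an iterative setting like the one the Summary describes, the two returned lists would feed the next round of generator fine-tuning. How that fine-tuning and the hard negative suppression loss work is specified in the article itself and is not reproduced here.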