Lesion-Harvester: Iteratively Mining Unlabeled Lesions and Hard-Negative Examples at Scale

Bibliographic Details
Published in: IEEE Transactions on Medical Imaging, Vol. 40, no. 1, pp. 59-70
Main Authors: Cai, Jinzheng; Harrison, Adam P.; Zheng, Youjing; Yan, Ke; Huo, Yuankai; Xiao, Jing; Yang, Lin; Lu, Le
Format: Journal Article
Language: English
Published: United States, IEEE, 01.01.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0278-0062, 1558-254X
DOI: 10.1109/TMI.2020.3022034

Summary: The acquisition of large-scale medical image data, necessary for training machine learning algorithms, is hampered by associated expert-driven annotation costs. Mining hospital archives can address this problem, but labels are often incomplete or noisy, e.g., 50% of the lesions in DeepLesion are left unlabeled. Thus, effective label harvesting methods are critical. This is the goal of our work, where we introduce Lesion-Harvester, a powerful system to harvest missing annotations from lesion datasets at high precision. Accepting the need for some degree of expert labor, we use a small fully-labeled image subset to intelligently mine annotations from the remainder. To do this, we chain together a highly sensitive lesion proposal generator (LPG) and a very selective lesion proposal classifier (LPC). Using a new hard negative suppression loss, the resulting harvested and hard-negative proposals are then employed to iteratively fine-tune our LPG. While our framework is generic, we optimize our performance by proposing a new 3D contextual LPG and by using a global-local multi-view LPC. Experiments on DeepLesion demonstrate that Lesion-Harvester can discover an additional 9,805 lesions at a precision of 90%. We publicly release the harvested lesions, along with a new test set of completely annotated DeepLesion volumes. We also present a pseudo 3D IoU evaluation metric that corresponds much better to the real 3D IoU than current DeepLesion evaluation metrics. To quantify the downstream benefits of Lesion-Harvester, we show that augmenting the DeepLesion annotations with our harvested lesions allows state-of-the-art detectors to boost their average precision by 7 to 10%.
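Note: The summary describes chaining a sensitive LPG with a selective LPC and reusing both the harvested and the hard-negative proposals, but it does not say how proposals are sorted into those two groups. The following is only a minimal sketch of one possible mining round, assuming simple score thresholds. The `Proposal` type, the `mine_round` function, and every threshold value are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Proposal:
    """One candidate lesion; all fields are illustrative, not from the paper."""
    proposal_id: str
    lpg_score: float  # confidence from the lesion proposal generator
    lpc_score: float  # confidence from the lesion proposal classifier


def mine_round(
    proposals: List[Proposal],
    lpg_min: float = 0.1,     # assumed: low bar keeps the generator sensitive
    lpc_accept: float = 0.9,  # assumed: high bar keeps harvesting precise
    lpc_reject: float = 0.1,  # assumed: confident rejections become hard negatives
) -> Tuple[List[Proposal], List[Proposal]]:
    """Split one round of proposals into harvested lesions and hard negatives."""
    harvested: List[Proposal] = []
    hard_negatives: List[Proposal] = []
    for p in proposals:
        if p.lpg_score < lpg_min:
            continue  # too weak to count as a proposal at all
        if p.lpc_score >= lpc_accept:
            harvested.append(p)
        elif p.lpc_score <= lpc_reject:
            hard_negatives.append(p)
        # anything in between stays unlabeled for a later round
    return harvested, hard_negatives


proposals = [
    Proposal("a", lpg_score=0.80, lpc_score=0.95),
    Proposal("b", lpg_score=0.70, lpc_score=0.05),
    Proposal("c", lpg_score=0.60, lpc_score=0.50),
    Proposal("d", lpg_score=0.05, lpc_score=0.99),
]
harvested, hard_negatives = mine_round(proposals)
print([p.proposal_id for p in harvested])       # ['a']
print([p.proposal_id for p in hard_negatives])  # ['b']
```

In an iterative setup of the kind the summary outlines, both returned lists would feed the next fine-tuning pass of the generator. That step is omitted here because the summary gives no detail on the loss or the training schedule.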