Sparse coding of pathology slides compared to transfer learning with deep neural networks

Bibliographic Details
Published in	BMC bioinformatics Vol. 19; no. Suppl 18; pp. 489 - 17
Main Authors	Fischer, Will; Moudgalya, Sanketh S.; Cohn, Judith D.; Nguyen, Nga T. T.; Kenyon, Garrett T.
Format	Journal Article
Language	English
Published	London; BioMed Central; 21.12.2018; BioMed Central Ltd; Springer Nature B.V; BMC
Subjects	Algorithms; Artificial intelligence; Artificial neural networks; BASIC BIOLOGICAL SCIENCES; Biochemistry & Molecular Biology; Bioinformatics; Biomedical and Life Sciences; Biotechnology & Applied Microbiology; Cancer; Cancer pathology slides; Care and treatment; Classification; Coding; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Deep learning; Deep Learning - trends; Diagnosis; Dictionaries; Error reduction; Feature maps; Gene expression; Histopathology; Humans; Image classification; Image reconstruction; International conferences; Learning algorithms; Life Sciences; Locally Competitive Algorithm; Machine learning; Mathematical & Computational Biology; Medical diagnosis; Medical imaging; Microarrays; Neoplasms - pathology; Neural networks; Neural Networks (Computer); Pathological histology; Pathology; Phase transitions; Representations; Software; Sparse coding; State of the art; TCGA; Transfer learning; Tumors; Unsupervised learning
ISSN	1471-2105
DOI	10.1186/s12859-018-2504-8

Summary:	Background: Histopathology images of tumor biopsies present unique challenges for applying machine learning to the diagnosis and treatment of cancer. The pathology slides are high resolution, often exceeding 1 GB, have non-uniform dimensions, and often contain multiple tissue slices of varying sizes surrounded by large empty regions. The locations of abnormal or cancerous cells, which may constitute a small portion of any given tissue sample, are not annotated. Cancer image datasets are also extremely imbalanced, with most slides being associated with relatively common cancers. Since deep representations trained on natural photographs are unlikely to be optimal for classifying pathology slide images, which have different spectral ranges and spatial structure, we describe an approach for learning features and inferring representations of cancer pathology slides based on sparse coding. Results: We show that conventional transfer learning using a state-of-the-art deep learning architecture pre-trained on ImageNet (RESNET) and fine-tuned for a binary tumor/no-tumor classification task achieved between 85% and 86% accuracy. However, when all layers up to the last convolutional layer in RESNET are replaced with a single feature map inferred via sparse coding, using a dictionary optimized for sparse reconstruction of unlabeled pathology slides, classification performance improves to over 93%, corresponding to a 54% error reduction. Conclusions: We conclude that a feature dictionary optimized for biomedical imagery may in general support better classification performance than does conventional transfer learning using a dictionary pre-trained on natural images.
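Illustration:	The summary refers to inferring a sparse feature map from a dictionary, and the subject headings name the Locally Competitive Algorithm (LCA). The following is a minimal NumPy sketch of LCA-style sparse inference for a single flattened image patch. It is not the authors' implementation: the random dictionary (standing in for one learned from pathology slides), the function names `soft_threshold` and `lca_sparse_code`, and the values of `lam`, `step`, and `n_steps` are all assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(u, lam):
    """Map membrane potentials to sparse activations (zero below lam)."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)


def lca_sparse_code(x, dictionary, lam=0.1, step=0.05, n_steps=400):
    """Approximately minimise 0.5*||x - D a||^2 + lam*||a||_1 with LCA dynamics.

    x          : (n_pixels,) flattened patch
    dictionary : (n_pixels, n_atoms), columns assumed to have unit norm
    lam, step, n_steps are illustrative values, not taken from the paper.
    """
    n_atoms = dictionary.shape[1]
    drive = dictionary.T @ x                                  # feed-forward input
    inhibition = dictionary.T @ dictionary - np.eye(n_atoms)  # lateral competition
    u = np.zeros(n_atoms)                                     # membrane potentials
    for _ in range(n_steps):
        a = soft_threshold(u, lam)
        # Euler step of the LCA dynamics: du/dt = drive - u - inhibition @ a
        u += step * (drive - u - inhibition @ a)
    return soft_threshold(u, lam)


# Hypothetical data: a random overcomplete dictionary and a patch built
# from three of its atoms.
rng = np.random.default_rng(0)
dictionary = rng.standard_normal((64, 128))
dictionary /= np.linalg.norm(dictionary, axis=0)

true_code = np.zeros(128)
true_code[[3, 40, 99]] = [1.5, -2.0, 1.0]
patch = dictionary @ true_code

code = lca_sparse_code(patch, dictionary)
reconstruction = dictionary @ code
```

In a pipeline of the kind the summary describes, sparse codes like `code` (computed over many patches to form a feature map) would replace the pre-trained convolutional features as the classifier's input. That wiring is not shown here.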
Bibliography:	AC52-06NA25396; USDOE Office of Science (SC)