DLHub: Model and Data Serving for Science

Bibliographic Details
Published in: Proceedings - IEEE International Parallel and Distributed Processing Symposium, pp. 283-292
Main Authors: Chard, Ryan; Li, Zhuozhao; Chard, Kyle; Ward, Logan; Babuji, Yadu; Woodard, Anna; Tuecke, Steven; Blaiszik, Ben; Franklin, Michael J.; Foster, Ian
Format: Conference Proceeding
Language: English
Published: IEEE, 01.05.2019
ISSN: 1530-2075
DOI: 10.1109/IPDPS.2019.00038

Summary: While the Machine Learning (ML) landscape is evolving rapidly, there has been a relative lag in the development of the "learning systems" needed to enable broad adoption. Furthermore, few such systems are designed to support the specialized requirements of scientific ML. Here we present the Data and Learning Hub for science (DLHub), a multi-tenant system that provides both model repository and serving capabilities with a focus on science applications. DLHub addresses two significant shortcomings in current systems. First, its self-service model repository allows users to share, publish, verify, reproduce, and reuse models, and addresses concerns related to model reproducibility by packaging and distributing models and all constituent components. Second, it implements scalable and low-latency serving capabilities that can leverage parallel and distributed computing resources to democratize access to published models through a simple web interface. Unlike other model serving frameworks, DLHub can store and serve any Python 3-compatible model or processing function, plus multiple-function pipelines. We show that relative to other model serving systems including TensorFlow Serving, SageMaker, and Clipper, DLHub provides greater capabilities, comparable performance without memoization and batching, and significantly better performance when the latter two techniques can be employed. We also describe early uses of DLHub for scientific applications.
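
The abstract describes invoking published models through a simple web interface. As a minimal illustrative sketch only, the snippet below shows what calling a served model over HTTP could look like; the endpoint URL, servable name, and payload schema are assumptions for illustration, not the documented DLHub API.

    # Hypothetical sketch of calling a model published to a DLHub-style
    # serving endpoint over HTTP. The URL, servable name, and JSON fields
    # are illustrative assumptions, not the actual DLHub interface.
    import requests

    SERVE_URL = "https://dlhub.example.org/api/v1/servables"   # assumed endpoint
    servable = "someuser/example_model"                        # assumed servable name

    payload = {"inputs": [[0.1, 0.2, 0.3]]}  # model-specific input format (assumed)

    resp = requests.post(f"{SERVE_URL}/{servable}/run", json=payload, timeout=30)
    resp.raise_for_status()
    print(resp.json())  # e.g., a JSON document containing the model's outputs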