Dynamic monitoring of high-performance distributed applications


Bibliographic Details
Published in 11th International Symposium on High-Performance Distributed Computing (HPDC-11 2002), pp. 163-170
Main Authors Gunter, D., Tierney, B., Jackson, K., Lee, J., Stoufer, M.
Format Conference Proceeding
Language English
Published IEEE 01.01.2002
ISBN 0769516866, 9780769516868
ISSN 1082-8907
DOI 10.1109/HPDC.2002.1029915

More Information
Summary: Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of the performance problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. However, one must be very careful to design the instrumentation to have extremely low overhead, and not affect the system being monitored. In this paper we present a very light-weight instrumentation system that can be dynamically activated to unobtrusively collect and aggregate detailed end-to-end monitoring information from distributed applications. We also show how emerging "web services" can be used to facilitate remote interaction with this system.