Principled workflow-centric tracing of distributed systems

Authors

  • Raja R. Sambasivan
  • Ilari Shafer
  • Jonathan Mace
  • Benjamin H. Sigelman
  • Rodrigo Fonseca
  • Gregory R. Ganger
Abstract

Workflow-centric tracing captures the workflow of causally-related events (e.g., work done to process a request) within and among the components of a distributed system. As distributed systems grow in scale and complexity, such tracing is becoming a critical tool for understanding distributed system behavior. Yet, there is a fundamental lack of clarity about how such infrastructures should be designed to provide maximum benefit for important management tasks, such as resource accounting and diagnosis. Without research into this important issue, there is a danger that workflow-centric tracing will not reach its full potential. To help, this paper distills the design space of workflow-centric tracing and describes key design choices that can help or hinder a tracing infrastructure’s utility for important tasks. Our design space and the design choices we suggest are based on our experiences developing several previous workflow-centric tracing infrastructures.

Similar resources

Diagnosing performance changes in distributed systems by comparing request flows

Diagnosing performance problems in modern datacenters and distributed systems is challenging, as the root cause could be contained in any one of the system’s numerous components or, worse, could be a result of interactions among them. As distributed systems continue to increase in complexity, diagnosis tasks will only become more challenging. There is a need for a new class of diagnosis technique...

So, you want to trace your distributed system? Key design insights from years of practical experience

End-to-end tracing captures the workflow of causally-related activity (e.g., work done to process a request) within and among the components of a distributed system. As distributed systems grow in scale and complexity, such tracing is becoming a critical tool for management tasks like diagnosis and resource accounting. Drawing upon our experiences building and using end-to-end tracing infrastruc...

Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems (Jonathan Mace, Ryan Roelke, Rodrigo Fonseca)

Monitoring and troubleshooting distributed systems is notoriously difficult; potential problems are complex, varied, and unpredictable. The monitoring and diagnosis tools commonly used today – logs, counters, and metrics – have two important limitations: what gets recorded is defined a priori, and the information is recorded in a component- or machine-centric way, making it extremely hard to correlate e...

Access control in ultra-large-scale systems using a data-centric middleware

The primary characteristic of an Ultra-Large-Scale (ULS) system is ultra-large size on any related dimension. A ULS system is generally considered a system-of-systems with heterogeneous nodes and autonomous domains. As the size of a system-of-systems grows and the demand for interoperability between sub-systems increases, achieving a more scalable and dynamic access control system becomes an im...

A Virtual Filesystem Framework to Support Embedded Software Development

We present an approach to simplify the software development process for embedded systems by supporting key development tasks such as debugging, tracing and configuration. The approach is based on the use of distributed filesystem abstractions; principal building blocks within an embedded system in the form of “systems on ch...

Publication date: 2016