Finding Needles in the Haystack: Harnessing Syslogs for Data Center Management
Authors
Abstract
Network device syslogs are ubiquitous and abundant in modern data centers, with most large data centers producing millions of messages per day. Yet the operational information reflected in syslogs and its implications for diagnosis or management tasks are poorly understood. Prevalent approaches to understanding syslogs focus on simple correlation and abnormality detection, and are often limited to detection, providing little insight into diagnosis and resolution. To improve data center operations, we propose and implement Log-Prophet, a system that applies a toolbox of statistical techniques and domain-specific models to mine detailed diagnoses. Log-Prophet infers causal relationships between syslog lines and constructs succinct but valuable problem graphs that summarize root causes and their locality, including cascading problems. We validate Log-Prophet using problem tickets and through operator interviews. To demonstrate the strength of Log-Prophet, we perform an initial longitudinal study of a large online service provider’s data center. Our study demonstrates that Log-Prophet significantly reduces the number of alerts while highlighting interesting operational issues.
Similar resources
Guest Editors' Introduction: Information Discovery--Needles and Haystacks
For thousands of years, people have realized the importance of archiving and finding information. With the advent of computers, it became possible to store large amounts of information in electronic form — and finding useful needles in the resulting haystacks has since become one of the most important problems in information management. Many systems exist to help users navigate the considerable...
The Needles-in-Haystack Problem
We consider a new data mining problem of detecting the members of a rare class of data, the needles, that have been hidden in a set of records, the haystack. Besides the haystack, a single instance of a needle is given. It is assumed that members of the needle class are similar according to an unknown needle characterization. The goal is to find the needle records hidden in the haystack. This p...
Finding the epistasis needles in the genome-wide haystack.
Genome-wide association studies (GWAS) have dominated the field of human genetics for the past 10 years. This study design allows for an unbiased, dense exploration of the genome and provides researchers with a vast array of SNPs to look for association with their trait or disease of interest. GWAS has been referred to as finding needles in a haystack and while many of these "needles," or SNPs ...
The haystack is made of needles.
Developing genetic tests that have clinical utility and validated biomarkers presents many challenges. Much has been written about these challenges for the development of genetic test evidence (Khoury et al., 2010; Horn and Terry, 2012) and biomarker validation (Lesko and Atkinson, 2001; Surh, 2009). One consistent thread through these challenges is the lack of well-characterized cohorts. This ...
Process rather than pattern: finding pine needles in the coevolutionary haystack
The geographic mosaic theory is fast becoming a unifying framework for coevolutionary studies. A recent experimental study of interactions between pines and mycorrhizal fungi in BMC Biology is the first to rigorously test geographical selection mosaics, one of the cornerstones of the theory.
Journal title:
- CoRR
Volume abs/1605.06150
Pages -
Publication date 2016