Mining NANOG Mailing List

Authors

  • Tony Zu-Cheng Huang
  • Chi-Yao Hong
Abstract

It has been shown that misbehavior by a few malicious, compromised, or misconfigured BGP routers can lead to serious outages on the Internet. This weakness becomes increasingly critical with the recent growth of outage-sensitive applications such as Voice over IP, streaming media, and video conferencing. To address such misbehavior, previous work has mainly focused on detecting or preventing outages in a distributed manner using limited knowledge of the Internet's state. In this paper, we present a first step towards efficient troubleshooting by mining network operators' mailing lists. Using Natural Language Processing (NLP) and Machine Learning, we develop a new approach to extract useful information from the mailing list of the North American Network Operators' Group (NANOG). Our experimental results show that the proposed approach detects 94 out of 105 outages from NANOG with a false positive rate of only 7.3%. We validate the extracted outages using real network logs collected by the Route Views project. While our approach is not perfectly accurate, we envision it as a useful source of information for existing anomaly detection and prevention mechanisms.
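The abstract does not state which features or learning algorithm the authors use. Purely as an illustration of the general idea of classifying mailing-list messages as outage reports, the sketch below trains a small multinomial Naive Bayes classifier on a handful of invented messages. The names `tokenize`, `NaiveBayes`, `TRAINING`, and `detection_rate`, and all of the example messages, are assumptions made for this sketch and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    This is a stand-in classifier chosen for brevity; the paper's
    actual model is not specified in the abstract.
    """

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[c][tok] + 1) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


def detection_rate(detected, total):
    """Fraction of known outages that were detected (recall)."""
    return detected / total


# Hypothetical toy messages, invented for this sketch (not NANOG data).
TRAINING = [
    ("Level3 outage in Chicago, seeing packet loss", "outage"),
    ("Is anyone else seeing BGP routes withdrawn and unreachable prefixes", "outage"),
    ("fiber cut causing outage, sites unreachable", "outage"),
    ("looking for recommendations on a new router vendor", "other"),
    ("job posting for network engineer", "other"),
    ("meeting agenda and hotel recommendations", "other"),
]

texts, labels = zip(*TRAINING)
model = NaiveBayes().fit(texts, labels)

print(model.predict("anyone seeing an outage with unreachable prefixes"))
print(round(detection_rate(94, 105), 3))  # the 94-of-105 figure from the abstract
```

With the figures reported in the abstract, 94 of 105 outages corresponds to a detection rate of roughly 89.5%. The abstract does not define how its false positive rate is computed, so that figure is not reproduced here.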

Similar resources

Multi-Data Mining for Understanding Leadership Behavior

We propose an approach for understanding leadership behavior in dot-jp, a non-profit organization, by analyzing heterogeneous multi-data composed of questionnaires and mailing list archives. Attitudes toward leaders were obtained from the questionnaires, and human networks were extracted from the mailing list archives. By integrating the results, we discovered that leaders must receive messages...

Authorship Identification for Heterogeneous Documents

The study of authorship identification in Japanese has for the most part been restricted to literary texts using basic statistical methods. In the present study, authors of mailing list messages are identified using a machine learning technique (Support Vector Machines). In addition, the classifier trained on the mailing list data is applied to identify the author of Web documents in order to i...

A Tool for Identifying Swarm Intelligence on a Free/open Source Software Mailing List

A software tool designed using the concepts of swarm intelligence and text mining is proposed as an aid in the analysis of free/open source software (FOSS) development communities. A prototype of the tool collects textual data from an electronic mailing list, a primary mode of FOSS developer communication. The tool enables a user to compare patterns of discussion topics found in the text with p...

Internet Outages, the Eyewitness Accounts: Analysis of the Outages Mailing List

Understanding network reliability and outages is critical to the “health” of the Internet infrastructure. Unfortunately, our ability to analyze Internet outages has been hampered by the lack of access to public information from key players. In this paper, we leverage a somewhat unconventional dataset to analyze Internet reliability—the outages mailing list. The mailing list is an avenue for net...

Exploring the Music Library Association Mailing List: A Text Mining Approach

Music librarians and people pursuing music librarianship have exchanged emails via the Music Library Association Mailing List (MLA-L) for decades. The list archive is an invaluable resource to discover new insights on music information retrieval from the perspective of the music librarian community. This study analyzes a corpus of 53,648 emails posted on MLA-L from 2000 to 2016 by using text mi...

Publication date: 2009