Aligning Coordinated Text Streams through Burst Information Network Construction and Decipherment
Abstract
Aligning coordinated text streams from multiple sources and multiple languages has opened many new research avenues in cross-lingual knowledge discovery. In this paper we aim to advance the state of the art by (1) extending coarse-grained topic-level knowledge mining to fine-grained information units such as entities and events; and (2) following a novel “Data-to-Network-to-Knowledge (D2N2K)” paradigm to construct and utilize network structures to capture and propagate reliable evidence. We introduce a novel Burst Information Network (BINet) representation that displays the most important information and illustrates the connections among bursty entities, events and keywords in the corpus. We propose an effective approach to construct and decipher BINets, incorporating novel criteria based on multi-dimensional clues from pronunciation, translation, burst, neighbor and graph topological structure. The experimental results on Chinese and English coordinated text streams show that our approach can accurately decipher the nodes with high confidence in the BINets and that the algorithm can be run efficiently in parallel, which makes it possible to apply it to huge amounts of streaming data for never-ending language and information decipherment.
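As a rough illustration of what a network connecting bursty terms might look like, the following minimal Python sketch builds a toy graph from a tokenized, day-indexed text stream. It is not the paper's method: the burst test (a term's daily document count compared against a multiple of its mean daily count), the co-occurrence edge weights, and the names `bursty_terms`, `build_binet`, `ratio` and `min_count` are all assumptions made for this example, and the decipherment step is not covered.

```python
from collections import Counter
from itertools import combinations


def bursty_terms(docs_by_day, ratio=2.0, min_count=2):
    """Return the set of (term, day) pairs considered bursty.

    Assumed (hypothetical) burst test: a term bursts on a day if it appears
    in at least `min_count` documents that day and that count is at least
    `ratio` times its mean daily document count over the whole stream.
    """
    # Document frequency of each term per day (set() avoids double counting).
    daily = {
        day: Counter(term for doc in docs for term in set(doc))
        for day, docs in docs_by_day.items()
    }
    totals = Counter()
    for counts in daily.values():
        totals.update(counts)
    n_days = len(docs_by_day)

    bursts = set()
    for day, counts in daily.items():
        for term, k in counts.items():
            if k >= min_count and k >= ratio * totals[term] / n_days:
                bursts.add((term, day))
    return bursts


def build_binet(docs_by_day, ratio=2.0, min_count=2):
    """Build a toy burst network: nodes are bursty (term, day) pairs and an
    edge's weight is the number of documents in which both nodes co-occur."""
    nodes = bursty_terms(docs_by_day, ratio=ratio, min_count=min_count)
    edges = Counter()
    for day, docs in docs_by_day.items():
        for doc in docs:
            present = sorted({(t, day) for t in doc if (t, day) in nodes})
            for a, b in combinations(present, 2):
                edges[(a, b)] += 1
    return nodes, edges
```

Given a stream in which "earthquake" and "tsunami" appear together in several documents on a single day but rarely otherwise, the two terms would become nodes for that day, joined by an edge weighted by the number of documents they share.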
Similar resources
Cost Effective Heat Exchanger Network Design with Mixed Materials of Construction
This paper presents a simple methodology for cost estimation of a near optimal heat exchanger network, which comprises mixed materials of construction. In traditional pinch technology and mathematical programming it is usually assumed that all heat exchangers in a network obey a single cost model. This implies that all heat exchangers in a network are of the same type and use the same mate...
News Stream Summarization using Burst Information Networks
This paper studies summarizing key information from news streams. We propose simple yet effective models to solve the problem based on a novel and promising representation of text streams – Burst Information Networks (BINets). A BINet can be aware of redundant information, allows global analysis of a text stream, and can be efficiently built and dynamically updated, which perfectly fits the dem...
Enhancing TCP to Improve Throughput of HTTP Adaptive Streaming
With increasing video service consumption through IP network, HTTP Adaptive Streaming (HAS) technology has become popular. The chunked transmission and application-layer adaption create a different traffic pattern than traditional progressive download where the entire video is downloaded with a single request. Its start and stop activity cause burst and retransmission timeout (RTO) which result...
Aligning BIM Adoption with Implementation in Loosely Coupled Construction Systems
Building Information Modelling (BIM) is considered an innovation for construction, with the potential to digitise various construction processes. Being an innovation, it affects and is affected by organisational aspects. At the same time, innovations are better observed at a project level. This study connects intra- and interorganisational levels mobilised during BIM implementation. To explore th...
BlogScope: A System for Online Analysis of High Volume Text Streams
We present BlogScope (www.blogscope.net), a system for online analysis of temporally ordered streaming text, currently applied to the analysis of the Blogosphere. The system currently tracks over ten million blogs and handles hundreds of thousands of updates daily. BlogScope is an information discovery and text analysis system that offers a set of unique features. Such features include, spatio-...
Journal title:
- CoRR
Volume: abs/1609.08237
Publication date: 2016