Visualising Text Co-occurrence Networks

Authors

  • Laurie Hirsch
  • Simon Andrews
Abstract

We present a tool for automatically generating a visual summary of unstructured text data retrieved from documents, websites or social media feeds. Unlike tools such as word clouds, we are able to visualise structures and topic relationships occurring in a document. These relationships are determined by a unique approach to co-occurrence analysis. The algorithm applies a decaying function to the distance between word pairs found in the original text such that words regularly occurring close to each other score highly, but even words occurring some distance apart will make a small contribution to the overall co-occurrence score. This is in contrast to other algorithms which simply count adjacent words or use a sliding window of fixed size. We show, with examples, how the network generated can be presented in tree or graph format. The tree format lets the user interact with the visualisation, expanding or contracting the data to a preferred level of detail. The tool is available as a web application and can be viewed using any modern web browser.
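The distance-decayed scoring described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the decay function, so an exponential decay is assumed here, and the names `cooccurrence_scores`, `strongest_links` and the `decay` parameter are hypothetical. Every pair of distinct words in the text contributes to the score, with adjacent words contributing 1.0 and more distant pairs contributing progressively less.

```python
import math
import re
from collections import defaultdict


def cooccurrence_scores(text, decay=0.5):
    """Score every unordered pair of distinct words in `text`.

    ASSUMPTION: exponential decay exp(-decay * gap), where `gap` is the
    number of words separating the pair. The source only says that "a
    decaying function" of distance is used.
    """
    # Simple lowercase tokenisation (an assumption; no stop-word handling).
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = defaultdict(float)
    for i in range(len(tokens)):
        # No fixed window: every later word contributes, however far away.
        for j in range(i + 1, len(tokens)):
            if tokens[i] == tokens[j]:
                continue  # skip a word paired with itself
            pair = tuple(sorted((tokens[i], tokens[j])))
            gap = j - i - 1  # 0 for adjacent words
            scores[pair] += math.exp(-decay * gap)
    return dict(scores)


def strongest_links(scores, k=5):
    """Return the k highest-scoring pairs as candidate network edges."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    sample = "the cat sat on the mat and the cat slept on the mat"
    for (a, b), score in strongest_links(cooccurrence_scores(sample)):
        print(f"{a} - {b}: {score:.3f}")
```

Comparing all pairs is quadratic in the number of words, which is acceptable for a sketch; because the assumed decay makes distant contributions negligible, a practical version could stop once the contribution falls below a small threshold. The highest-scoring pairs can then serve as weighted edges of the network to be drawn as a graph or tree.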

Related resources

Initial Comparison of Linguistic Networks Measures for Parallel Texts

In this paper we compared the properties of linguistic networks for Croatian, English and Italian languages. We constructed co-occurrence networks from parallel text corpora, consisting of the translations of five books in the three languages. We generated an Erdős-Rényi random graph with the same number of nodes and links, which enabled the comparison with linguistic co-occurrence networks, sh...

Choosing the Word Most Typical in Context Using a Lexical Co-Occurrence Network

This paper presents a partial solution to a component of the problem of lexical choice: choosing the synonym most typical, or expected, in context. We apply a new statistical approach to representing the context of a word through lexical co-occurrence networks. The implementation was trained and evaluated on a large corpus, and results show that the inclusion of second-order co-occurrence relat...

Deriving a Priori Co-occurrence Probability Estimates for Object Recognition from Social Networks and Text Processing

Certain components in images can be recognized with high accuracy, for example, backgrounds such as leaves, grass, snow, sky, water. These components provide the human eye with context for identifying items in the foreground. Likewise for the machine, the identification of background should help in the recognition of foreground objects. But, in this case, the computer needs explicit lists of ob...

Rapid Understanding of Hot-Keywords of Papers on a given Theme: Based on two-mode Affiliation Network

Keeping up with trends in the literature and rapidly grasping its key points from a holistic perspective is a new challenge for both literature research and text mining. Most current theories and tools are aimed at finding a single paper or a small sample, not at gaining a rapid understanding of the hot keywords of all the papers on a given theme or topic. This...

The analysis of co-citation and word co-occurrence networks of Iranian articles in the field of dentistry

Background and Aims: Dentistry is an important profession ensuring the health of body and soul, and has a special place in the scientific productions of medical disciplines. The purpose of this study was to analyze the co-citation and word co-occurrence of Iranian research papers in the field of dentistry based on indexed documents in Web of Science from 2014 to 2018. Materials and Methods:...

Publication date: 2016