Text mapping: Visualising unstructured, structured, and time-based text collections

Authors

  • Vedran Sabol
  • Keith Andrews
  • Wolfgang Kienreich
  • Michael Granitzer
Abstract

Large collections of text documents are increasingly common, both in business and personal information environments. Tools from the field of information visualisation are being used to help users make sense of and extract useful knowledge from such collections. Flat text collections are often visualised using distance calculations between documents and subsequent (distance-preserving) projection. Distance calculations are often based on a vector space of term vectors. Projection is often achieved with a force-directed placement algorithm. Where extra information about a text collection is available, such as a topical hierarchy or some chronological ordering, it can be used to improve a visualisation. This paper gives an overview of text mapping techniques.
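The pipeline outlined in the abstract (term vectors, pairwise distances, distance-preserving projection) can be illustrated with a minimal sketch. It is not the authors' implementation. The smoothed TF-IDF weighting, the cosine distance, the simple spring update rule, and all function names and parameters (`term_vectors`, `cosine_distance`, `force_directed_layout`, `step`, `iterations`) are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def term_vectors(docs):
    """Build sparse TF-IDF term vectors (dicts) from a list of strings.

    Assumption: whitespace tokenisation and a smoothed IDF weight.
    """
    tokenised = [doc.lower().split() for doc in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(vectors):
    """Compute all pairwise cosine distances."""
    return [[cosine_distance(a, b) for b in vectors] for a in vectors]


def force_directed_layout(dist, iterations=500, step=0.05, seed=0):
    """Place items in 2D so that screen distances approximate `dist`.

    Each pair acts like a spring whose rest length is the target distance:
    pairs that are too far apart attract, pairs that are too close repel.
    """
    rng = random.Random(seed)
    n = len(dist)
    pos = [[rng.random(), rng.random()] for _ in range(n)]
    for _ in range(iterations):
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                # Positive when too far apart, negative when too close.
                force = step * (d - dist[i][j]) / d
                pos[i][0] -= force * dx
                pos[i][1] -= force * dy
                pos[j][0] += force * dx
                pos[j][1] += force * dy
    return pos


if __name__ == "__main__":
    docs = [
        "text visualisation of document collections",
        "visualisation of large text document collections",
        "force directed graph layout algorithm",
    ]
    layout = force_directed_layout(distance_matrix(term_vectors(docs)))
    for doc, (x, y) in zip(docs, layout):
        print(f"{x:6.2f} {y:6.2f}  {doc}")
```

In this sketch, documents that share many terms end up close together in the 2D layout, while documents with no terms in common are pushed apart. Every pair is visited in each iteration, so the cost grows quadratically with the number of documents; a large collection would need a more scalable placement scheme.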

Similar resources

A Reference-set Approach to Information Extraction from Unstructured, Ungrammatical Data Sources

This thesis investigates information extraction from unstructured, ungrammatical text on the Web such as classified ads, auction listings, and forum postings. Since the data is unstructured and ungrammatical, this information extraction precludes the use of rule-based methods that rely on consistent structures within the text or natural language processing techniques that rely on grammar. Inste...

Mining Association Rules from Unstructured Documents

This paper presents EART (Extract Association Rules from Text), a system for discovering association rules from collections of unstructured documents. The EART system handles text only, not images or figures. EART discovers association rules among the keywords labeling the collection of textual documents. The main characteristic of EART is that the system integrates XML technology (to transf...

Finding Novel Information in Large, Constantly Incrementing Collections of Structured Data

Project Argus addresses the problem of obtaining novel intelligence from large, constantly incrementing collections of structured data like shipping records, financial transfers, or hospital admission records. Structured data already provides intelligence analysts with a huge amount of important information. The ever-increasing capabilities of techniques to discern structure in currently unstru...

Big Scale Text Analytics and Smart Content Navigation

Identifying and exploring relevant content in growing document collections is a challenge for researchers, users, and system providers alike. Supporting this is crucial for companies offering knowledge in the form of documents as their core product. Our demo shows an intelligent way of doing guided research in big text collections, using the collection of the major scientific publisher Springer...

Keyword-Based Browsing and Analysis of Large Document Sets

Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on KDD has been concerned with structured databases, there has been little work on handling the huge amount of information that is available only in unstructured textual form. This paper describes the KDT system for Kno...

Journal:
  • Intelligent Decision Technologies

Volume 2

Publication date: 2008