Spiral me to the core: Getting a visual grasp on text corpora through clusters and keywords

Authors

  • Maren Scheffel
  • Katja Niemann
  • Sarah Leon Rojas
  • Hendrik Drachsler
  • Marcus Specht
Abstract

The amount of literature within a research domain is ever growing, making it difficult to stay on top of everything. Getting a grasp on the important topics and areas within a domain, or even knowing where to start, is often tough and tedious. This paper therefore presents a visualization, the cluster spiral, that offers a fast yet plain and simple way of exploring the content of large text collections.

Similar resources

Notes in Artificial Intelligence 7499 Subseries of Lecture Notes in Computer Science

Corpora are not easy to get a handle on. The usual way of getting to grips with text is to read it, but corpora are mostly too big to read (and not designed to be read). We show, with examples, how keyword lists (of one corpus vs. another) are a direct, practical and fascinating way to explore the characteristics of corpora, and of text types. Our method is to classify the top one hundred keywo...

VerseVis: Visualization of Spoken Features in Poetry

The exploration and analysis of literary corpora is a difficult task. Previous approaches to this problem focused on mining data directly from text. However, these solutions do not aid researchers who are interested in learning spoken features of the text, which play an important role in poetic works. VerseVis is a text visualization tool that gives users the ability to identify interesting tex...

Published vs. Postgraduate Writing in Applied Linguistics: The Case of Lexical Bundles

Abstract: Lexical bundles, as building blocks of coherent discourse, have been the subject of much research in the last two decades. While many of such studies have been mainly concerned with exploring variations in the use of these word sequences across different registers and disciplines, very few have addressed the use of some particular groups of lexical bundles within some gen...

Using it Bundles in Published and Unpublished Writings

Lexical bundles are known as important elements of coherent discourse that have been the subject of much research. While the previous research has been mainly concerned with exploring variations in the use of these word sequences across different registers and disciplines, very few studies have addressed the use of some particular groups of lexical bundles within some types of academic writing....

Getting to Know Your Corpus

Corpora are not easy to get a handle on. The usual way of getting to grips with text is to read it, but corpora are mostly too big to read (and not designed to be read). We show, with examples, how keyword lists (of one corpus vs. another) are a direct, practical and fascinating way to explore the characteristics of corpora, and of text types. Our method is to classify the top one hundred keywo...

Publication date: 2014