Compare Clouds: Visualizing Text Corpora to Compare Media Frames

Authors

  • Nicholas Diakopoulos
  • Dag Elgesem
  • Andrew Salway
  • Amy Zhang
  • Knut Hofland
Abstract

Media frames represent distinct ways of communicating about issues that are reflected in choices of key words and phrases. In this paper we develop a visualization technique and visual analytic system that enables the study of media frames across text corpora. In particular, our system allows scholars or other analysts to compare media frames in a visualization called the Compare Cloud, which explicitly maps word prevalence and context information between two corpora. We assess the error profile of the visualization layout and demonstrate the utility of the system by comparing the media discussion between mainstream media and blogs on the topic of surveillance. We report salient observations that the visualization made possible and discuss future challenges related to scalability and effective filtering to support visual frame analysis.

Similar resources

Visual Analytics of Media Frames in Online News and Blogs

Media frames define different perspectives or ways of communicating about issues and can be manifested through patterns of language use such as key words and their composition. Analytically there is interest in trying to identify new frames around issues, and to compare how types of frames vary across different news outlets, or over time. In this paper we consider these analytic needs in the co...

Learning Verb Subcategorization from Corpora: Counting Frame Subsets

We present some novel machine learning techniques for the identification of subcategorization information for verbs in Czech. We compare three different statistical techniques applied to this problem. We show how the learning algorithm can be used to discover previously unknown subcategorization frames from the Czech Prague Dependency Treebank. The algorithm can then be used to label dependents...

On Measuring the Complexity of Code-Mixing

The paper discusses the practical applicability of a Code-Mixing Index as a measurement of the level of complexity and mixing in texts written in several different languages, and contrasts it with other ways of measuring the complexity of texts. In particular, we describe the application of the proposed Index to corpora of code-mixed Indian social media texts and compare their complexity to social...

Speech Recognition and Information Retrieval: Experiments in Retrieving Spoken Documents

The Informedia Digital Video Library Project at Carnegie Mellon University is making large corpora of video and audio data available for full content retrieval by integrating natural language understanding, image processing, speech recognition and information retrieval. Information retrieval from corpora of speech recognition output is critical to the project’s success. In this paper, we out...

Using Statistical Properties to Enhance Text Categorization

Statistical properties extracted from text are useful in many areas. Identifying who authored a text or determining the category of a text are among the uses of collecting such statistics. In this paper, language-independent properties of text are studied using two categorized corpora of news articles. It is observed that the properties depend neither on the corpus nor on its size. Several interesting...

Publication date: 2015