Modeling with Structured Priors for Text - Driven Science by Michael J . Paul

نویسندگان

  • Michael J. Paul
  • Mark Dredze
  • Jason Eisner
  • Eric Horvitz
  • Hanna Wallach
چکیده

Many scientific disciplines are being revolutionized by the explosion of public data on the web and social media, particularly in health and social sciences. For instance, by analyzing social media messages, we can instantly measure public opinion, understand population behaviors, and monitor events such as disease outbreaks and natural disasters. Taking advantage of these data sources requires tools that can make sense of massive amounts of unstructured and unlabeled text. Topic models, statistical models that posit low-dimensional representations of data, can uncover interesting latent structure in large text datasets and are popular tools for automatically identifying prominent themes in text. For example, prominent themes of discussion in social media might include politics and health. To be useful in scientific analyses, topic models must learn interpretable patterns that accurately correspond to real-world concepts of interest. This thesis will introduce topic models that can encode additional structures such as factorizations, hierarchies, and correlations of topics, and can incorporate supervision and domain knowledge. For example, topics about elections and Congressional legislation are related to each other (as part of a

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SPRITE: Generalizing Topic Models with Structured Priors

We introduce SPRITE, a family of topic models that incorporates structure into model priors as a function of underlying components. The structured priors can be constrained to model topic hierarchies, factorizations, correlations, and supervision, allowing SPRITE to be tailored to particular settings. We demonstrate this flexibility by constructing a SPRITE-based model to jointly infer topic hi...

متن کامل

Factorial LDA: Sparse Multi-Dimensional Text Models

Latent variable models can be enriched with a multi-dimensional structure to consider the many latent factors in a text corpus, such as topic, author perspective and sentiment. We introduce factorial LDA, a multi-dimensional model in which a document is influenced by K different factors, and each word token depends on a K-dimensional vector of latent variables. Our model incorporates structured...

متن کامل

Drug Extraction from the Web: Summarizing Drug Experiences with Multi-Dimensional Topic Models

Multi-dimensional latent text models, such as factorial LDA (f-LDA), capture multiple factors of corpora, creating structured output for researchers to better understand the contents of a corpus. We consider such models for clinical research of new recreational drugs and trends, an important application for mining current information for healthcare workers. We use a “three-dimensional” f-LDA va...

متن کامل

Experimenting with Drugs (and Topic Models): Multi-Dimensional Exploration of Recreational Drug Discussions

Clinical research of new recreational drugs and trends requires mining current information from non-traditional text sources. In this work we support such research through the use of a multi-dimensional latent text model – factorial LDA – that captures orthogonal factors of corpora, creating structured output for researchers to better understand the contents of a corpus. Since a purely unsuperv...

متن کامل

SceneSuggest: Context-driven 3D Scene Design

We present SCENESUGGEST: an interactive 3D scene design system providing context-driven suggestions for 3D model retrieval and placement. Using a point-and-click metaphor we specify regions in a scene in which to automatically place and orient relevant 3D models. Candidate models are ranked using a set of static support, position, and orientation priors learned from 3D scenes. We show that our ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015