Who Leads Whom: Topical Lead-Lag Analysis across Corpora

Authors

  • Xiaolin Shi
  • Ramesh Nallapati
  • Jure Leskovec
  • Dan McFarland
  • Dan Jurafsky
Abstract

In this work, we study the problem of whether grant proposals lead academic publications in terms of generation of scientific ideas. This is an important computational social science question that can help us understand the dynamics of scientific innovation. We propose simple but scalable techniques for lead/lag estimation, based on LDA and time series analysis, that work on any unlabeled textual corpora with temporal information. We perform our analysis on about half a million Computer Science research paper abstracts and 20,000 successful NSF grant proposal abstracts that represent the entire field of Computer Science in the time span of 1991-2008. Our analysis, besides revealing interesting patterns, finds that the lead/lag of scientific papers with respect to grant proposals is highly topic specific.
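As a rough illustration of the kind of lead/lag estimation the abstract describes, the sketch below compares two yearly time series of a single topic's prevalence (for example, the average LDA topic proportion per year in each corpus) and picks the shift that maximizes their Pearson correlation. This is a generic cross-correlation approach and not necessarily the authors' estimator; the function name `estimate_lead_lag`, the `max_lag` parameter, and the synthetic series are all assumptions made for illustration.

```python
import numpy as np


def estimate_lead_lag(a, b, max_lag):
    """Return (lag, corr) maximizing the correlation of a[t] with b[t + lag].

    A positive lag means series `a` leads series `b` by that many time steps.
    Hypothetical helper: a plain cross-correlation search, not the paper's method.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("a and b must be 1-D series of equal length")

    best_lag, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Align the overlapping parts of the two series for this shift.
        if lag >= 0:
            x, y = a[:len(a) - lag], b[lag:]
        else:
            x, y = a[-lag:], b[:len(b) + lag]
        # Skip shifts with too little overlap or a constant segment.
        if len(x) < 3 or x.std() == 0 or y.std() == 0:
            continue
        corr = np.corrcoef(x, y)[0, 1]
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag, best_corr


# Synthetic example (assumed data): 18 yearly points, as in a 1991-2008 span.
years = np.arange(18)
grants = np.exp(-0.5 * ((years - 7) / 2.0) ** 2)  # topic peaks earlier in grants
papers = np.exp(-0.5 * ((years - 9) / 2.0) ** 2)  # same shape, two years later

lag, corr = estimate_lead_lag(grants, papers, max_lag=5)
print(f"estimated lag: {lag} years, correlation: {corr:.3f}")
```

Running the search once per topic would give a per-topic lag, which matches the abstract's observation that lead/lag is topic specific. With real data the series are short and noisy, so the estimated lag should be treated with caution.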

Similar resources

Cross-dataset Clustering: Revealing Corresponding Themes across Multiple Corpora

We present a method for identifying corresponding themes across several corpora that are focused on related, but distinct, domains. This task is approached through simultaneous clustering of keyword sets extracted from the analyzed corpora. Our algorithm extends the information bottleneck soft clustering method for a suitable setting consisting of several datasets. Experimentation with topical c...

Newspapers vs. Blogs: Who Gets the Scoop?

Blogs and formal news sources both monitor the events of the day, but with substantially different frames of reference. In this paper, we report on experiments comparing over 500,000 blog postings with the contents of 66 daily newspapers over the same six-week period. We compare the prevalence of popular topics in the blogspace and news, and in particular analyze lead/lag relationships in frequ...

LeadLag LDA: Estimating Topic Specific Leads and Lags of Information Outlets

Identifying which outlet in social media leads the rest in disseminating novel information on specific topics is an interesting challenge for information analysts and social scientists. In this work, we hypothesize that novel ideas are disseminated through the creation and propagation of new or newly emphasized key words, and therefore lead/lag of outlets can be estimated by tracking word usage...

A Linear Programming Approach for Calculation of All Stabilizing Parameters of Lead/lag Controllers

Lead/lag controllers are used extensively in industry, yet there is no straightforward and general solution to the problem of calculating all stabilizing parameters of lead/lag controllers. In this paper, a linear programming approach is proposed to calculate all stabilizing parameters of lead/lag type controllers for a given continuous-time plant of arbitrary order. The proposed method is...

Lagging Behind – The Emerging Influence of Jet Lag Symptoms on Road Safety

Road traffic accidents are the leading cause of death in international travelers. With the growth of international travel, the number of visitors who rent a vehicle upon arrival at their destination by air or by sea is expected to increase. Jet lag is a well-recognized maladaptation to international travel across multiple time zones. Little is known about the possible influence of jet lag sympt...

Publication date: 2010