Finding Similar Research Papers using Language Models

Authors

  • Germán Hurtado Martín
  • Steven Schockaert
  • Chris Cornelis
  • Helga Naessens
Abstract

The task of assessing the similarity of research papers is of interest in a variety of application contexts. It is a challenging task, however, as the full text of the papers is often not available, and similarity needs to be determined based on the papers’ abstract, and some additional features such as authors, keywords, and journal. Our work explores the possibility of adapting language modeling techniques to this end. The basic strategy we pursue is to augment the information contained in the abstract by interpolating the corresponding language model with language models for the authors, keywords and journal of the paper. This strategy is then extended by finding topics and additionally interpolating with the resulting topic models. These topics are found using an adaptation of Latent Dirichlet Allocation (LDA), in which the keywords that were provided by the authors are used to guide the process.
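The interpolation strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the additive smoothing, the interpolation weights in `WEIGHTS`, the use of KL divergence as the comparison measure, and the toy token lists are all assumptions made for illustration. Each field (abstract, authors, keywords, journal) is assumed to be available as a bag of tokens; for authors and journal, those tokens are assumed to come from text associated with them. The topic-model (LDA) extension is not covered.

```python
import math
from collections import Counter

FIELDS = ("abstract", "authors", "keywords", "journal")
# Hypothetical interpolation weights, one per field; they must sum to 1.
WEIGHTS = (0.55, 0.15, 0.2, 0.1)


def unigram_lm(tokens, vocab, alpha=0.01):
    """Unigram model over a fixed vocabulary with additive smoothing (assumed)."""
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def interpolate(models, weights):
    """Linearly interpolate language models that share one vocabulary."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    return {w: sum(lam * m[w] for lam, m in zip(weights, models))
            for w in models[0]}


def kl_divergence(p, q):
    """KL(p || q); lower values mean q is closer to p."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def paper_model(paper, vocab, weights=WEIGHTS):
    """Build one interpolated model from the per-field models of a paper."""
    return interpolate([unigram_lm(paper[f], vocab) for f in FIELDS], weights)


# Toy data, invented for illustration only.
papers = {
    "a": {"abstract": "language model similarity research papers".split(),
          "authors": "language model retrieval".split(),
          "keywords": "language model similarity".split(),
          "journal": "information retrieval".split()},
    "b": {"abstract": "language model topic papers similarity".split(),
          "authors": "language model retrieval".split(),
          "keywords": "language model topic".split(),
          "journal": "information retrieval".split()},
    "c": {"abstract": "urban housing quality indicators".split(),
          "authors": "housing policy".split(),
          "keywords": "housing indicators".split(),
          "journal": "urban studies".split()},
}

vocab = {t for p in papers.values() for f in FIELDS for t in p[f]}
models = {name: paper_model(p, vocab) for name, p in papers.items()}

for other in ("b", "c"):
    print("KL(a ||", other + "):", kl_divergence(models["a"], models[other]))
```

In this sketch, paper "a" ends up closer to "b" than to "c", because the author, keyword and journal models add shared vocabulary that the abstracts alone only partly cover. In practice the weights would have to be tuned on held-out data.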


Similar resources

Publication Venue Based Language Modeling for Expert Finding

Expert finding is a hot topic in co-author networks. Traditional language models only compare query terms with the documents of a candidate and ignore the venues (conferences or journals) in which the papers are published. In this paper we propose novel influence language models which consider the importance of the venues in which the candidates' papers are published. If the ...


Applying the Expertise Retrieval Model to Find Expert Authors

This research applied an Expertise Retrieval model to find expert authors and used evaluation methods from Information Retrieval systems to measure the performance of those models. The research is experimental. In addition, a variety of methods, including the survey method, were used in the research process. Various models were developed for finding expert authors, all built on a known ...


Gender-preferential Linguistic Elements in Applied Linguistics Research Papers: Partial Evaluation of a Model of Gendered Language

This article investigated whether the gender-preferential linguistic elements found by Argamon, Koppel, Fine and Shimoni (2003) show the same gender-linked frequencies in applied linguistics research papers written by non-native speakers of English. To this end, a sample of 32 articles from different journals was collected and the proportion of the targeted features to the whole numb...


Quality appraisal of published qualitative dental, medical and health researches in Iranian Persian language journals

BACKGROUND AND AIM: This study aimed to determine the rate of published qualitative research in the field of public health, including dental research, in Iran and to appraise its quality. METHODS: A total of 165 articles published in 170 Iranian medical journals between 2000 and 2014 were found eligible for the study. 48 papers were selected randomly. The papers were appraised by tw...


Evaluating researches on urban housing indicators in current decade based on PRISMA method

Considering the important role of housing in contemporary urban areas, evaluating urban housing quality has become one of the most popular topics in recent research. Housing has a broad conceptual scope that covers many aspects of urban life besides dwelling, such as recreation, primary schools, and play yards. The most efficient tool for achieving such purposes i...




Publication date: 2011