Editorial for the Second Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics (CLBib2017)

Authors

  • Iana Atanassova
  • Marc Bertin
  • Philipp Mayr
Abstract

The Open Access movement in scientific publishing and search engines like Google Scholar have made scientific articles more broadly accessible. During the last decade, scientific papers have become increasingly available in full text thanks to the growing number of publications on online platforms such as ArXiv, CiteSeer and Public Library of Science (PLOS). In this context, new needs arise around the processing and efficient exploitation of scientific corpora. Scientific papers are highly structured texts and display specific properties related not only to their references but also to their argumentative and rhetorical structure. Recent research in this field has concentrated on the construction of ontologies for citations and scientific articles (e.g. FaBiO and CiTO [8]) and studies of the distribution of references (see [2]). However, up to now, full-text mining has rarely been used to provide data for bibliometric analyses. While bibliometrics traditionally relies on the analysis of metadata of scientific papers (see e.g. a recent special issue on "Combining Bibliometrics and Information Retrieval", Mayr & Scharnhorst [6]), we explore the role that full-text processing of scientific papers and linguistic analyses can play in bibliometrics. The CLBib workshop series provides a forum to discuss novel approaches and insights into scientific writing that can bring new perspectives for understanding both the nature of citations and the nature of scientific articles. The possibility of enriching metadata through full-text processing of papers opens new fields of application for bibliometric studies.


Related resources

Editorial for the First Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics

The open access movement in scientific publishing and search engines like Google Scholar have made scientific articles more broadly accessible. During the last decade, the availability of scientific papers in full text has become more and more widespread thanks to the growing number of publications on online platforms such as ArXiv and CiteSeer [1]. The efforts to provide articles in machine-rea...


Editorial for the 7th Bibliometric-enhanced Information Retrieval Workshop at ECIR 2018

The Bibliometric-enhanced Information Retrieval (BIR) workshop series started at ECIR in 2014 and serves as the annual gathering of IR researchers who address various information-related tasks on scientific corpora and bibliometrics. We welcome contributions elaborating on dedicated IR systems, as well as studies revealing original characteristics on how scientific knowledge is created, com...


Technical structure of the global nanoscience and nanotechnology literature

Text mining was used to extract technical intelligence from the open source global nanotechnology and nanoscience research literature. An extensive nanotechnology/nanoscience-focused query was applied to the Science Citation Index/Social Science Citation Index (SCI/SSCI) databases. The nanotechnology/nanoscience research literature technical structure (taxonomy) was obtained using computational ...


The hidden structure of neuropsychology: text mining of the journal Cortex: 1991--2001.

BACKGROUND: The stated mission of Cortex is "the study of the inter-relations of the nervous system and behavior, particularly as these are reflected in the effects of brain lesions on cognitive functions." The purpose of this paper is to explore the relationship between the stated mission and the executed mission as reflected by the characteristics of papers published in Cortex. In addition, we...


Mining Scientific Terms and their Definitions: A Study of the ACL Anthology

This paper presents DefMiner, a supervised sequence labeling system that identifies scientific terms and their accompanying definitions. DefMiner achieves 85% F1 on a Wikipedia benchmark corpus, significantly improving the previous state-of-the-art by 8%. We exploit DefMiner to process the ACL Anthology Reference Corpus (ARC) – a large, real-world digital library of scientific articles in compu...



Publication date: 2017