Invited Talk: Domain-adaptation of Natural Language Processing Tools for RE

Author

  • Tejaswini Deoskar
Abstract

Natural language processing (NLP) tools such as part-of-speech taggers and parsers are used in a variety of applications involving natural language, including requirements engineering (RE). Such tools are based on statistical models of language that are learnt from human-annotated data with supervised machine learning algorithms. Because annotated data is limited in size and genre, these models perform worse on words or constructions not encountered in the annotated data, and on genres or domains of language that differ from the supervised training data. This talk will present Tejaswini Deoskar’s work on semi-supervised learning, in which a model initially trained on supervised data is further improved using unannotated data, which is available in much larger quantities. Such semi-supervised training improves performance on low-frequency words and constructions, i.e. those in the long tail of language use, and may also be used to adapt supervised NLP models to perform better on new domains of text, such as those used in RE documents.
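The general idea of improving a supervised model with unannotated data can be illustrated with self-training, one common semi-supervised scheme. The sketch below is not the method presented in the talk; it is a toy tagger under stated assumptions. The function names (train, predict, self_train), the two-letter suffix back-off, the 0.8 confidence threshold, and the example words and tags are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train(tagged):
    """Count tags per word and per two-letter suffix from (word, tag) pairs."""
    word_counts = defaultdict(Counter)
    suffix_counts = defaultdict(Counter)
    for word, tag in tagged:
        word_counts[word][tag] += 1
        suffix_counts[word[-2:]][tag] += 1
    return word_counts, suffix_counts


def predict(model, word):
    """Return (tag, confidence), backing off from word counts to suffix counts."""
    word_counts, suffix_counts = model
    for counts in (word_counts.get(word), suffix_counts.get(word[-2:])):
        if counts:
            tag, n = counts.most_common(1)[0]
            return tag, n / sum(counts.values())
    return None, 0.0  # nothing known about this word or its suffix


def self_train(labeled, unlabeled, threshold=0.8, rounds=3):
    """Add confident predictions on unannotated words to the training data."""
    data = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        model = train(data)
        confident, rest = [], []
        for word in pool:
            tag, conf = predict(model, word)
            (confident if conf >= threshold else rest).append((word, tag))
        if not confident:
            break  # no new confident labels, so further rounds change nothing
        data.extend(confident)
        pool = [word for word, _ in rest]
    return train(data)


# Hypothetical toy data: a tiny annotated set and some unannotated words.
labeled = [("parsing", "VBG"), ("tagging", "VBG"), ("parser", "NN"), ("tagger", "NN")]
unlabeled = ["training", "learner", "xyz"]

model = self_train(labeled, unlabeled)
print(predict(model, "training"))
print(predict(model, "xyz"))
```

After self-training, a word that was absent from the annotated data ("training") has its own entry in the model, while a word with no supporting evidence ("xyz") is left unlabeled rather than guessed.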


Similar resources

EFL Classroom Discourse in Iranian Context: Investigating Teacher Talk Adaptation to Students’ Proficiency Level

How language teachers talk is a key factor in organizing and facilitating learning specifically in language classrooms where the medium of instruction is also the subject matter. This study aimed to examine the extent and ways of teacher talk adaptation to students’ proficiency levels in the Iranian EFL context. Two EFL teachers who were teaching three different proficiency levels were observed...


Keynote: Evaluation of NLP Tools for Hairy RE Tasks

Natural language processing (NLP) has been used since the 1980s to construct tools for performing natural language (NL) requirements engineering (RE) tasks. While these NL RE tasks are not inherently difficult for humans, on the scale of the collection of NL artifacts for the development of a typical large-scale computer-based system (CBS), these tasks become unmanageable, i.e., hairy. Because ...


Proceedings of the 10th European Workshop on Natural Language Generation (ENLG-05)

Probabilistic finite-state methods have been very successful for natural language processing (NLP) problems like tagging, entity identification, and transliteration. These methods have also been packaged in very useful software toolkits. However, they are not so good for attacking problems with large-scale reordering (translation, generation, paraphrasing, question answering, etc.) and sensitiv...


Annotation Adaptation and Language Adaptation in NLP

Adaptation technologies are always useful in NLP when there is discrepancy between the training scenario and use scenario. They are also effective in alleviating the data scarcity problem. Domain adaptation is the most popular kind of adaptation technologies and is intensively researched. In this talk we will introduce two other kinds of adaptation technologies: annotation adaptation and langua...


Cross-Domain and Cross-Language Porting of Shallow Parsing

English was the main focus of attention of the Natural Language Processing (NLP) community for years. As a result, there are significantly more annotated linguistic resources in English than in any other language. Consequently, data-driven tools for automatic text or speech processing are developed mainly for English. Developing similar corpora and tools for other languages is an important issu...



Publication date: 2018