From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers

Authors

  • Ingrid Falk
  • Delphine Bernhard
  • Christophe Gérard
Abstract

In this paper we present a statistical machine learning approach to formal neologism detection that goes some way beyond the use of exclusion lists. We explore the impact of three groups of features: form-related, morpho-lexical and thematic features. The last type of feature has not yet been used in this kind of application and offers a way to access the semantic context of new words. The results suggest that form-related features help with the overall classification task, while morpho-lexical and thematic features better single out true neologisms.
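To make the contrast between an exclusion list and a feature-based approach concrete, here is a minimal sketch of the first stage only: filtering out known words and computing a few form-related features for the remaining candidates. The abstract does not list the actual features or the classifier, so the lexicon `KNOWN_WORDS`, the helper names `unknown_forms` and `form_features`, and the features themselves are assumptions chosen for illustration, not the authors' implementation.

```python
import re

# Hypothetical exclusion list; a real system would load a large lexicon.
KNOWN_WORDS = {"le", "ministre", "propose", "une", "taxe"}

# Runs of letters, optionally joined by hyphens (e.g. "cyber-taxe").
TOKEN_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def unknown_forms(text, lexicon=KNOWN_WORDS):
    """Return tokens whose lowercase form is absent from the exclusion list."""
    return [tok for tok in TOKEN_RE.findall(text) if tok.lower() not in lexicon]


def form_features(word):
    """Illustrative form-related features (assumed, not the paper's feature set)."""
    return {
        "length": len(word),
        "has_hyphen": "-" in word,
        "is_capitalized": word[0].isupper(),
        "suffix": word[-3:],
    }


candidates = unknown_forms("Le ministre propose une cyber-taxe")
features = {word: form_features(word) for word in candidates}
```

In a setup like the one the abstract describes, feature dictionaries of this kind would be combined with morpho-lexical and thematic features and passed to a trained classifier, so that unknown forms such as typos or proper nouns can be separated from true neologisms. That classification step is left out here because the abstract gives no details about it.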

Similar resources

Exploiting linguistic knowledge to infer properties of neologisms

C. Paul Cook, Doctor of Philosophy, Graduate Department of Computer Science, University of Toronto, 2010. Neologisms, or newly-coined words, pose problems for natural language processing (NLP) systems. Due to the recency of their coinage, neologisms are typically not listed in computational lexicons, dictionary-like resources that many...

Zeitgeist: A Computational Model of Neologism Processing

Language is a dynamic landscape in which words are not fixed landmarks, but fickle signposts that switch their directions as archaic senses are lost and new, more topical senses, are gained. Frequently, entirely new lexical signposts are added as newly minted word-forms enter the language. Some of these new forms are cut from whole cloth and have their origins in creative writing, movies or gam...

Using Word Alignment to Extend Multilingual Medical Terminologies

Medical terminologies such as those provided in the UMLS are never exhaustive and there is a constant need to enrich them, especially in terms of multilinguality. We present a methodology to acquire new French translations of English medical terms based on word alignment in a parallel corpus — i.e. pairing of corresponding words. We automatically collected a 27.7-million-word parallel, English-...

The Effect of Word Meaning on Speech DysFluency in Adults with Developmental Stuttering

Objectives: Stuttering is one of the most prevalent speech and language disorders. The symptomatology of stuttering has been studied from different perspectives, including biological, developmental, environmental, emotional, learning and linguistic. Previous research on English speakers has suggested that some linguistic features, such as word meaning, may play a role in the frequency of speech non...

The Significance of Education and Gender in Persian Word-selection

This study strives to investigate the importance of ‘education’ and ‘gender’, as two major sociolinguistic variables, in accepting or rejecting the words coined by the Iranian Academy of Persian Language and Literature (APLL). A total of 500 students from state universities in Tehran were chosen as subjects and provided with a questionnaire consisting of 50 APLL equivalents. The respondents’ ac...

Publication date: 2014