Using semantic roles to improve summaries

2011

Diana Trandabat

This paper describes a preliminary analysis of the influence of semantic roles on summary generation. The proposed method involves three steps: first, the named entities in the original text are identified using a named entity recognizer; secondly, the sentences are parsed and semantic roles are extracted; thirdly, selection of the sentences containing specific semantic roles for the most rel...

Exploring Predicate-Argument Relations for Named Entity Recognition in the Molecular Biology Domain

2005

Tuangthong Wattarujeekrit, Nigel Collier

In this paper, the semantic relationships between a predicate and its arguments in terms of semantic roles are employed to improve lexical-based named entity recognition (NER) in the molecular biology domain. The semantic roles were realized in various sets of syntactic features used by a machine learning model to explore what should be the efficient way in allowing this knowledge to provide th...

Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition

2011

Stefan Rüd, Massimiliano Ciaramita, Jens Müller, Hinrich Schütze

We use search engine results to address a particularly difficult cross-domain language processing task, the adaptation of named entity recognition (NER) from news text to web queries. The key novelty of the method is that we submit a token with context to a search engine and use similar contexts in the search results as additional information for correctly classifying the token. We achieve stro...

Lemmatization of Multi-word Common Noun Phrases and Named Entities in Polish

2017

Michal Marcinczuk

In this paper we present PoLem, a tool for lemmatization of multi-word common noun phrases and named entities in Polish. The tool is based on a set of manually crafted rules and heuristics that use a set of dictionaries (including morphological, named entities and inflection patterns). The accuracy of lemmatization obtained by the tool reached 97.99% on a dataset with multi-word common ...

Learning Named Entity Recognition in Portuguese from Spanish

2005

Thamar Solorio, Aurelio López-López

We present a practical method for adapting a NER system for Spanish to Portuguese. The method is based on training a machine learning algorithm, namely C4.5, using internal and external features. The external features are provided by a NER system for Spanish, while the internal features are automatically extracted from the documents. The experimental results show that the method performs...

Improving Named Entity Recognition using Annotated Corpora

2000

Mark Stevenson, Robert Gaizauskas

Lists of names are an important knowledge source for many systems which carry out named entity recognition. It is shown that augmenting hand-crafted lists with those derived from corpora can improve their performance. Two methods for improving automatically acquired lists are presented. The best corpus-derived lists are shown to out-perform the hand-crafted ones by 4%.

ONER: Tool for Organization Named Entity Recognition from Affiliation Strings in PubMed Abstracts

Journal: CoRR 2009

Siddhartha Jonnalagadda, Philip Topham, Graciela Gonzalez

Automatically extracting organization names from the affiliation sentences of articles related to biomedicine is of great interest to the pharmaceutical marketing industry, health care funding agencies and public health officials. It will also be useful for other scientists in normalizing author names, automatically creating citations, indexing articles and identifying potential resources or co...

Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text

2005

Einat Minkov, Richard C. Wang, William W. Cohen

There has been little prior work on Named Entity Recognition for "informal" documents like email. We present two methods for improving the performance of person name recognizers for email: email-specific structural features and a recall-enhancing method which exploits name repetition across multiple documents.

Structured Named Entities in two distinct press corpora: Contemporary Broadcast News and Old Newspapers

2012

Sophie Rosset, Cyril Grouin, Karën Fort, Olivier Galibert, Juliette Kahn, Pierre Zweigenbaum

This paper compares the reference annotation of structured named entities in two corpora with different origins and properties. It addresses two questions linked to such a comparison. On the one hand, what specific issues were raised by reusing the same annotation scheme on a corpus that differs from the first in terms of media and that predates it by more than a century? On the other hand, wha...

Improving Named Entity Recognition in Tweets via Detecting Non-Standard Words

2015

Chen Li, Yang Liu

Most previous work on text normalization for informal text made the strong assumption that the system already knows which tokens are non-standard words (NSW) and thus need normalization. However, this is not realistic. In this paper, we propose a method for NSW detection. In addition to the information based on the dictionary, e.g., whether a word is out-of-vocabulary (OOV), we leverage novel i...
