The GNAT library for local and remote gene mention normalization
نویسندگان
چکیده
منابع مشابه
The GNAT library for local and remote gene mention normalization
SUMMARY Identifying mentions of named entities, such as genes or diseases, and normalizing them to database identifiers have become an important step in many text and data mining pipelines. Despite this need, very few entity normalization systems are publicly available as source code or web services for biomedical text mining. Here we present the Gnat Java library for text retrieval, named enti...
متن کاملGene mention normalization in full texts using GNAT and LINNAEUS
Gene mention normalization (GN) refers to the automated mapping of gene names to a unique identifier, such as an NCBI Entrez Gene ID. Such knowledge helps in indexing and retrieval, linkage to additional information (such as sequences), database curation, and data integration. We present here an ensemble system encompassing LINNAEUS for recognizing organism names and GNAT for recognition and no...
متن کاملMe and my friends: gene mention normalization with background knowledge
“Tell me who your friends are, and I will tell you who you are” – this proverb best illustrates our approach to the normalization of gene names. In this approach, we rely on background knowledge that describes various aspects of a gene: it is localized on a chromosomal band, it belongs to an operon structure, it is a member of a gene family, its products take part in biological processes, they ...
متن کاملInter-species normalization of gene mentions with GNAT
MOTIVATION Text mining in the biomedical domain aims at helping researchers to access information contained in scientific publications in a faster, easier and more complete way. One step towards this aim is the recognition of named entities and their subsequent normalization to database identifiers. Normalization helps to link objects of potential interest, such as genes, to detailed informatio...
متن کاملIntegrated cTAKES for Concept Mention Detection and Normalization
We participated Task 1 using an existing system MedTagger implemented in integrated cTAKES (icTAKES). The concept mention detection is based on Conditional Random Fields (CRF) and the concept mention normalization is based on a greedy dictionary lookup algorithm. A distinctive feature in MedTagger compared to other concept mention detection systems is the incorporation of dictionary lookup resu...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Bioinformatics
سال: 2011
ISSN: 1367-4803,1460-2059
DOI: 10.1093/bioinformatics/btr455