A Bootstrapping Approach to Symptom Entity Extraction on Chinese Electronic Medical Records
نویسندگان
چکیده
Symptom entities are widely distributed in Chinese electronic medical records. Previous approaches on symptom entity extraction usually extract continuous strings as symptom entities and require massive human efforts on corpus annotation. We describe the symptom entity as two-tuples of and design a soft pattern matching method to locate them in sentences in the EMR. Our bootstrapping approach which only requires a few annotated symptom tuples and it allows iterative extraction from mass electronic medical record databases without human supervision. Furthermore, the described method annotates symptom entities in EMR by the extracted tuples. Starting with 60 annotated entities, our approach reached an F value of 81.40% in the extraction task of 3,150 entities from 992 sets of electronic medical records.
منابع مشابه
Data-Driven Information Extraction from Chinese Electronic Medical Records
OBJECTIVE This study aims to propose a data-driven framework that takes unstructured free text narratives in Chinese Electronic Medical Records (EMRs) as input and converts them into structured time-event-description triples, where the description is either an elaboration or an outcome of the medical event. MATERIALS AND METHODS Our framework uses a hybrid approach. It consists of constructin...
متن کاملAutomatic medical concept extraction from free text clinical reports, a new named entity recognition approach
Actually in the Hospital Information Systems, there is a wide range of clinical information representation from the Electronic Health Records (EHR), and most of the information contained in clinical reports is written in natural language free text. In this context, we are researching the problem of automatic clinical named entities recognition from free text clinical reports. We are using Snome...
متن کاملSelf-Adjustable BootStrapping for Web-Scale Named Entity Extraction using N-grams
Named Entity Extraction refers to task of identifying and extracting mentions of names like person names, locations, time expressions, monetary values etc from text. There have different approaches to Named Entity extraction and classification based on supervised and semi-supervised learning. This paper describes a bootstrapping approach to extracing Named Entities for 150 categories from Wikip...
متن کاملUnsupervised Extraction of Diagnosis Codes from EMRs Using Knowledge-Based and Extractive Text Summarization Techniques
Diagnosis codes are extracted from medical records for billing and reimbursement and for secondary uses such as quality control and cohort identification. In the US, these codes come from the standard terminology ICD-9-CM derived from the international classification of diseases (ICD). ICD-9 codes are generally extracted by trained human coders by reading all artifacts available in a patient's ...
متن کاملAutomatic Discovery of Named Entity Variants: Grammar-driven Approaches to Non-Alphabetical Transliterations
Identification of transliterated names is a particularly difficult task of Named Entity Recognition (NER), especially in the Chinese context. Of all possible variations of transliterated named entities, the difference between PRC and Taiwan is the most prevalent and most challenging. In this paper, we introduce a novel approach to the automatic extraction of diverging transliterations of foreig...
متن کامل