HokuMed in NTCIR-11 MedNLP-2: Automatic Extraction of Medical Complaints from Japanese Health Records Using Machine Learning and Rule-based Methods

Authors

  • Magnus Ahltorp
  • Hideyuki Tanushi
  • Shiho Kitajima
  • Maria Skeppstedt
  • Rafal Rzepka
  • Kenji Araki
Abstract

A conditional random fields (CRF) model was trained to detect medical complaints in Japanese health record text. The text was tokenised with the dependency parser CaboCha, and the CRF model was trained on tokens within a window of two preceding and three following tokens, as well as on part-of-speech, vocabulary mapping, header name, frequent suffix, orthography and the presence of a modality cue. Modality detection relied on dictionaries of cues for negation, suspicion and family. The scope of negation and suspicion cues was determined by rules relying on the output of CaboCha. Negation and family cues were gathered from the development corpus, while suspicion cues were obtained by translating English cues. The best result achieved for recognizing complaints was a precision of 87% and a recall of 77%. For modality detection, positive was detected with a precision of 87% and a recall of 77%, negation with a precision of 76% and a recall of 69%, suspicion with a precision of 49% and a recall of 51%, and family with a precision of 78% and a recall of 81%.
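
The token window described in the abstract (two preceding and three following tokens) can be illustrated with a small sketch. This is not the authors' code: the function name `window_features`, the feature keys and the `<PAD>` marker are assumptions made for illustration, and the other features (part-of-speech, vocabulary mapping, header name, suffix, orthography, modality cue) are left out.

```python
from typing import Dict, List


def window_features(
    tokens: List[str], i: int, before: int = 2, after: int = 3
) -> Dict[str, str]:
    """Return surface-form features for token i and its neighbours.

    The defaults mirror the window in the abstract: two preceding and
    three following tokens. Positions outside the sentence are filled
    with a hypothetical "<PAD>" marker.
    """
    feats = {"tok[0]": tokens[i]}
    for offset in range(-before, after + 1):
        if offset == 0:
            continue  # the current token is already stored
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"tok[{offset:+d}]"] = tokens[j] if inside else "<PAD>"
    return feats


if __name__ == "__main__":
    # Placeholder tokens; real input would be CaboCha output.
    sentence = ["a", "b", "c", "d", "e", "f"]
    print(window_features(sentence, 2))
```

A dictionary of this kind, built per token, is the usual input format for CRF toolkits; which toolkit the authors used is not stated here.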

Similar resources

HCRL at NTCIR-11 MedNLP-2 Task

This year’s MedNLP-2 [1] has two tasks: Extraction task (Task 1) and Normalization task (Task 2). We tested both machine learning based methods and an ad-hoc rule-based method for the two tasks. For the Extraction Task, a two-stage approach (first, the machine learning based method is applied to identify c tags, and second, the rule-based method is applied to modality features) obtained higher ...

kyoto: Kyoto University Baseline at the NTCIR-11 MedNLP-2 Task

Since more electronic records are now used at medical scenes, the importance of technical development for analyzing such electronically provided information has been increasing significantly. This NTCIR-11 MedNLP-2 Task is designed to meet this situation. This task is a shared task that evaluates natural language processing technologies especially on Japanese medical texts. The task has three s...

Developing ML-based Systems to Extract Medical Information from Japanese Medical History Summaries

With the increase of the number of medical records written in an electronic format, natural language processing techniques in the medical domain have become more and more important. For the purpose of the development and evaluation of machine learning-based systems to extract medical information, we recently participated in the NTCIR-10 MedNLP task. The task focused on Japanese medical records ...

BARY at the NTCIR-11 MedNLP-2 Task for Complaints and Diagnosis Recognition

This paper describes a machine-learning based approach to recognizing diagnosed disease names and corresponding temporal expressions. Using CRFs (conditional random fields) to learn and predict tags, the systems described in this paper are characterized by a character-level formulation and heuristic features extracted from medical terminologies. Experimental results on the NTCIR-11 MedNLP-2 dat...

SCT-D3 at the NTCIR-11 MedNLP-2 Task

The SCT-D3 team participated in the Extraction of Complaint and Diagnosis subtask and the Normalization of Complaint and Diagnosis subtask of the NTCIR-11 MedNLP2 Task. We tackled the two subtasks by using machine learning techniques and additional medical term dictionaries. This report outlines the methods we used to obtain our experimental results, and describes our practical evaluation.

Publication date: 2014