A hybrid solution for extracting structured medical information from unstructured data in medical records via a double-reading/entry system

Authors

  • Ligang Luo
  • Liping Li
  • Jiajia Hu
  • Xiaozhe Wang
  • Boulin Hou
  • Tianze Zhang
  • Lue Ping Zhao
Abstract

BACKGROUND Healthcare providers around the world generate huge amounts of biomedical data, stored either in legacy (paper-based) systems or in electronic medical records (EMR); collectively these are referred to as big biomedical data (BBD). To realize the promise of BBD for clinical use and research, an essential step is to extract key data elements from unstructured medical records into patient-centered electronic health records with computable data elements. Our objective is to introduce a novel solution, a double-reading/entry system (DRESS), for extracting clinical data from unstructured medical records (MR) and creating a semi-structured electronic health record database, and to demonstrate its reproducibility empirically.
METHODS Using modern cloud-based technologies, we have developed a comprehensive system comprising multiple subsystems: capturing MRs in clinics, securely transferring MRs, storing and managing MRs in the cloud, facilitating both machine learning and manual reading, and performing iterative quality control before committing the semi-structured data to the target database. To evaluate the reproducibility of the medical data elements extracted by DRESS, we conducted a blinded reproducibility study with 100 MRs from patients who had undergone surgical treatment for lung cancer in China. The study uses the Kappa statistic to measure concordance of discrete variables and the correlation coefficient to measure reproducibility of continuous variables.
RESULTS Using DRESS, we have demonstrated the feasibility of extracting clinical data from unstructured MRs to create a semi-structured, patient-centered electronic health record database. The reproducibility study with 100 patients' MRs showed a high overall reproducibility of 98%, which varies across six modules (pathology, radio/chemotherapy, clinical examination, surgery information, medical image, and general patient information).
CONCLUSIONS DRESS uses double reading, double entry, and independent adjudication to manually curate structured data elements from unstructured clinical data. Further, through distributed computing strategies, DRESS protects data privacy by dividing MR data into de-identified modules. Finally, through internet-based cloud computing, DRESS enables many data specialists to work in a virtual environment and achieve the scale needed to process thousands of MRs within days. This hybrid system is probably a workable solution to the big medical data challenge.
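
The abstract names the Kappa statistic for discrete variables, the correlation coefficient for continuous variables, and a double-entry step followed by adjudication. The sketch below illustrates those three ideas in plain Python. It is not the DRESS implementation: it assumes Cohen's kappa for two readers and Pearson's correlation (the abstract does not say which variants were used), and the function names, field names, and sample values are hypothetical.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers' entries of one discrete variable.

    Assumption: two readers, unweighted kappa.
    """
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must supply the same non-zero number of entries")
    n = len(reader_a)
    # Observed agreement: fraction of records where the entries match.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: product of each reader's marginal category frequencies.
    counts_a = Counter(reader_a)
    counts_b = Counter(reader_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used the same single category throughout
    return (observed - expected) / (1 - expected)


def pearson_r(x, y):
    """Pearson correlation between two readers' entries of a continuous variable."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one reader's values are constant")
    return cov / sqrt(var_x * var_y)


def fields_needing_adjudication(entry_a, entry_b):
    """Return the field names on which two independent entries disagree."""
    all_fields = entry_a.keys() | entry_b.keys()
    return sorted(f for f in all_fields if entry_a.get(f) != entry_b.get(f))


# Hypothetical sample data, for illustration only.
histology_a = ["adeno", "adeno", "squamous", "squamous"]
histology_b = ["adeno", "adeno", "squamous", "adeno"]
kappa = cohen_kappa(histology_a, histology_b)

tumor_size_a = [1.0, 2.0, 3.0]
tumor_size_b = [2.0, 4.0, 6.0]
r = pearson_r(tumor_size_a, tumor_size_b)

disputed = fields_needing_adjudication(
    {"stage": "IIA", "side": "left"},
    {"stage": "IIB", "side": "left"},
)
```

In this sketch, only the fields returned by `fields_needing_adjudication` would go to an independent adjudicator, while agreeing fields could be committed directly. Whether DRESS routes disagreements this way is an assumption.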


Similar resources

Developing a Standardized Medical Speech Recognition Database for Reconstructive Hand Surgery

Fast and holistic access to the patients’ clinical record is a major requirement of modern medical decision support systems (DSS). While electronic health records (EHRs) have replaced the traditional paper-based records in most healthcare organizations, data entry into these systems remains largely manual. Speech recognition technology promises substitution of the more convenient speech-base...


Structured data entry for narrative data in a broad specialty: patient history and physical examination in pediatrics

BACKGROUND Whereas an electronic medical record (EMR) system can partly address the limitations of paper-based documentation, such as fragmentation of patient data, missing physical paper records, and poor legibility, structured data entry (SDE, i.e. data entry based on selection of predefined medical concepts) is essential for uniformity of data, easier reporting, decision support, quality ass...


The Content and Structure of Electronic Personal Health Records: A Systematic Review

Introduction: The electronic Personal Health Record (ePHR) improves people’s awareness and care management and leads to health promotion. One of the most important factors that contributes to the development of ePHR is identifying and understanding its content and structure. No comprehensive studies have so far been performed on the content and structure of ePHRs. Therefore, the purpose of this...

Minimum data set for electronic health card of schizophrenia

Purpose: Having a clinical information system is a good solution for monitoring medical problems. This system is designed to improve the speed and accuracy of data management. The goal is to replace medical records with a clinical information system to support storing, processing and distributing data in all the sections of a healthcare center. The purpose of this research was to determine ...



Volume 16

Publication date: 2016