Semi-Supervised Learning for Electronic Phenotyping in Support of Precision Medicine
Authors
Abstract
Medical informatics plays an important role in precision medicine, delivering the right information to the right person at the right time. With the introduction and widespread adoption of electronic medical records, in the United States and worldwide, there is now a tremendous amount of health data available for analysis. Electronic record phenotyping refers to the task of determining, from an electronic medical record entry, a concise descriptor of the patient, comprising their medical history, current problems, presentation, etc. In inferring such a phenotype descriptor from the record, a computer, in a sense, “understands” the relevant parts of the record. These phenotypes can then be used in downstream applications such as cohort selection for retrospective studies, real-time clinical decision support, contextual displays, intelligent search, and precise alerting mechanisms. We face three main challenges. First, the unstructured and incomplete nature of the data recorded in electronic medical records requires special attention: relevant information can be missing or written in a way that the computer cannot interpret. Second, the scale of the data makes it important to develop efficient methods at all steps of the machine learning pipeline, including data collection and labeling, model learning, and inference. Third, large parts of medicine are well understood by health professionals, which raises the question of how to combine the expert knowledge of specialists with the statistical insights from the electronic medical record. Probabilistic graphical models such as Bayesian networks provide a useful abstraction for
Similar resources
Towards Patient-Driven Phenotyping and Similarity for Precision Medicine
Clinical phenotyping provides important insight into the manifestation and outcome of rare and complex diseases. Traditional phenotyping techniques often require multiple iterations of refinement with a domain expert, lack interoperability, and have limited reproducibility. In comparison, patient similarity-based techniques derive personalized patient risk models that are highly accurate, even ...
Electronic phenotyping with APHRODITE and the Observational Health Sciences and Informatics (OHDSI) data network
The widespread usage of electronic health records (EHRs) for clinical research has produced multiple electronic phenotyping approaches. Methods for electronic phenotyping range from those needing extensive specialized medical expert supervision to those based on semi-supervised learning techniques. We present Automated PHenotype Routine for Observational Definition, Identification, Training and...
Semi-supervised Learning for Phenotyping Tasks
Supervised learning is the dominant approach to automatic electronic health records-based phenotyping, but it is expensive due to the cost of manual chart review. Semi-supervised learning takes advantage of both scarce labeled and plentiful unlabeled data. In this work, we study a family of semi-supervised learning algorithms based on Expectation Maximization (EM) in the context of several phen...
Semi-Supervised Learning of the Electronic Health Record with Denoising Autoencoders for Phenotype Stratification
Patient interactions with health care providers result in entries to electronic health records (EHRs). EHRs were built for clinical and billing purposes but contain many data points about an individual. Mining these records provides opportunities to extract electronic phenotypes that can be paired with genetic data to identify genes underlying common human diseases. This task remains challengin...
Evaluation of Semantic Web Technologies for Storing Computable Definitions of Electronic Health Records Phenotyping Algorithms
Electronic Health Records are electronic data generated during or as a byproduct of routine patient care. Structured, semi-structured and unstructured EHR offer researchers unprecedented phenotypic breadth and depth and have the potential to accelerate the development of precision medicine approaches at scale. A main EHR use-case is defining phenotyping algorithms that identify disease status, ...