Identifying Segment Topics in Medical Dictations
Abstract
In this paper, we describe the use of lexical and semantic features for topic classification in dictated medical reports. First, we use SVM classification to assign whole reports to coarse work-type categories. Next, we identify text segments and their topics in the output of automatic speech recognition by assigning a work-type-specific topic label to each word, based on features extracted from a sliding context window; this step again uses SVM classification and draws on semantic features. Finally, classifier stacking is applied for a posteriori error correction, yielding a further improvement in classification accuracy.
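The per-word windowing and the stacking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the window half-width, the feature naming scheme (`w=`, `l=`), the topic labels, and the sample sentence are all assumptions chosen for illustration, and only lexical (bag-of-words) features are shown, not the semantic features the paper mentions. The resulting feature dictionaries are the kind of input one might vectorize and pass to an SVM.

```python
from collections import Counter
from typing import Dict, List


def window_features(tokens: List[str], i: int, half_width: int = 2) -> Dict[str, int]:
    """Bag-of-words counts for tokens within half_width positions of index i."""
    lo = max(0, i - half_width)
    hi = min(len(tokens), i + half_width + 1)
    return dict(Counter(f"w={t.lower()}" for t in tokens[lo:hi]))


def stacked_features(labels: List[str], i: int, half_width: int = 2) -> Dict[str, int]:
    """Second-stage features: counts of first-stage labels predicted around index i."""
    lo = max(0, i - half_width)
    hi = min(len(labels), i + half_width + 1)
    return dict(Counter(f"l={lab}" for lab in labels[lo:hi]))


# Hypothetical recognizer output and hypothetical first-stage predictions;
# position 2 holds an isolated label that disagrees with its neighbours.
tokens = "patient denies chest pain plan continue aspirin".split()
first_stage = ["history", "history", "plan", "history", "plan", "plan", "plan"]

# Stage 1 input: one feature dictionary per word.
stage1_inputs = [window_features(tokens, i) for i in range(len(tokens))]

# Stage 2 input: the label context that a stacked classifier could use
# to revise isolated first-stage errors.
stage2_inputs = [stacked_features(first_stage, i) for i in range(len(first_stage))]
```

In this sketch, the second-stage features at position 2 show that the surrounding labels are mostly `history`, which is the kind of evidence a stacked classifier could learn to use when correcting a single outlying prediction.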
Similar resources
Revealing the Structure of Medical Dictations with Conditional Random Fields
Automatic processing of medical dictations poses a significant challenge. We approach the problem by introducing a statistical framework capable of identifying types and boundaries of sections, lists and other structures occurring in a dictation, thereby gaining explicit knowledge about the function of such elements. Training data is created semiautomatically by aligning a parallel corpus of co...
Improving language model perplexity and recognition accuracy for medical dictations via within-domain interpolation with literal and semi-literal corpora
We propose a technique for improving language modeling for automated speech recognition of medical dictations by interpolating finished text (25M words) with small human-generated literal and/or machine-generated semiliteral corpora. By building and testing interpolated language models (ILM) with literal (LILM), semiliteral (SILM) and partial (PILM) corpora, we show that both perplexity and recognition results ...
Modeling Filled Pauses in Medical Dictations
Filled pauses (FPs) are characteristic of spontaneous speech and can present considerable problems for speech recognition because they are often recognized as short words. An "um" can be recognized as "thumb" or "arm" if the recognizer's language model does not adequately represent FPs. Recognition of quasi-spontaneous speech (medical dictation) is subject to this problem as well. Results from medical dictations ...
Automatic Prediction of Trauma Registry Procedure Codes from Emergency Room Dictations
Current natural language processing techniques for recognition of concepts in the electronic medical record have been insufficient to allow their broad use for coding information automatically. We have undertaken a preliminary investigation into the use of machine learning methods to recognize procedure codes from emergency room dictations for a trauma registry. Our preliminary results indicate...
Shifting indirect patient care duties to after hours in the era of work hours restrictions.
PURPOSE: Few data describe how often residents defer indirect patient care tasks to after hours or show whether residents report this time in duty hours logs. Thus, the authors examined how often residents perform one such task, discharge dictation, outside scheduled hours. METHOD: The authors tracked all discharge summaries dictated by internal medicine residents at a single teaching hospital ...