Supervised Classification of Healthcare Text Data Based on Context-Defined Categories

Abstract

Achieving a good success rate in supervised classification analysis of a text dataset, where the relationship between the text and its label can be extracted from context, but not from isolated words in the text, is still an important challenge facing the fields of statistics and machine learning. For this purpose, we present a novel mathematical framework. We then conduct a comparative study of established methods for the case where the relationship between the text and the corresponding label is clearly depicted by specific words in the text. In particular, we use logistic LASSO, artificial neural networks, support vector machines, and decision-tree-like procedures. This methodology is applied to a real case study involving mapping Consolidated Framework for Implementation Research (CFIR) constructs to health-related text data, and achieves a prediction success rate of over 80% when just the first 55% of the text, or more, is used for training and the remaining text for testing. The results indicate that the methodology can be useful to accelerate the CFIR coding process.
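
The evaluation protocol mentioned in the abstract (train on the first portion of the text, test on the remainder) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the passages and labels are invented toy data, the function names are hypothetical, and the classifier is a simple word-overlap baseline rather than the logistic LASSO, neural network, support vector machine, or tree-based methods the study compares.

```python
from collections import Counter, defaultdict

def sequential_split(items, train_fraction=0.55):
    """Split items in document order: the first part trains, the rest tests."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]

def train_word_counts(labelled):
    """Count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in labelled:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Pick the label whose training words overlap most with the text."""
    words = text.lower().split()
    return max(counts, key=lambda label: sum(counts[label][w] for w in words))

def success_rate(counts, labelled):
    """Fraction of passages whose predicted label matches the true label."""
    hits = sum(predict(counts, text) == label for text, label in labelled)
    return hits / len(labelled)

# Hypothetical toy passages; the labels are invented, not CFIR constructs from the study.
passages = [
    ("staff lacked time and funding", "resources"),
    ("leaders supported the new program", "leadership"),
    ("funding and equipment were limited", "resources"),
    ("the director and leaders endorsed it", "leadership"),
    ("more funding and staff were needed", "resources"),
    ("leaders championed the change", "leadership"),
    ("no funding for extra staff", "resources"),
    ("leaders backed the rollout", "leadership"),
    ("equipment and funding ran short", "resources"),
]

train, test = sequential_split(passages, train_fraction=0.55)
model = train_word_counts(train)
rate = success_rate(model, test)
print(f"trained on {len(train)} passages, tested on {len(test)}, success rate {rate:.2f}")
```

The split is sequential rather than random because the abstract describes using the first part of the text for training; a shuffled split would be a different protocol.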

Similar resources

OmniCat: Automatic Text Classification with Dynamically Defined Categories

We present OmniCat, an ontology-based text categorization method that classifies documents into a dynamically defined set of categories specified as contexts in the domain ontology. The method does not require a training set and is based on measuring the semantic similarity of the thematic graph created from a text document and the ontology fragments created by the projection of the defined con...

Text Classification Based On Manifold Semi-Supervised Support Vector Machine

This article presents a solution along with experimental results for an application of semi-supervised machine learning techniques and improvement on the SVM (Support Vector Machine) based on geodesic model to build text classification applications for Vietnamese language. The objective here is to improve the semi-supervised machine learning by replacing the kernel function of SVM using geodesi...

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

Data stream is a sequence of data generated from various information sources at a high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, commonly the stream is divided into fixed size windows or gradual forgetting is used. Concept drift refer...

Text Summarization Based on Conceptual Data Classification

In this paper, we present an original approach for text summarization using conceptual data classification. We show how a given text can be summarized without losing meaningful knowledge and without using any semantic or grammatical concepts. In fact, conceptual data classification is used to extract the most interacting sentences from the main text and to ignore the other meaningless sentences in ...

Semi-supervised Collaborative Text Classification

Most text categorization methods require the text content of documents, which is often difficult to obtain. We consider “Collaborative Text Categorization”, where each document is represented by the feedback from a large number of users. Our study focuses on the semi-supervised case in which one key challenge is that a significant number of users have not rated any labeled document. To address this pr...

Journal

Journal title: Mathematics

Year: 2022

ISSN: 2227-7390

DOI: https://doi.org/10.3390/math10122005