Evaluation of Logistic Regression Model with Feature Selection Methods on Medical Dataset
Abstract
Logistic regression enables us to investigate the relationship between a categorical outcome and a set of explanatory variables. The outcome, or response, can be either dichotomous (yes, no) or ordinal (low, medium, high). For a dichotomous response, we perform standard logistic regression, and for an ordinal response, [...]. The model uses the standard logistic regression formula with feature selection by forward selection and backward elimination, and its effectiveness has been evaluated on publicly available medical datasets. The evaluation proceeds as follows: the feature selection algorithm, using forward selection and backward elimination, is applied to the dataset, and the selected features [...] performance of the predictive model. The experimental results show that the logistic regression model with feature selection by forward selection and backward elimination gives more reliable results than the plain logistic regression model.
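The evaluation procedure described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: it assumes scikit-learn, uses its bundled Wisconsin breast cancer data as a stand-in for "publicly available medical datasets", restricts the data to the first ten ("mean") features to keep the run short, and fixes the number of selected features at five. The helper names `make_model` and `evaluate` are hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the scikit-learn breast cancer data stands in for the
# medical datasets; only the first ten ("mean") features are used.
data = load_breast_cancer()
X = data.data[:, :10]
y = data.target
names = list(data.feature_names[:10])


def make_model():
    """Standardize the inputs, then fit a standard logistic regression."""
    return make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))


def evaluate(direction, n_features=5):
    """Select features greedily, then score a model on the selected subset.

    direction="forward" adds one feature at a time;
    direction="backward" starts from all features and removes one at a time.
    """
    selector = SequentialFeatureSelector(
        make_model(),
        n_features_to_select=n_features,  # assumed value, not from the source
        direction=direction,
        cv=5,
    )
    selector.fit(X, y)
    mask = selector.get_support()
    chosen = [name for name, keep in zip(names, mask) if keep]
    score = cross_val_score(make_model(), X[:, mask], y, cv=5).mean()
    return chosen, score


# Baseline: logistic regression on all ten features, no selection.
baseline = cross_val_score(make_model(), X, y, cv=5).mean()
results = {d: evaluate(d) for d in ("forward", "backward")}

print(f"all features: accuracy={baseline:.3f}")
for direction, (chosen, score) in results.items():
    print(f"{direction}: accuracy={score:.3f}, features={chosen}")
```

Because selection and scoring here share the same data, the reported accuracies are optimistic; a stricter comparison would nest the selector inside the cross-validation loop or hold out a separate test set.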
Similar resources
Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets
Objective(s): This study addresses feature selection for breast cancer diagnosis. The present process uses a wrapper approach using GA-based on feature selection and PS-classifier. The results of experiment show that the proposed model is comparable to the other models on Wisconsin breast cancer datasets. Materials and Methods: To evaluate effectiveness of proposed feature selection method, we ...
Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem
Application of data mining methods as a decision support system has a great benefit to predict survival of new patients. It also has a great potential for health researchers to investigate the relationship between risk factors and cancer survival. But due to the imbalanced nature of datasets associated with breast cancer survival, the accuracy of survival prognosis models is a challenging issue...
Classification with correlated features: unreliability of feature ranking and solutions
Motivation: Classification and feature selection of genomics or transcriptomics data is often hampered by the large number of features as compared with the small number of samples available. Moreover, features represented by probes that either have similar molecular functions (gene expression analysis) or genomic locations (DNA copy number analysis) are highly correlated. Classical model selecti...
Feature Extraction and Efficiency Comparison Using Dimension Reduction Methods in Sentiment Analysis Context
Nowadays, users can share their ideas and opinions with widespread access to the Internet and especially social networks. On the other hand, the analysis of people's feelings and ideas can play a significant role in the decision making of organizations and producers. Hence, sentiment analysis or opinion mining is an important field in natural language processing. One of the most common ways to ...
The Prediction of Booking Destination On Airbnb Dataset
This report presents an analysis of the Airbnb dataset and the model we built for the prediction task on it. The dataset comes from an ongoing Kaggle competition supported by Airbnb. We first did some comprehensive analysis on the dataset, explored most features and collected all features we thought were useful. Then we described and interpreted the prediction task and the evaluation met...