A Copula Statistic for Measuring Nonlinear Dependence with Application to Feature Selection in Machine Learning

Authors

  • Mohsen Ben Hassine
  • Lamine Mili
  • Kiran Karra
Abstract

Feature selection in machine learning aims to find the best subset of input variables, one that reduces the computational requirements and improves predictor performance. This paper introduces a new index based on empirical copulas, termed the Copula Statistic (CoS), to assess the strength of statistical dependence and to test for statistical independence. The paper shows that this test exhibits higher statistical power than other indices. Finally, the CoS is applied to feature selection in machine learning problems, which demonstrates its good performance.

Keywords: Copula; multivariate dependence; nonlinear systems; feature selection; machine learning
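As a rough illustration of the empirical copula on which the abstract says the CoS is built, the following is a minimal sketch in plain Python. It is not the CoS itself, whose definition is not given on this page. The helper name `empirical_copula` is hypothetical, and the sketch assumes paired samples without ties.

```python
def empirical_copula(x, y, u, v):
    """Empirical copula C_n(u, v) of paired samples x and y.

    Hypothetical helper for illustration only; assumes no ties in x or y.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    def scaled_ranks(values):
        # Rank each observation from 1 to n, then scale to (0, 1].
        order = sorted(range(n), key=lambda i: values[i])
        ranks = [0.0] * n
        for position, index in enumerate(order, start=1):
            ranks[index] = position / n
        return ranks

    rx = scaled_ranks(x)
    ry = scaled_ranks(y)
    # Fraction of sample points whose scaled ranks both lie at or below (u, v).
    count = sum(1 for a, b in zip(rx, ry) if a <= u and b <= v)
    return count / n


# A monotone but nonlinear relationship: y = x^3.
x = list(range(10))
y = [t ** 3 for t in x]
print(empirical_copula(x, y, 0.5, 0.8))
```

Because the empirical copula depends only on ranks, a strictly increasing relationship such as y = x^3 yields min(u, v) at the grid points k/n regardless of its nonlinearity, which is what makes copula-based indices a natural fit for nonlinear dependence.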


Similar resources

A Robust-Equitable Measure for Feature Ranking and Selection

In many applications, not all the features used to represent data samples are important. Often only a few features are relevant for the prediction task. The choice of dependence measure often affects the final result of many feature selection methods. To select features that have complex nonlinear relationships with the response variable, the dependence measure should be equitable, a concept pr...


Bridging the semantic gap for software effort estimation by hierarchical feature selection techniques

Software project management is one of the significant activities in the software development process. Software Development Effort Estimation (SDEE) is a challenging task in software project management. SDEE has been practiced in the computer industry since the 1940s and has been reviewed several times. An SDEE model is appropriate if it provides accuracy and confidence simultaneously before softwa...


A Robust-Equitable Copula Dependence Measure for Feature Selection

Feature selection aims to select relevant features to improve the performance of predictors. Many feature selection methods depend on the choice of dependence measures. To select features that have complex nonlinear relationships with the response variable, the dependence measure should be equitable; i.e., it should treat linear and nonlinear relationships equally. In this paper, we introduce t...


A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is unwanted email that is harmful to communications around the world. Spam is a growing problem in personal email, so it is essential to detect it. Machine learning is very useful for this problem, as its adaptive nature lets it learn the patterns required for classification and it shows good results. Nonetheless, in spam detection, there are a large num...


Machine learning based Visual Evoked Potential (VEP) Signals Recognition

Introduction: Visual evoked potentials contain certain diagnostic information which has proved to be of importance in assessing the visual system's functional integrity. Because amplitude decreases substantially under extra-macular stimulation in commonly used pattern VEPs, differentiating normal and abnormal signals can prove to be quite an obstacle. Due to developments of use of machine l...




Publication date: 2017