Robust Covariate Shift Prediction with General Losses and Feature Views

Authors

  • Anqi Liu
  • Brian D. Ziebart
Abstract

Covariate shift relaxes the widely-employed independent and identically distributed (IID) assumption by allowing different training and testing input distributions. Unfortunately, common methods for addressing covariate shift by trying to remove the bias between training and testing distributions using importance weighting often provide poor performance guarantees in theory and unreliable predictions with high variance in practice. Recently developed methods that construct a predictor that is inherently robust to the difficulties of learning under covariate shift are restricted to minimizing logloss and can be too conservative when faced with high-dimensional learning tasks. We address these limitations in two ways: by robustly minimizing various loss functions, including non-convex ones, under the testing distribution; and by separately shaping the influence of covariate shift according to different feature-based views of the relationship between input variables and example labels. These generalizations make robust covariate shift prediction applicable to more task scenarios. We demonstrate the benefits on classification under covariate shift tasks.
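To make the importance-weighting baseline that the abstract critiques concrete, here is a minimal sketch on synthetic 1-D data. It is purely illustrative: the training and test densities are assumed known Gaussians (in practice the density ratio must be estimated, which is a major source of the high variance noted above), and the logistic model and step size are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical covariate-shift setup: training and test inputs come
# from different Gaussians, but the conditional P(y|x) is shared.
n = 500
x_train = rng.normal(0.0, 1.0, n)
true_coef = 1.5  # assumed true logistic coefficient
p = 1.0 / (1.0 + np.exp(-true_coef * x_train))
y_train = (rng.random(n) < p).astype(float)

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Importance weights w(x) = p_test(x) / p_train(x). Here both densities
# are known by construction; estimating this ratio is the hard part in
# real applications.
iw = gauss(x_train, 1.0, 1.0) / gauss(x_train, 0.0, 1.0)

def weighted_logloss_grad(theta):
    # Gradient of the importance-weighted logistic log-loss.
    z = 1.0 / (1.0 + np.exp(-theta * x_train))
    return np.mean(iw * (z - y_train) * x_train)

# Plain gradient descent on the reweighted training loss.
theta = 0.0
for _ in range(2000):
    theta -= 0.5 * weighted_logloss_grad(theta)
```

Under these idealized conditions the weighted fit should land near the assumed coefficient; when the weights themselves must be estimated from samples, a few large weights can dominate the sum, which is exactly the instability that motivates the robust approach of the paper.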


Similar Articles

Kernel Robust Bias-Aware Prediction under Covariate Shift

Under covariate shift, training (source) data and testing (target) data differ in input-space distribution but share the same conditional label distribution. This poses a challenging machine learning task. Robust Bias-Aware (RBA) prediction provides the conditional label distribution that is robust to the worst-case logarithmic loss for the target distribution while matching feature expectation...


Dimension Reduction for Robust Covariate Shift Correction

In the covariate shift learning scenario, the training and test covariate distributions differ, so a predictor's average losses over the training and test distributions also differ. The importance weighting approach handles this shift by minimizing an estimate of test loss over predictors, obtained via a weighted sum over training sample losses. However, as the dimension of the covariates in...


On semi-supervised linear regression in covariate shift problems

Semi-supervised learning approaches are trained using the full training (labeled) data and available testing (unlabeled) data. Demonstrations of the value of training with unlabeled data typically depend on a smoothness assumption relating the conditional expectation to high-density regions of the marginal distribution and an inherent missing-completely-at-random assumption for the labeling. So...


On Covariate Shift Adaptation via Sparse Filtering

A major challenge in machine learning is covariate shift, i.e., the problem of training data and test data coming from different distributions. This paper studies the feasibility of tackling this problem by means of sparse filtering. We show that the sparse filtering algorithm intrinsically addresses this problem, but it has limited capacity for covariate shift adaptation. To overcome this limi...


Robust Covariate Shift Regression

In many learning settings, the source data available to train a regression model differs from the target data it encounters when making predictions due to input distribution shift. Appropriately dealing with this situation remains an important challenge. Existing methods attempt to “reweight” the source data samples to better represent the target domain, but this introduces strong inductive bia...



Journal:
  • CoRR

Volume abs/1712.10043  Issue

Pages  -

Publication date 2017