Stochastic Learning on Imbalanced Data: Determinantal Point Processes for Mini-batch Diversification

نویسندگان

  • Cheng Zhang
  • Hedvig Kjellström
  • Stephan Mandt
چکیده

We study a mini-batch diversification scheme for stochastic gradient descent (SGD). While classical SGD relies on uniformly sampling data points to form a mini-batch, we propose a non-uniform sampling scheme based on the Determinantal Point Process (DPP). The DPP relies on a similarity measure between data points and gives low probabilities to mini-batches which contain redundant data, and higher probabilities to mini-batches with more diverse data. This simultaneously balances the data and leads to stochastic gradients with lower variance. We term this approach Balanced Mini-batch SGD (BM-SGD). We show that regular SGD and stratified sampling emerge as special cases. Furthermore, BM-SGD can be considered a generalization of stratified sampling to cases where no discrete features exist to bin the data into groups. We show experimentally that our method results more interpretable and diverse features in unsupervised setups, and in better classification accuracies in supervised setups.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Balanced Mini-batch Sampling for SGD Using Determinantal Point Processes

We study a mini-batch diversification scheme for stochastic gradient descent (SGD). While classical SGD relies on uniformly sampling data points to form a mini-batch, we propose a non-uniform sampling scheme based on the Determinantal Point Process (DPP). The DPP relies on a similarity measure between data points and gives low probabilities to mini-batches which contain redundant data, and high...

متن کامل

Kronecker Determinantal Point Processes

Determinantal Point Processes (DPPs) are probabilistic models over all subsets a ground set of N items. They have recently gained prominence in several applications that rely on “diverse” subsets. However, their applicability to large problems is still limited due to theO(N) complexity of core tasks such as sampling and learning. We enable efficient sampling and learning for DPPs by introducing...

متن کامل

Balanced Population Stochastic Variational Inference

Latent variable models are important tools to infer the underlying structure of a set of data. When we condition on observed data in Bayesian inference, we implicitly assume that the modeling assumptions are true and that the data can be considered a representative draw from the model. However, realistic data rarely agrees with these modeling assumptions. Especially when the observed data is hi...

متن کامل

Mini-batch Block-coordinate based Stochastic Average Adjusted Gradient Methods to Solve Big Data Problems

Big Data problems in Machine Learning have large number of data points or large number of features, or both, which make training of models difficult because of high computational complexities of single iteration of learning algorithms. To solve such learning problems, Stochastic Approximation offers an optimization approach to make complexity of each iteration independent of number of data poin...

متن کامل

Batched Gaussian Process Bandit Optimization via Determinantal Point Processes

Gaussian Process bandit optimization has emerged as a powerful tool for optimizing noisy black box functions. One example in machine learning is hyper-parameter optimization where each evaluation of the target function may require training a model which may involve days or even weeks of computation. Most methods for this so-called “Bayesian optimization” only allow sequential exploration of the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1705.00607  شماره 

صفحات  -

تاریخ انتشار 2017