A common goal in statistics and machine learning is to learn models that can perform well against distributional shifts, such as latent heterogeneous subpopulations, unknown covariate shifts, or unmodeled temporal effects. We develop and analyze a distributionally robust stochastic optimization (DRO) framework that learns a model providing good performance against perturbations to the data-generating distribution. gi...