Differentially Private Learning with Kernels
Authors
Abstract
In this paper, we consider the problem of differentially private learning where access to the training features is available only through a kernel function. As noted by Chaudhuri et al. (2011), the problem seems intractable for general kernel functions in the standard learning model, where a differentially private predictor is released. We study this problem in three simpler but practical settings. We first study an interactive model in which a user sends test points to a trusted learner (such as a search engine) and expects accurate but differentially private predictions. In the second model, the learner has access to a subset of the unlabeled test set, using which it releases a predictor that preserves the privacy of the training data; (NIH, 2003) is an example of such a publicly available test set. Our third model is similar to the traditional model, in that the learner is oblivious to the test set, but the kernels are restricted to functions over vector spaces. For each model, we derive differentially private learning algorithms with provable “utility” or error bounds. Moreover, we show that our methods can also be applied to the traditional model, where they exhibit better dependence on dimensionality than the methods of Rubinstein et al. (2009) and Chaudhuri et al. (2011). Finally, we provide experimental validation of our methods.
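For concreteness, the standard non-interactive baseline that this paper compares against (Rubinstein et al., 2009; Chaudhuri et al., 2011) approximates the kernel with a data-independent random feature map and then privatizes a regularized linear learner by output perturbation. The Python sketch below illustrates that generic recipe only; it is not the algorithm proposed in this paper, and the feature dimension D, bandwidth sigma, and regularization lam are hypothetical choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def private_kernel_classifier(X, y, eps, D=500, sigma=1.0, lam=1e-2, seed=0):
    """eps-differentially private RBF-kernel classifier for labels in {-1, +1},
    built from random Fourier features plus output perturbation (illustrative
    sketch of the Chaudhuri et al. (2011) recipe, not this paper's method)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Random Fourier features approximating k(x, y) = exp(-||x-y||^2 / (2 sigma^2)).
    # W and b are drawn independently of the data, so releasing them costs no privacy.
    W = rng.normal(scale=1.0 / sigma, size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    featurize = lambda Z: np.sqrt(1.0 / D) * np.cos(Z @ W + b)  # ||phi(z)||_2 <= 1

    # Regularized ERM: (1/n) sum_i log(1 + exp(-y_i w.phi(x_i))) + (lam/2)||w||^2.
    # sklearn minimizes C * sum(loss) + 0.5 * ||w||^2, so C = 1 / (n * lam).
    # (The sensitivity analysis assumes an exact minimizer; lbfgs is approximate.)
    clf = LogisticRegression(C=1.0 / (n * lam), fit_intercept=False)
    clf.fit(featurize(X), y)
    w = clf.coef_.ravel()

    # For a 1-Lipschitz loss and unit-norm features, the L2 sensitivity of the
    # minimizer is 2 / (n * lam). Adding noise eta with density ~ exp(-eps *
    # ||eta|| / sensitivity) -- uniform direction, Gamma(D, sensitivity/eps)
    # norm -- yields eps-DP. A real deployment would use a secret noise source,
    # not a fixed seed.
    sensitivity = 2.0 / (n * lam)
    direction = rng.normal(size=D)
    direction /= np.linalg.norm(direction)
    w_priv = w + direction * rng.gamma(shape=D, scale=sensitivity / eps)

    return lambda Z: np.sign(featurize(Z) @ w_priv)  # the released private predictor
```

Shrinking lam fits the data better but inflates the 2/(n·lam) sensitivity and hence the noise; the utility bounds discussed in the abstract quantify exactly this privacy-accuracy trade-off.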
Similar Papers
Differentially-Private Learning of Low Dimensional Manifolds
In this paper, we study the problem of differentially-private learning of low dimensional manifolds embedded in high dimensional spaces. The problems one faces in learning in high dimensional spaces are compounded in differentially-private learning. We achieve the dual goals of learning the manifold while maintaining the privacy of the dataset by constructing a differentially-private data struc...
Practical Differential Privacy in High Dimensions
Privacy-preserving, and more concretely differentially private machine learning, is concerned with hiding specific details in training datasets which contain sensitive information. Many proposed differentially private machine learning algorithms have promising theoretical properties, such as convergence to non-private performance in the limit of infinite data, computational efficiency, and poly...
Differentially Private Algorithms for Empirical Machine Learning
An important use of private data is to build machine learning classifiers. While there is a burgeoning literature on differentially private classification algorithms, we find that they are not practical in real applications for two reasons. First, existing differentially private classifiers provide poor accuracy on real-world datasets. Second, there is no known differentially private algorit...
A Stability-based Validation Procedure for Differentially Private Machine Learning
Differential privacy is a cryptographically motivated definition of privacy which has gained considerable attention in the algorithms, machine learning, and data mining communities. While there has been an explosion of work on differentially private machine learning algorithms, a major barrier to achieving end-to-end differential privacy in practical machine learning applications is the lack of a...
(Near) Dimension Independent Risk Bounds for Differentially Private Learning
In this paper, we study the problem of differentially private risk minimization, where the goal is to provide differentially private algorithms that have small excess risk. In particular, we address the following open problem: Is it possible to design computationally efficient differentially private risk minimizers with excess risk bounds that do not explicitly depend on dimensionality (p) and do...