Balanced knowledge distillation for long-tailed learning
Authors
Abstract
Deep models trained on long-tailed datasets exhibit unsatisfactory performance on tail classes. Existing methods usually modify the classification loss to increase the learning focus on tail classes, which unexpectedly sacrifices performance on head classes. In fact, this scheme leads to a contradiction between the two goals of long-tailed learning, i.e., learning generalizable representations and facilitating learning for tail classes. In this work, we explore knowledge distillation in long-tailed scenarios and propose a novel framework, named Balanced Knowledge Distillation (BKD), to disentangle these two goals and achieve both simultaneously. Specifically, given a teacher model, we train the student model by minimizing the combination of an instance-balanced classification loss and a class-balanced distillation loss. The former benefits from sample diversity and learns generalizable representations, while the latter considers the class priors and facilitates learning for tail classes. We conduct extensive experiments on several benchmark datasets and demonstrate that the proposed BKD is an effective knowledge distillation framework in long-tailed scenarios, as well as a competitive method for long-tailed learning. Our source code is available at: https://github.com/EricZsy/BalancedKnowledgeDistillation.
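The objective described above combines two terms. Below is a minimal PyTorch sketch of that combination; it is an illustration rather than the authors' released code (see the repository linked above), and the inverse-frequency weighting, temperature T, and the helper name bkd_loss are assumptions made here for concreteness.

```python
import torch
import torch.nn.functional as F

def bkd_loss(student_logits, teacher_logits, targets, class_counts, T=2.0):
    """Sketch: instance-balanced classification loss + class-balanced distillation loss."""
    # Instance-balanced term: plain cross-entropy over all samples, which keeps
    # the sample diversity useful for representation learning.
    ce = F.cross_entropy(student_logits, targets)

    # Class-balanced term: re-weight the teacher's per-class contributions by
    # inverse class frequency so tail classes get more focus (this particular
    # weighting scheme, normalized to mean 1, is an assumption).
    counts = class_counts.float()
    prior = counts / counts.sum()
    weight = 1.0 / prior
    weight = weight / weight.sum() * weight.numel()

    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    kl_per_class = p_teacher * (p_teacher.clamp_min(1e-8).log() - log_p_student)  # [B, C]
    kd = (kl_per_class * weight.unsqueeze(0)).sum(dim=1).mean() * (T * T)

    return ce + kd
```

In a training loop the teacher would run in eval mode under torch.no_grad(), so only the student's parameters receive gradients from this loss.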
Similar resources
Learning Loss for Knowledge Distillation with Conditional Adversarial Networks
There is increasing interest in accelerating neural networks for real-time applications. We study the student-teacher strategy, in which a small and fast student network is trained with the auxiliary information provided by a large and accurate teacher network. We use conditional adversarial networks to learn the loss function used to transfer knowledge from the teacher to the student. The proposed method...
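As a rough sketch of the idea in this entry (not the paper's actual method or architecture), the snippet below trains a small discriminator to tell teacher logits from student logits conditioned on the input, and updates the student to fool it; the network sizes, optimizers, and input dimensionality are placeholder assumptions.

```python
import torch
import torch.nn as nn

num_classes, feat_dim = 10, 784  # placeholder sizes
student = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, num_classes))
teacher = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, num_classes)).eval()

# Discriminator conditioned on the input features and a logit vector.
disc = nn.Sequential(nn.Linear(feat_dim + num_classes, 128), nn.ReLU(), nn.Linear(128, 1))

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(x):
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)

    # 1) Discriminator update: teacher logits labeled real, student logits fake.
    d_real = disc(torch.cat([x, t_logits], dim=1))
    d_fake = disc(torch.cat([x, s_logits.detach()], dim=1))
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Student update: produce logits the discriminator judges as teacher-like,
    #    i.e. the transfer loss is learned adversarially instead of hand-designed.
    g_score = disc(torch.cat([x, s_logits], dim=1))
    g_loss = bce(g_score, torch.ones_like(g_score))
    opt_s.zero_grad(); g_loss.backward(); opt_s.step()
    return d_loss.item(), g_loss.item()
```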
Learning Efficient Object Detection Models with Knowledge Distillation
Despite significant accuracy improvements in convolutional neural network (CNN) based object detectors, they often require prohibitive runtimes to process an image for real-time applications. State-of-the-art models often use very deep networks with a large number of floating-point operations. Efforts such as model compression learn compact models with fewer parameters, but with much ...
Unsupervised Learning for Information Distillation
Current document archives are enormously large and constantly growing, which makes it practically impossible to use them efficiently. To analyze and interpret the large volumes of speech and text in these archives in multiple languages, and to produce structured information of interest to their users, information distillation techniques are used. In order to access the key information in resp...
Sequence-Level Knowledge Distillation
Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However, to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neura...
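For context, the sequence-level variant named in this entry can be sketched as follows: the teacher decodes pseudo-target sequences (e.g. with beam search) and the student is trained on them with ordinary token-level cross-entropy. The generate() call and the seq2seq forward signature below are assumed interfaces, not a specific library's API.

```python
import torch
import torch.nn.functional as F

def sequence_level_distill_step(teacher, student, src_tokens, pad_id, optimizer, num_beams=5):
    # 1) Teacher produces whole pseudo-target sequences (the "sequence-level" part:
    #    complete outputs are distilled rather than per-token distributions).
    with torch.no_grad():
        pseudo_tgt = teacher.generate(src_tokens, num_beams=num_beams)  # assumed API

    # 2) Student is trained with standard cross-entropy against the pseudo-targets,
    #    shifted by one position for teacher forcing.
    logits = student(src_tokens, pseudo_tgt[:, :-1])  # assumed seq2seq forward signature
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        pseudo_tgt[:, 1:].reshape(-1),
        ignore_index=pad_id,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```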
Knowledge Distillation for Bilingual Dictionary Induction
Leveraging zero-shot learning to learn mapping functions between vector spaces of different languages is a promising approach to bilingual dictionary induction. However, methods using this approach have not yet achieved high accuracy on the task. In this paper, we propose a bridging approach, where our main contribution is a knowledge distillation training objective. As teachers, rich resource ...
Journal
Journal title: Neurocomputing
Year: 2023
ISSN: 0925-2312, 1872-8286
DOI: https://doi.org/10.1016/j.neucom.2023.01.063