Dense Feature Fusion for Online Mutual Knowledge Distillation
Authors
Abstract
Feature maps contain rich information about image intensity and spatial correlation. However, previous online knowledge distillation methods use only the class probabilities and ignore middle-level supervision, which makes training multiple models inefficient. Even where some methods have added middle-level supervision, they do so by defining a feature loss, and the effect is modest. We propose a new online knowledge distillation method that introduces middle-level supervision by fusing features between the teacher network and the student network. Specifically, we fuse the features of the two networks and establish an auxiliary branch to process the fused information, so that feature fusion effectively strengthens the information interaction between the networks. We also add a normalized ensemble output, and our method reaches state-of-the-art (SOTA) performance for online KD. Extensive experiments on the CIFAR-10, CIFAR-100, and ImageNet datasets show that the method is more effective than other methods in the performance of the sub-networks and the fusion classifier, as well as in generating meaningful feature maps.
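The abstract does not give formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that fusion is plain concatenation of two feature vectors, that the auxiliary branch is a single linear classifier, that the normalized ensemble output is the mean of L2-normalized logits, and that knowledge is transferred with a temperature-softened KL divergence. All function names and parameters here are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def fuse_features(feat_a, feat_b):
    """Assumed fusion rule: concatenate the two networks' feature vectors."""
    return list(feat_a) + list(feat_b)


def linear(features, weights, bias):
    """Assumed auxiliary branch: one linear layer mapping features to logits."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / (norm + eps) for x in vector]


def ensemble_logits(logits_list):
    """Assumed normalized ensemble: mean of L2-normalized logit vectors."""
    normalized = [l2_normalize(logits) for logits in logits_list]
    count = len(normalized)
    return [sum(column) / count for column in zip(*normalized)]


def distillation_loss(source_logits, target_logits, temperature=3.0):
    """KL between softened outputs, scaled by T^2 as in standard KD."""
    p = softmax(source_logits, temperature)
    q = softmax(target_logits, temperature)
    return temperature ** 2 * kl_divergence(p, q)


# Toy example: two sub-networks, 2-dimensional features, 3 classes.
features_a = [0.5, -1.0]
features_b = [2.0, 0.1]
logits_a = [2.0, 0.0, -1.0]
logits_b = [0.0, 1.0, 0.0]

fused = fuse_features(features_a, features_b)
fusion_weights = [[0.1, 0.2, 0.3, 0.4],
                  [0.0, -0.1, 0.2, 0.0],
                  [0.3, 0.0, -0.2, 0.1]]
fusion_bias = [0.0, 0.0, 0.0]
fused_logits = linear(fused, fusion_weights, fusion_bias)

# The fusion classifier supervises each sub-network, and the ensemble
# of the sub-networks supervises the fusion classifier in turn.
loss_to_a = distillation_loss(fused_logits, logits_a)
loss_to_b = distillation_loss(fused_logits, logits_b)
loss_to_fusion = distillation_loss(ensemble_logits([logits_a, logits_b]),
                                   fused_logits)
total_distillation_loss = loss_to_a + loss_to_b + loss_to_fusion
```

In a real training loop these terms would be added to each network's cross-entropy loss and computed on tensors from intermediate layers; the plain-Python lists here only keep the sketch dependency-free.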
Similar resources
Effective Online Knowledge Graph Fusion
Recently, Web search engines have empowered their search with knowledge graphs to satisfy increasing demands of complex information needs about entities. Each engine offers an online knowledge graph service to display highly relevant information about the query entity in form of a structured summary called knowledge card. The cards from different engines might be complementary. Therefore, it is...
Entanglement of Distillation and Conditional Mutual Information
In previous papers, we expressed the Entanglement of Formation in terms of Conditional Mutual Information (CMI). In this brief paper, we express the Entanglement of Distillation in terms of CMI.
Sequence-Level Knowledge Distillation
Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neura...
Knowledge Distillation for Bilingual Dictionary Induction
Leveraging zero-shot learning to learn mapping functions between vector spaces of different languages is a promising approach to bilingual dictionary induction. However, methods using this approach have not yet achieved high accuracy on the task. In this paper, we propose a bridging approach, where our main contribution is a knowledge distillation training objective. As teachers, rich resource ...
Feature Weight Driven Interactive Mutual Information Modeling for Heterogeneous Bio-Signal Fusion to Estimate Mental Workload
Many people suffer from high mental workload which may threaten human health and cause serious accidents. Mental workload estimation is especially important for particular people such as pilots, soldiers, crew and surgeons to guarantee the safety and security. Different physiological signals have been used to estimate mental workload based on the n-back task which is capable of inducing differe...
Journal
Journal title: Journal of Physics
Year: 2021
ISSN: 0022-3700, 1747-3721, 0368-3508, 1747-3713
DOI: https://doi.org/10.1088/1742-6596/1865/4/042084