Up to 100x Faster Data-Free Knowledge Distillation

Authors

Abstract

Data-free knowledge distillation (DFKD) has recently been attracting increasing attention from research communities, attributed to its capability to compress a model using only synthetic data. Despite the encouraging results achieved, state-of-the-art DFKD methods still suffer from the inefficiency of data synthesis, making the data-free training process extremely time-consuming and thus inapplicable for large-scale tasks. In this work, we introduce an efficacious scheme, termed FastDFKD, that allows us to accelerate DFKD by a factor of orders of magnitude. At the heart of our approach is a novel strategy to reuse the shared common features in training data so as to synthesize different data instances. Unlike prior methods that optimize a set of data independently, we propose to learn a meta-synthesizer that seeks common features as the initialization for fast data synthesis. As a result, FastDFKD achieves data synthesis within only a few steps, significantly enhancing the efficiency of data-free training. Experiments over CIFAR, NYUv2, and ImageNet demonstrate that the proposed FastDFKD achieves 10x and even 100x acceleration while preserving performances on par with the state of the art. Code is available at https://github.com/zju-vipa/Fast-Datafree.
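
To make the idea concrete, the following Python/PyTorch sketch shows one way a meta-learned initialization can speed up per-batch data synthesis. It is only a minimal illustration under stated assumptions, not the authors' implementation (see the linked repository for that): the `generator`, `teacher`, the simplified inversion loss, and the Reptile-style outer update are all placeholders introduced here for brevity.

import copy
import torch
import torch.nn.functional as F

def synthesize_batch(generator, teacher, batch_size=64, inner_steps=5,
                     inner_lr=0.1, latent_dim=128, device="cpu"):
    """Adapt a copy of the meta-initialized generator for a few inner steps,
    then return a synthetic batch plus the adapted weights.

    Assumptions: `generator` maps latent codes to inputs, `teacher` is a frozen,
    pre-trained classifier in eval mode that returns logits.
    """
    fast_gen = copy.deepcopy(generator)            # start from the shared initialization
    opt = torch.optim.SGD(fast_gen.parameters(), lr=inner_lr)
    z = torch.randn(batch_size, latent_dim, device=device)
    for _ in range(inner_steps):                   # only a handful of steps per batch
        x = fast_gen(z)
        logits = teacher(x)
        # Simplified inversion objective (placeholder for the paper's losses):
        # confident per-sample predictions plus class diversity across the batch.
        p_bar = logits.softmax(dim=1).mean(dim=0)
        loss = F.cross_entropy(logits, logits.argmax(dim=1)) \
               + (p_bar * p_bar.clamp_min(1e-6).log()).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return fast_gen(z).detach(), fast_gen

def meta_update(generator, fast_gen, meta_lr=1e-2):
    """Reptile-style outer update (an assumption here): pull the shared
    initialization toward the weights that worked for the latest synthesis."""
    with torch.no_grad():
        for p_meta, p_fast in zip(generator.parameters(), fast_gen.parameters()):
            p_meta.add_(meta_lr * (p_fast - p_meta))

The point of the sketch is the structure rather than the exact losses: each batch needs only a few inner gradient updates because the shared initialization already encodes common features, while the outer update slowly improves that initialization across batches.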

Similar papers

Data-Free Knowledge Distillation for Deep Neural Networks

Recent advances in model compression have provided procedures for compressing large neural networks to a fraction of their original size while retaining most if not all of their accuracy. However, all of these approaches rely on access to the original training set, which might not always be possible if the network to be compressed was trained on a very large dataset, or on a dataset whose relea...

Combining prior knowledge with data driven modeling of a batch distillation column including start-up

This paper presents the development of a simple model which describes the product quality and production over time of an experimental batch distillation column, including start-up. The model structure is based on a simple physical framework, which is augmented with fuzzy logic. This provides a way to use prior knowledge about the dynamics, which have a general validity, while additional informa...

Sequence-Level Knowledge Distillation

Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neura...

Topic Distillation with Knowledge Agents

This is the second year that our group participates in TREC’s Web track. Our experiments focused on the Topic distillation task. Our main goal was to experiment with the Knowledge Agent (KA) technology [1], previously developed at our Lab, for this particular task. The knowledge agent approach was designed to enhance Web search results by utilizing domain knowledge. We first describe the generi...

I-DIAG: From Community Discussion to Knowledge Distillation

I-DIAG is an attempt to understand how to take the collective discussions of a large group of people and distill the messages and documents into more succinct, durable knowledge. I-DIAG is a distributed environment that includes two separate applications, CyberForum and Consolidate. The goals of the project, the architecture of I-DIAG, and the two applications are described here.

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i6.20613