Structured Bayesian Compression for Deep Neural Networks Based on the Turbo-VBI Approach
Authors
Abstract
With the growth of neural network size, model compression has attracted increasing interest in recent research. As one of the most common techniques, pruning has been studied for a long time. By exploiting the structured sparsity of the network, existing methods can prune neurons instead of individual weights. However, in these methods, the surviving neurons are randomly connected without any structure, and the non-zero weights within each neuron are also randomly distributed. Such an irregular sparse structure can cause very high control overhead and irregular memory access on hardware, and can even increase computational complexity. In this paper, we propose a three-layer hierarchical prior to promote a more regular sparse structure during pruning. The proposed prior can achieve both per-neuron weight-level sparsity and neuron-level sparsity. We derive an efficient Turbo-variational Bayesian inferencing (Turbo-VBI) algorithm to solve the resulting model compression problem with the proposed prior. The Turbo-VBI algorithm has low complexity and can support more general priors than existing algorithms. Simulation results show that our algorithm promotes a more regular structure in the pruned networks while achieving better performance in terms of compression rate and accuracy compared with the baselines.
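To make the kinds of sparsity mentioned in the abstract concrete, the following is a minimal NumPy sketch (not the authors' Turbo-VBI algorithm; the matrix size, threshold, and per-neuron budget are illustrative assumptions) contrasting unstructured element-wise pruning with neuron-level pruning and a regular per-neuron weight-level pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))           # weights of one layer: 8 neurons x 16 inputs

# Unstructured pruning: zero individual weights below a magnitude threshold.
# Surviving weights end up scattered irregularly across the matrix.
tau = 0.5                              # hypothetical threshold
W_unstructured = np.where(np.abs(W) < tau, 0.0, W)

# Neuron-level sparsity: drop whole neurons (rows) whose weight vectors have
# small energy, so entire rows become zero.
row_energy = np.linalg.norm(W, axis=1)
keep_neuron = row_energy > np.median(row_energy)
W_neuron = W * keep_neuron[:, None]

# Per-neuron weight-level sparsity: within each surviving neuron, keep the same
# fixed number of largest-magnitude weights, giving a more regular pattern that
# is friendlier to hardware than randomly scattered surviving weights.
k = 4                                  # hypothetical per-neuron budget
W_regular = np.zeros_like(W)
for i in np.flatnonzero(keep_neuron):
    top = np.argsort(np.abs(W[i]))[-k:]
    W_regular[i, top] = W[i, top]

print("nonzeros:", np.count_nonzero(W_unstructured),
      np.count_nonzero(W_neuron), np.count_nonzero(W_regular))
```

The last variant mimics the goal described in the paper: every surviving neuron carries the same regular weight pattern, which lowers control overhead and memory-access irregularity compared with element-wise pruning.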
Similar resources
Integrating Deep Neural Networks into Structured Classification Approach based on Weighted Finite-State Transducers
Recently, deep neural networks (DNNs) have been drawing the attention of speech researchers because of their capability for handling nonlinearity in speech feature vectors. On the other hand, speech recognition based on structured classification is also considered important since it realizes the direct classification of automatic speech recognition. For example, a structured classification meth...
Bayesian Incremental Learning for Deep Neural Networks
In industrial machine learning pipelines, data often arrive in parts. Particularly in the case of deep neural networks, it may be too expensive to train the model from scratch each time, so one would rather use a previously learned model and the new data to improve performance. However, deep neural networks are prone to getting stuck in a suboptimal solution when trained on only new data as com...
Compression of Deep Neural Networks on the Fly
Thanks to their state-of-the-art performance, deep neural networks are increasingly used for object recognition. To achieve the best results, they use millions of parameters to be trained. However, when targeting embedded applications the size of these models becomes problematic. As a consequence, their usage on smartphones or other resource limited devices is prohibited. In this paper we intr...
Attention-Based Guided Structured Sparsity of Deep Neural Networks
Network pruning is aimed at imposing sparsity in a neural network architecture by increasing the portion of zero-valued weights for reducing its size regarding energy-efficiency considerations and increasing evaluation speed. In most of the conducted research efforts, the sparsity is enforced for network pruning without any attention to the internal network characteristics such as unbalanced outp...
Learning Structured Sparsity in Deep Neural Networks
High demand for computation resources severely hinders deployment of large-scale Deep Neural Networks (DNN) in resource constrained devices. In this work, we propose a Structured Sparsity Learning (SSL) method to regularize the structures (i.e., filters, channels, filter shapes, and layer depth) of DNNs. SSL can: (1) learn a compact structure from a bigger DNN to reduce computation cost; (2) ob...
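As a rough illustration of how such structure-level regularizers are typically built, here is a small NumPy sketch of a group-Lasso-style penalty over convolutional filters (the grouping, layer shape, and regularization weight are assumptions for illustration, not the exact SSL formulation).

```python
import numpy as np

def group_lasso_penalty(W, group_axes=(1, 2, 3)):
    """Sum of L2 norms of filter groups: encourages whole filters to shrink
    toward zero together, the core idea behind structured-sparsity
    regularization (the grouping used here is illustrative)."""
    return np.sum(np.sqrt(np.sum(W ** 2, axis=group_axes)))

# Hypothetical conv layer: 32 output filters, 16 input channels, 3x3 kernels.
rng = np.random.default_rng(1)
W_conv = rng.normal(scale=0.1, size=(32, 16, 3, 3))

lam = 1e-3                             # hypothetical regularization weight
loss_reg = lam * group_lasso_penalty(W_conv)
print(f"group-lasso term: {loss_reg:.4f}")
```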
Journal
Journal title: IEEE Transactions on Signal Processing
Year: 2023
ISSN: 1053-587X, 1941-0476
DOI: https://doi.org/10.1109/tsp.2023.3252165