Acceleration of Deep Convolutional Neural Networks Using Adaptive Filter Pruning

Pravendra Singh, Vinay Kumar Verma, Piyush Rai, Vinay Namboodiri

Research output: Contribution to journal › Article › peer-review

25 Citations (SciVal)
259 Downloads (Pure)

Abstract

While convolutional neural networks (CNNs) have achieved remarkable performance on various supervised and unsupervised learning tasks, they typically consist of a massive number of parameters. This results in significant memory requirements as well as a computational burden. Consequently, there is a growing need for filter-level pruning approaches for compressing CNN-based models that not only reduce the total number of parameters but also reduce the overall computation. We present a new min-max framework for the filter-level pruning of CNNs. Our framework jointly prunes and fine-tunes CNN model parameters, with an adaptive pruning rate, while maintaining the model's predictive performance. Our framework consists of two modules: (1) an adaptive filter pruning (AFP) module, which minimizes the number of filters in the model; and (2) a pruning rate controller (PRC) module, which maximizes the accuracy during pruning. In addition, we introduce orthogonality regularization in the training of CNNs to reduce redundancy across the filters of a particular layer. In the proposed approach, we prune the least important filters and, at the same time, reduce the redundancy level in the model by using orthogonality constraints during training. Moreover, unlike most previous approaches, our approach allows directly specifying the desired error tolerance instead of the pruning level. We perform extensive experiments for object classification (LeNet, VGG, MobileNet, and ResNet) and object detection (SSD and Faster-RCNN) over benchmark datasets such as MNIST, CIFAR, GTSDB, ImageNet, and MS-COCO. We also present several ablation studies to validate the proposed approach. Our compressed models can be deployed at run-time, without requiring any special libraries or hardware. Our approach reduces the number of parameters of VGG-16 by an impressive factor of 17.5X, and the number of FLOPs by 6.43X, with no loss of accuracy, significantly outperforming other state-of-the-art filter pruning methods.
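The abstract names three ingredients (pruning the least important filters, an orthogonality regularizer, and a pruning rate driven by an error tolerance) without giving their formulas. The sketch below illustrates one common way each could look, purely as an assumption: filter importance is taken to be the L1 norm, the regularizer is the squared Frobenius norm of W W^T - I over the flattened filters of a layer, and the rate update is a fixed-step rule. The function names, the `step` parameter, and these specific choices are hypothetical and are not taken from the paper.

```python
import numpy as np


def orthogonality_penalty(weights):
    """Squared Frobenius norm of (W W^T - I) for one layer.

    `weights` has shape (filters, in_channels, kh, kw). This soft
    orthogonality penalty is an assumed form, not the paper's definition.
    """
    w = weights.reshape(weights.shape[0], -1)  # one row per filter
    gram = w @ w.T
    return float(np.sum((gram - np.eye(w.shape[0])) ** 2))


def least_important_filters(weights, rate):
    """Indices of the fraction `rate` of filters with the smallest L1 norm.

    L1 norm as the importance score is an assumption for illustration.
    """
    norms = np.abs(weights.reshape(weights.shape[0], -1)).sum(axis=1)
    k = int(rate * weights.shape[0])
    return np.argsort(norms)[:k]


def next_pruning_rate(rate, error, baseline_error, tolerance, step=0.05):
    """Hypothetical controller: prune more while the error increase stays
    within `tolerance`, otherwise back off by `step`."""
    if error - baseline_error <= tolerance:
        return min(rate + step, 1.0)
    return max(rate - step, 0.0)
```

With this rule the user fixes `tolerance` rather than a target pruning level, which mirrors the abstract's point that the error tolerance is the quantity specified directly; how the actual PRC module adapts the rate may differ.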
Original language: English
Article number: 9086749
Pages (from-to): 838-847
Number of pages: 10
Journal: IEEE Journal on Selected Topics in Signal Processing
Volume: 14
Issue number: 4
Early online date: 6 May 2020
Publication status: Published - 31 May 2020

Bibliographical note

Publisher Copyright:
© 2007-2012 IEEE.

Keywords

  • Deep convolutional neural network acceleration
  • efficient computation
  • model compression
  • pruning

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
