BNN Pruning: Pruning Binary Neural Network Guided by Weight Flipping Frequency

Title	BNN Pruning: Pruning Binary Neural Network Guided by Weight Flipping Frequency
Publication Type	Conference Proceedings
Year of Publication	2020
Authors	Li, YI, Ren, F
Conference Name	International Symposium on Quality Electronic Design (ISQED)
Date Published	03/2020
Conference Location	Santa Clara, CA
Keywords (or New Research Field)	psclab
Abstract	A binary neural network (BNN) is a compact form of neural network. Both the weights and activations in BNNs can be binary values, which leads to a significant reduction in both parameter size and computational complexity compared to their full-precision counterparts. Such reductions can directly translate into reduced memory footprint and computation cost in hardware, making BNNs highly suitable for a wide range of hardware accelerators. However, it is unclear whether and how a BNN can be further pruned for ultimate compactness. As both 0s and 1s are non-trivial in BNNs, it is not proper to adopt any existing pruning method of full-precision networks that interprets 0s as trivial. In this paper, we present a pruning method tailored to BNNs and illustrate that BNNs can be further pruned by using weight flipping frequency as an indicator of sensitivity to accuracy. The experiments performed on the binary versions of a 9-layer Network-in-Network (NIN) and the AlexNet with the CIFAR-10 dataset show that the proposed BNN-pruning method can achieve 20-40% reduction in binary operations with 0.5-1.0% accuracy drop, which leads to a 15-40% run-time speedup on a TitanX GPU.
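
The abstract names weight flipping frequency as the pruning indicator but does not say how it is measured or applied. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each binary weight is the sign of a latent real-valued weight, that a "flip" is a sign change between consecutive training snapshots, and that channels whose weights flip most often are the least sensitive and are therefore pruned first. The function names (`flip_frequency`, `channels_to_prune`) and the per-channel averaging are hypothetical choices made for illustration.

```python
def sign(x):
    # Binarize a latent real-valued weight to +1 or -1 (0 maps to +1).
    return 1 if x >= 0 else -1


def flip_frequency(snapshots):
    """Fraction of consecutive snapshot pairs in which each weight's sign changed.

    snapshots: list of weight lists, one list per recorded training step.
    """
    steps = len(snapshots) - 1
    n = len(snapshots[0])
    flips = [0] * n
    for prev, curr in zip(snapshots, snapshots[1:]):
        for i in range(n):
            if sign(prev[i]) != sign(curr[i]):
                flips[i] += 1
    return [f / steps for f in flips]


def channels_to_prune(channel_snapshots, ratio):
    """Rank channels by mean flip frequency and return the top `ratio` share.

    Assumption (not stated in the abstract): frequently flipping weights
    matter least for accuracy, so those channels are pruned first.
    """
    scores = {}
    for name, snaps in channel_snapshots.items():
        freqs = flip_frequency(snaps)
        scores[name] = sum(freqs) / len(freqs)
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = int(len(ranked) * ratio)
    return ranked[:k], scores


# Toy usage: channel "a" has one weight that flips at every step,
# channel "b" never changes sign.
history = {
    "a": [[0.5, -0.2], [-0.1, -0.3], [0.2, -0.1]],
    "b": [[0.3, 0.4], [0.2, 0.1], [0.1, 0.5]],
}
pruned, scores = channels_to_prune(history, ratio=0.5)
```

In this toy example, channel "a" is selected for pruning because half of its weights flip at every step, while channel "b" is kept. A real experiment would record snapshots from actual BNN training and retrain the pruned network to measure the accuracy drop.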
