Standardizing Evaluation of Neural Network Pruning
Jose Javier Gonzalez, Davis Blalock, John V. Guttag
1
Overview
ShrinkBench: an open-source library to facilitate development and standardized evaluation of neural network pruning methods
- Rapid prototyping of NN pruning methods
- Makes it easy to use standardized datasets, pretrained models, and finetuning setups
- Controls for potential confounding factors
- Pretrained networks are often quite accurate, but large
- Pruning: systematically remove parameters from a network
2
Neural Network Pruning
- Goal: reduce the size of the network as much as possible with minimal drop in accuracy
- Often requires finetuning afterwards
3
Neural Network Pruning
[Figure: Accuracy of Pruned Networks. Accuracy vs. compression ratio, 1x to 16x]
4
Traditional Pipeline
Running pruning experiments requires a whole pipeline:
Data → Model → Pruning Algorithm → Finetuning → Evaluation
5
Traditional Pipeline
Data → Model → Pruning Algorithm → Finetuning → Evaluation
But only the pruning algorithm usually changes
6
Traditional Pipeline
Data → Model → Pruning Algorithm → Finetuning → Evaluation
Duplicate effort & confounding variables
7
ShrinkBench
Library to facilitate standardized evaluation of pruning methods
Data → Model → Pruning Algorithm → Finetuning → Evaluation
(the shrinkbench library supplies every stage except the pruning algorithm, plus shared utils)
8
ShrinkBench
- Provides standardized datasets, pretrained models, and evaluation metrics
- Simple and generic parameter masking API
- Measures nonzero parameters, activations, and FLOPs
- Controlled experiments show the need for standardized evaluation
9
Towards Standardization
But how do we standardize?
- Standardized datasets: widely adopted, representative of real-world tasks. Larger datasets (ImageNet) will be more insightful than smaller ones (CIFAR-10)
- Standardized architectures: with a reproducibility record; crucial to match the complexity of the network to the complexity of the dataset/task
- Pretrained models: even for a fixed architecture and dataset, the exact weights may affect results, so it is important to use the same ones
- Finetuning setup: we want improvement coming from pruning, not just from better hyperparameters
10
Masking API
We can capture an arbitrary removal pattern using binary masks.
[Figure: Model (+ Data) fed to a pruning method yields pruning masks; a weight matrix (e.g., -2.1, 4.6, 0.8, ...) is paired with a binary mask marking which parameters survive]
13
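The binary-mask idea above can be sketched in a few lines. The function names and the global magnitude-pruning criterion here are illustrative assumptions, not ShrinkBench's actual API; the weight values are taken from the figure.

```python
import numpy as np

def magnitude_mask(weights, compression):
    # Global magnitude pruning sketch: keep the 1/compression
    # largest-magnitude weights, mask out the rest.
    # (Hypothetical helper, not ShrinkBench's real API.)
    flat = np.abs(weights).ravel()
    k = max(1, int(flat.size // compression))   # parameters to keep
    threshold = np.sort(flat)[-k]               # k-th largest magnitude
    return (np.abs(weights) >= threshold).astype(weights.dtype)

weights = np.array([[-2.1, 4.6, 0.8],
                    [-0.1, 0.2, 1.5],
                    [-4.9, 5.0, 2.3]])
mask = magnitude_mask(weights, compression=3.0)  # keep 9/3 = 3 weights
pruned = weights * mask                          # elementwise masking
```

Finetuning then proceeds with the mask held fixed, so that zeroed parameters stay zero.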
Masks → Accuracy
Given a pruning method in terms of masks, ShrinkBench finetunes the model and systematically evaluates it
[Figure: pruning masks, finetuning, and the resulting accuracy-vs-compression curve]
14
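The masks-to-accuracy evaluation described above can be sketched as a loop over compression ratios. All four callables here are hypothetical stand-ins for the real pipeline stages, not ShrinkBench function names.

```python
import numpy as np

def accuracy_curve(weights, prune_fn, finetune_fn, evaluate_fn,
                   ratios=(1, 2, 4, 8, 16)):
    # For each compression ratio: derive masks, finetune the masked
    # model, and evaluate it. (Illustrative sketch only.)
    curve = {}
    for ratio in ratios:
        mask = prune_fn(weights, ratio)
        tuned = finetune_fn(weights * mask, mask)  # zeros stay zero
        curve[ratio] = evaluate_fn(tuned)
    return curve

def toy_prune(w, ratio):
    # Toy magnitude criterion: keep the 1/ratio largest weights
    k = max(1, w.size // int(ratio))
    thresh = np.sort(np.abs(w).ravel())[-k]
    return (np.abs(w) >= thresh).astype(w.dtype)

w = np.linspace(0.1, 1.6, 16)
curve = accuracy_curve(
    w, toy_prune,
    finetune_fn=lambda w, m: w,  # no-op stand-in for finetuning
    evaluate_fn=lambda w: float(np.count_nonzero(w)) / w.size,
)
```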
ShrinkBench Results I
- ShrinkBench returns both compression & speedup, since they interact differently with pruning
[Table: per-model compression and speedup results]
15
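Both metrics follow from the counts ShrinkBench measures: compression from nonzero parameters, speedup from FLOPs. The helper names below are assumptions for illustration, not the library's API.

```python
import numpy as np

def compression_ratio(weight_tensors):
    # Compression = total parameters / parameters that survived pruning
    total = sum(w.size for w in weight_tensors)
    nonzero = sum(int(np.count_nonzero(w)) for w in weight_tensors)
    return total / nonzero

def theoretical_speedup(dense_flops, pruned_flops):
    # FLOPs-based speedup; differs from compression because layers
    # contribute unequally to compute
    return dense_flops / pruned_flops

layers = [np.array([1.0, 0.0, -2.0, 0.0]),   # toy layer, half pruned
          np.array([0.0, 3.0, 0.0, -1.0])]
print(compression_ratio(layers))              # 8 params / 4 nonzero
```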
ShrinkBench Results II
- ShrinkBench evaluates at varying compression ratios and with several (dataset, architecture) combinations
16
ShrinkBench Results III
- ShrinkBench controls for confounding factors such as pretrained weights or finetuning hyperparameters
- ShrinkBench: an open-source library to facilitate development and standardized evaluation of neural network pruning methods
- Our controlled experiments across hundreds of models demonstrate the need for standardized evaluation.
18