Slimmable Neural Networks

What are Slimmable Neural Networks?


A Slimmable Neural Network is a single neural network executable at different widths (number of active channels). paste-b41b6392ef1254058e87d648f8e04f17b0f3f067.jpg

This permits instant and adaptive accuracy-efficiency trade-offs at runtime.

What is the training algorithm to train Slimmable Neural Networks?


paste-4619681eabe23c869f41b376f0ae8e235495e51a.jpg

Or in other words: apply each batch over the slimmed networks (which share weights for each layer but have an individual batchnorm layer) and accumulate the loss. Do a weight update at the end.

What was the main difficulty to make Slimmable Neural Networks work and how was it solved?


For each layer, different channels/switches result in different means and variances of the aggregated feature, which are then rolling averaged to a shared batch normalization layer. The inconsistency leads to inaccurate batch normalization statistics in a layer-by-layer propagating manner.

Note that these batch normalization statistics (moving averaged means and variances) are only used during testing, in training the means and variances of the current mini-batch are used.

The solution was the introduction of Switchable Batch Normalization (S-BN), that employs independent batch normalization for different switches in a slimmable network.

paste-2831e9519fc33eba4fb0bb50166a0c45efdd85ae.jpg

Machine Learning Research Flashcards is a collection of flashcards associated with scientific research papers in the field of machine learning. Best used with Anki or Obsidian. Edit MLRF on GitHub.