How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift)
WHY?
While the effect of batch normalization has been widely demonstrated empirically, its exact mechanism is not yet understood. The commonly cited explanation is internal covariate shift (ICS): the change in the distribution of layer inputs caused by updates to the preceding layers.
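To make the definition of ICS concrete, here is a minimal NumPy sketch. It only illustrates what "the input distribution of a layer changes when the preceding layer is updated" means, and it is not taken from the paper. The single linear layer, the weight update `w_after`, and the simplified `batch_norm` (no learned scale or shift, no running statistics) are all assumptions made for illustration.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Standardize each feature over the batch axis (simplified: no learned scale/shift)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))        # a batch of inputs to the preceding layer
w_before = rng.normal(size=(8, 4))   # preceding layer weights before an update
w_after = w_before * 3.0 + 0.5       # hypothetical update, exaggerated for illustration

# Inputs seen by the next layer, before and after the update
h_before = x @ w_before
h_after = x @ w_after

# Largest per-feature change in standard deviation, without and with normalization
shift_raw = np.abs(h_after.std(axis=0) - h_before.std(axis=0)).max()
shift_bn = np.abs(batch_norm(h_after).std(axis=0) - batch_norm(h_before).std(axis=0)).max()

print(f"std shift without BN: {shift_raw:.4f}")
print(f"std shift with BN:    {shift_bn:.6f}")
```

Without normalization the per-feature scale of the next layer's inputs moves with the weight update, while the normalized activations keep roughly zero mean and unit variance. This shows only the stabilizing effect on the first two moments; it says nothing about whether that effect is why batch normalization helps training.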
WHAT?
Critic
So?