Recent neural network models keep growing in size to push performance to its limit. This paper proposes MobileNet, an architecture that shrinks the network enough to deploy it on mobile devices.


MobileNet relies on several techniques.


The most important component of MobileNet is the depthwise separable convolution. Assume an input feature map of size $D_F \times D_F \times M$. A standard convolution layer consists of $N$ filters of size $D_K \times D_K \times M$. A depthwise separable convolution replaces these with $M$ depthwise filters of size $D_K \times D_K \times 1$ and $N$ pointwise filters of size $1 \times 1 \times M$.
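To make the saving concrete, the sketch below counts multiply-adds for both layer types. It assumes stride 1 and padding, so the output map is also $D_F \times D_F$, with $D_K \times D_K$ kernels, $M$ input channels and $N$ output channels. The function names and the example layer sizes are illustrative and do not come from the paper's code.

```python
def standard_conv_mult_adds(d_k, m, n, d_f):
    """Multiply-adds of a standard convolution: D_K * D_K * M * N * D_F * D_F."""
    return d_k * d_k * m * n * d_f * d_f


def separable_conv_mult_adds(d_k, m, n, d_f):
    """Multiply-adds of a depthwise convolution followed by a pointwise one."""
    depthwise = d_k * d_k * m * d_f * d_f  # one D_K x D_K filter per input channel
    pointwise = m * n * d_f * d_f          # N filters of size 1 x 1 x M
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernels, 512 -> 512 channels, 14x14 feature map.
std = standard_conv_mult_adds(3, 512, 512, 14)
sep = separable_conv_mult_adds(3, 512, 512, 14)
ratio = sep / std  # algebraically equal to 1/N + 1/D_K**2
print(f"standard: {std:,}  separable: {sep:,}  ratio: {ratio:.4f}")
```

With 3x3 kernels the ratio is close to 1/9, which matches the 8 to 9 times reduction in computation reported in the paper.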


The architecture built on this layer not only reduces the number of multiply-adds but also concentrates the computation in the pointwise convolution layers, which map directly onto general matrix multiply (GEMM), one of the most heavily optimized numerical operations.
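The reason a pointwise convolution maps so directly onto GEMM is that a 1x1 filter only mixes channels at a single pixel, so the whole layer is one matrix product over the flattened pixels. The NumPy sketch below checks this on random data; the function names, the channels-last layout and the tensor shapes are assumptions made for illustration.

```python
import numpy as np


def pointwise_conv_loops(x, w):
    """Reference 1x1 convolution. x: (H, W, M), w: (M, N) -> (H, W, N)."""
    h, wd, _ = x.shape
    n = w.shape[1]
    out = np.zeros((h, wd, n))
    for i in range(h):
        for j in range(wd):
            out[i, j] = x[i, j] @ w  # mix the M channels at one pixel
    return out


def pointwise_conv_gemm(x, w):
    """The same operation as a single matrix multiply: (H*W, M) @ (M, N)."""
    h, wd, m = x.shape
    return (x.reshape(h * wd, m) @ w).reshape(h, wd, -1)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 3))  # hypothetical 4x5 feature map, 3 channels
w = rng.standard_normal((3, 8))     # 8 pointwise filters of size 1x1x3
same = np.allclose(pointwise_conv_loops(x, w), pointwise_conv_gemm(x, w))
```

A larger kernel would first need its input patches reordered into a matrix (im2col) before GEMM can be applied; the 1x1 case skips that step because a reshape is enough.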

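MobileNet introduces two additional hyperparameters to reduce the computation further. The width multiplier $\alpha \in (0, 1]$ reduces the number of channels in each layer. The resolution multiplier $\rho \in (0, 1]$ reduces the height and width of each layer. With kernel size $D_K$, $M$ input channels, $N$ output channels and feature map size $D_F$, the cost of a depthwise separable layer is reduced from

$$D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F$$

to

$$D_K \cdot D_K \cdot \alpha M \cdot \rho D_F \cdot \rho D_F + \alpha M \cdot \alpha N \cdot \rho D_F \cdot \rho D_F$$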
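The effect of the two multipliers on a single layer can be sketched as follows. The function name, the truncation to integers and the example values are assumptions for illustration; here `alpha` scales the channel counts and `rho` scales the feature map side.

```python
def mobilenet_layer_mult_adds(d_k, m, n, d_f, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer with both multipliers."""
    m_thin = int(alpha * m)     # thinned input channels (truncation is an assumption)
    n_thin = int(alpha * n)     # thinned output channels
    d_f_small = int(rho * d_f)  # reduced feature map side
    depthwise = d_k * d_k * m_thin * d_f_small * d_f_small
    pointwise = m_thin * n_thin * d_f_small * d_f_small
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernels, 512 -> 512 channels, 14x14 feature map.
base = mobilenet_layer_mult_adds(3, 512, 512, 14)
reduced = mobilenet_layer_mult_adds(3, 512, 512, 14, alpha=0.5, rho=0.5)
print(f"base: {base:,}  reduced: {reduced:,}  ratio: {reduced / base:.4f}")
```

Because the pointwise term dominates, the cost scales roughly with $\alpha^2 \rho^2$, so halving both multipliers cuts the computation of this layer to about one sixteenth.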



MobileNet dramatically decreases the number of parameters and computations with only a slight decrease in performance on various tasks, including classification and detection.


It is amazing that convolution filters can be factorized into depthwise and pointwise convolutions while preserving much of their representational power. Could a similar method exist for the RNN family?

Howard, Andrew G., et al. “Mobilenets: Efficient convolutional neural networks for mobile vision applications.” arXiv preprint arXiv:1704.04861 (2017).