- MobileNet V1 uses depthwise separable convolutions: a depthwise convolution followed by a pointwise (1x1) convolution.
- MobileNet V2 adds residual connections and wraps the depthwise convolution in two new layers: an expansion layer first and a projection layer last. This block is called an inverted residual with linear bottleneck.
MobileNet V2 is based on two ideas:
- Low-dimensional tensors reduce the number of computations/multiplications.
- Low-dimensional tensors alone do not work well: they cannot extract much information. MobileNet V2 addresses this by taking a low-dimensional tensor as input, expanding it to a higher-dimensional tensor, running a depthwise convolution on it, and squeezing it back into a low-dimensional tensor.
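As a rough illustration of the two ideas above, the sketch below counts multiply-accumulate operations (MACs) for one expand / depthwise / project block and compares it with a standard 3x3 convolution at the narrow and at the expanded width. The function names and the sizes (56x56 feature map, 24 channels, expansion factor 6) are assumptions chosen for illustration, and the count ignores biases, batch normalization, and activations.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution (stride 1, output size h x w)."""
    return h * w * k * k * c_in * c_out


def depthwise_macs(h, w, c, k):
    """MACs for a k x k depthwise convolution: one filter per channel."""
    return h * w * k * k * c


def inverted_residual_macs(h, w, c_in, c_out, expansion, k=3):
    """MACs for expand (1x1) -> depthwise (k x k) -> project (1x1)."""
    hidden = c_in * expansion
    expand = conv_macs(h, w, c_in, hidden, 1)
    depthwise = depthwise_macs(h, w, hidden, k)
    project = conv_macs(h, w, hidden, c_out, 1)
    return expand + depthwise + project


# Illustrative sizes only (assumed, not taken from the notes).
h = w = 56
narrow, expansion = 24, 6
wide = narrow * expansion

block = inverted_residual_macs(h, w, narrow, narrow, expansion)
standard_narrow = conv_macs(h, w, narrow, narrow, 3)
standard_wide = conv_macs(h, w, wide, wide, 3)

print(f"inverted residual block:   {block:>12,} MACs")
print(f"standard 3x3 at {narrow} channels: {standard_narrow:>12,} MACs")
print(f"standard 3x3 at {wide} channels: {standard_wide:>12,} MACs")
```

With these sizes the block costs somewhat more than a plain 3x3 convolution on the narrow tensor, but far less than a plain 3x3 convolution on the expanded tensor, which is the width at which the depthwise filtering actually happens.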
The green layer is an expansion convolution (1x1). It increases the number of channels. The blue layer is a depthwise convolution. It keeps the number of channels the same and runs one filter per channel on the data. The orange layer is the projection layer (bottleneck layer). It reduces the number of channels again. Additionally, a residual skip connection keeps information flowing through the network.
- MobileNet V2 is more efficient than MobileNet V1.
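The expand / depthwise / project structure with a skip connection can be sketched as a forward pass in NumPy. This is a minimal sketch, not a reference implementation: it assumes stride 1, a channels-last `(H, W, C)` layout, random weights, and no batch normalization, and all function and variable names are hypothetical.

```python
import numpy as np


def relu6(x):
    """ReLU capped at 6, used after the expansion and depthwise layers."""
    return np.clip(x, 0.0, 6.0)


def pointwise_conv(x, weight):
    """1x1 convolution: x is (H, W, C_in), weight is (C_in, C_out)."""
    return x @ weight


def depthwise_conv3x3(x, weight):
    """3x3 depthwise convolution, stride 1, zero padding.

    x is (H, W, C); weight is (3, 3, C), one filter per channel.
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w, :] * weight[dy, dx, :]
    return out


def inverted_residual(x, w_expand, w_depthwise, w_project):
    hidden = relu6(pointwise_conv(x, w_expand))              # expand channels
    hidden = relu6(depthwise_conv3x3(hidden, w_depthwise))   # filter, same channels
    out = pointwise_conv(hidden, w_project)                  # project, no activation
    if out.shape == x.shape:
        out = out + x                                        # residual skip connection
    return out


rng = np.random.default_rng(0)
c_in, expansion = 8, 6  # illustrative sizes (assumed)
x = rng.standard_normal((16, 16, c_in))
w_expand = rng.standard_normal((c_in, c_in * expansion)) * 0.1
w_depthwise = rng.standard_normal((3, 3, c_in * expansion)) * 0.1
w_project = rng.standard_normal((c_in * expansion, c_in)) * 0.1

y = inverted_residual(x, w_expand, w_depthwise, w_project)
print(x.shape, "->", y.shape)
```

The projection layer has no activation, which is the "linear bottleneck" part of the name. The skip connection is only added when the input and output shapes match, so blocks that change the channel count skip it in this sketch.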
Some blocks keep the number of channels the same; others increase it, so the channel count grows through the network until the final fully connected classification layer.
Note: SqueezeNet still has lower memory usage.