Complicated and deep neural network models can achieve high accuracy for image recognition. However, they require a huge amount of computations and model parameters, which are not suitable for mobile and embedded devices. Therefore, MobileNet was proposed, which can reduce the number of parameters and computational cost dramatically. The main idea of MobileNet is to use a depthwise separable convolution. Two hyper-parameters, a width multiplier and a resolution multiplier are used to the trade-off between the accuracy and the latency. In this paper, we propose a new architecture to improve the MobileNet. Instead of using the resolution multiplier, we use a depth multiplier and combine with either Fractional Max Pooling or the max pooling. Experimental results on CIFAR database show that the proposed architecture can reduce the amount of computational cost and increase the accuracy simultaneously.