There are two types of deep neural networks here: the base network and the detection network. MobileNet, VGG-Net, LeNet, and the like are all base networks. The base network provides high-level features for classification or detection. If you put a fully connected layer at the end of one of these networks, you get a classifier. But you can remove the fully connected layer and replace it with a detection network such as SSD or Faster R-CNN.
In fact, SSD uses the last convolutional layers of the base network for the detection task.
MobileNet, just like other base networks, uses convolutions to produce high-level features.
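To make the "swap the head" idea concrete, here is a rough shape-only sketch in plain Python. It is not MobileNet's or SSD's actual architecture: the four stride-2 convolutions, the channel widths, the 300x300 input, the 21 classes, and the 6 anchors per cell are all assumptions picked for illustration, and the function names are hypothetical. It only tracks tensor shapes, and it attaches the detection head to a single feature map to keep things short.

```python
# Shape-only sketch: no real weights, just shapes as (channels, height, width).
# All layer sizes below are illustrative assumptions, not a real architecture.

def conv_shape(shape, out_channels, kernel, stride, padding):
    """Output shape of a 2D convolution applied to a (C, H, W) input."""
    _, h, w = shape
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return (out_channels, out_h, out_w)

def base_network(input_shape):
    """Hypothetical base network: four stride-2 3x3 convolutions."""
    shape = input_shape
    for out_channels in (32, 64, 128, 256):
        shape = conv_shape(shape, out_channels, kernel=3, stride=2, padding=1)
    return shape

def classification_head(feature_shape, num_classes):
    """Fully connected layer on the flattened features: one score vector per image."""
    c, h, w = feature_shape
    return {"fc_inputs": c * h * w, "outputs": num_classes}

def detection_head(feature_shape, num_classes, anchors_per_cell):
    """SSD-style head: a 3x3 convolution that predicts, for every anchor at
    every cell of the feature map, class scores plus 4 box offsets."""
    per_anchor = num_classes + 4
    out = conv_shape(feature_shape, anchors_per_cell * per_anchor,
                     kernel=3, stride=1, padding=1)
    _, h, w = out
    return {"map_shape": out, "num_boxes": h * w * anchors_per_cell}

# The same base-network features feed either head.
features = base_network((3, 300, 300))
classifier = classification_head(features, num_classes=21)
detector = detection_head(features, num_classes=21, anchors_per_cell=6)

print("base features:", features)
print("classifier:", classifier)
print("detector:", detector)
```

The point is that the base network is identical in both cases. The classifier collapses the feature map into one prediction per image, while the detection head keeps the spatial grid and predicts boxes at every location.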