I have built the model below to take two inputs: an image and some tabular data. The architecture is based on MobileNetV2 and implemented in Keras: