When I load a model and use it for inference, I feed in an array of image tiles, image_tile_tensor, with shape (total_tile, tile_height, tile_width, 3).
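To make the input layout concrete, here is a minimal sketch of how such a batch might be assembled with NumPy. The helper name stack_tiles, the 64x64 tile size, and the count of 8 tiles are assumptions made up for illustration; they are not taken from my actual pipeline or from any particular framework.

```python
import numpy as np


def stack_tiles(tiles):
    """Stack equally sized RGB tiles into a single 4-D batch.

    Hypothetical helper: returns an array of shape
    (total_tile, tile_height, tile_width, 3).
    """
    # np.stack adds a new leading axis; it raises ValueError
    # if the tiles do not all share the same shape.
    batch = np.stack(tiles, axis=0)
    if batch.ndim != 4 or batch.shape[-1] != 3:
        raise ValueError(f"expected (N, H, W, 3), got {batch.shape}")
    return batch


# Assumed example data: 8 blank tiles of 64x64 pixels, 3 channels each.
tiles = [np.zeros((64, 64, 3), dtype=np.float32) for _ in range(8)]
image_tile_tensor = stack_tiles(tiles)
print(image_tile_tensor.shape)  # (8, 64, 64, 3)
```

The leading axis indexes the tiles, so image_tile_tensor[i] is the i-th tile with shape (tile_height, tile_width, 3).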