Question
None of the zoo .tflite models I see are more than 3 MB in size, and they run fine on an Edge TPU. However, when I train my own object detection model, the .pb file is 60 MB and the .tflite is also huge at 20 MB! It is also quantized, as shown below. The end result is a segmentation fault when running the object detection model on an Edge TPU. What's causing this file to be so large? Could feeding non-resized images into training (some photos were 4096×2160 and not resized) have made the model this large?
From object_detection
Train the model
python train.py \
--logtostderr \
--train_dir=training \
--pipeline_config_path=training/ssd_mobilenet_v1_coco.config
Freeze the graph - creates 60MB .pb file
python export_tflite_ssd_graph.py \
--pipeline_config_path=training/ssd_mobilenet_v2_coco.config \
--trained_checkpoint_prefix=training/model.ckpt-2020 \
--output_directory=inference_graph \
--add_postprocessing_op=true
Convert to .tflite - creates 20MB .tflite file
tflite_convert \
--graph_def_file=inference_graph/tflite_graph.pb \
--output_file=inference_graph/detect.tflite \
--inference_type=QUANTIZED_UINT8 \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays=TFLite_Detection_PostProcess,TFLite_Detection_PostProcess:1,TFLite_Detection_PostProcess:2,TFLite_Detection_PostProcess:3 \
--mean_values=128 \
--std_dev_values=127 \
--allow_custom_ops \
--default_ranges_min=0 \
--default_ranges_max=6
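Before pushing the converted file to the device, it can be sanity-checked with the standard library alone: TFLite models are FlatBuffers files carrying the file identifier "TFL3" at byte offset 4. A minimal sketch (the path assumes the output of the tflite_convert command above):

```python
def looks_like_tflite(path):
    """Return True if the file carries the TFLite FlatBuffers
    file identifier "TFL3" at byte offset 4."""
    with open(path, "rb") as f:
        header = f.read(8)
    return len(header) == 8 and header[4:8] == b"TFL3"

# Example: looks_like_tflite("inference_graph/detect.tflite")
```

This only confirms the container format, not that the model is fully quantized or Edge-TPU-compatible, but it catches truncated or corrupt conversions early.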
At this stage the .tflite file is pushed to the google coral edgetpu and the model is trialed on a USB camera attached to the TPU.
export DISPLAY=:0 && edgetpu_detect \
--source /dev/video1:YUY2:1280x720:20/1 \
--model ${DEMO_FILES}/detect.tflite
The end result is a segmentation fault:
INFO: Initialized TensorFlow Lite runtime.
glvideomixer name=mixer background=black ! glimagesink sync=False name=glsink qos=False
v4l2src device=/dev/video1 ! video/x-raw,height=720,framerate=20/1,format=YUY2,width=1280 ! glupload ! tee name=t
t. ! glupload ! queue ! mixer.
overlaysrc name=overlay ! video/x-raw,height=720,width=1280,format=BGRA ! glupload ! queue max-size-buffers=1 ! mixer.
t. ! queue max-size-buffers=1 leaky=downstream ! glfilterbin filter=glcolorscale ! video/x-raw,height=168,width=300,format=RGBA ! videoconvert ! video/x-raw,height=168,width=300,format=RGB ! videobox autocrop=True ! video/x-raw,height=300,width=300 ! appsink max-buffers=1 sync=False emit-signals=True drop=True name=appsink
Segmentation fault
Answer 1:
The issue here may be that you used two different config files, one for each step:
python train.py \
...
--pipeline_config_path=training/ssd_mobilenet_v1_coco.config
python export_tflite_ssd_graph.py \
--pipeline_config_path=training/ssd_mobilenet_v2_coco.config \
...
Was this intentional? Also, it looks like you deployed the model immediately after conversion without compiling it for the Edge TPU. Please refer to this doc for more info on the edgetpu_compiler: https://coral.withgoogle.com/docs/edgetpu/compiler/
AFAIK, a 20MB model should run just fine as long as it meets all requirements listed on the page:
- Tensor parameters are quantized (8-bit fixed-point numbers).
- Tensor sizes are constant at compile-time (no dynamic sizes).
- Model parameters (such as bias tensors) are constant at compile-time.
- Tensors are either 1-, 2-, or 3-dimensional. If a tensor has more than 3 dimensions, then only the 3 innermost dimensions may have a size greater than 1.
- The model uses only the operations supported by the Edge TPU. The listed operations are here: https://coral.withgoogle.com/docs/edgetpu/models-intro/#supported-operations
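On the size question itself: a model file's size is driven by its parameter count times bytes per weight, not by the resolution of the training photos (those are resized to the fixed 300×300 input anyway). A rough back-of-envelope sketch, treating the whole 60 MB .pb as float32 weights (an overestimate, since the graph also holds non-weight data):

```python
# Model size ≈ number of parameters × bytes per parameter.
# float32 weights take 4 bytes each; quantized uint8 weights take 1 byte,
# so full quantization shrinks the weight payload by roughly 4x.
float_pb_mb = 60                           # .pb size from the question
approx_params = float_pb_mb * 1024**2 / 4  # upper bound, if all float32
quant_mb = approx_params / 1024**2         # same weights stored as uint8
print(f"~{approx_params/1e6:.1f}M params -> ~{quant_mb:.0f} MB quantized")
# prints: ~15.7M params -> ~15 MB quantized
```

So a 60 MB float graph quantizing down to roughly 20 MB is in the expected ballpark; the size reflects the network's weights, not the input images.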
Your whole pipeline should be:
1) Train the model
2) Convert to tflite
3) Compile for the Edge TPU (the step that actually maps the supported ops onto the TPU)
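Concretely, step 3 is a single compiler invocation (assuming the edgetpu_compiler package is installed per the doc linked above, and using the detect.tflite path from the question):

```shell
# Compile the quantized CPU model for the Edge TPU.
# The output lands next to the input as detect_edgetpu.tflite,
# along with a .log file summarizing which ops were mapped to the TPU.
edgetpu_compiler inference_graph/detect.tflite

# Then point the demo at the compiled model instead:
export DISPLAY=:0 && edgetpu_detect \
    --source /dev/video1:YUY2:1280x720:20/1 \
    --model inference_graph/detect_edgetpu.tflite
```

Any ops the compiler cannot map fall back to the CPU; the generated log is worth checking to confirm most of the graph actually runs on the TPU.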
Hope this helps.
Source: https://stackoverflow.com/questions/58561680/reducing-tflite-model-size