TF Object Detection API - extract feature vector for each detection bbox

-上瘾入骨i 2021-02-04 18:24

I'm using the TensorFlow Object Detection API and working with a pretrained ssd-mobilenet model. Is there a way to extract the last global pooling of the mobilenet for each bbox as a f

3 Answers
  •  粉色の甜心
    2021-02-04 18:47

    As Steve said, the feature vectors in Faster RCNN in the Object Detection API seem to be dropped after the SecondStageBoxPredictor. I was able to thread them through the network by modifying core/box_predictor.py and meta_architectures/faster_rcnn_meta_arch.py.

    The crux of it is that the non-max suppression code actually has a parameter for additional_fields (see core/post_processing.py:176 on master). You can pass a dict of tensors which have the same shape in the first two dimensions as the boxes and scores and the function will return them filtered the same way as the boxes and scores have been. Here's a diff against master of the changes I made:

    https://gist.github.com/donniet/c95d19e00ff9abeb786415b3a9348e62
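
    To make the additional_fields idea concrete, here is a minimal NumPy sketch of the principle only. It is not the API's implementation: the helper names `iou` and `nms_with_fields` are hypothetical, it handles a single image (so the fields share only the first dimension with the boxes, rather than the first two), and it assumes boxes are given as [ymin, xmin, ymax, xmax]. The point is that whatever indices survive suppression are applied to every extra tensor as well, so each kept box stays paired with its feature vector.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [ymin, xmin, ymax, xmax]."""
    ymin = np.maximum(box[0], boxes[:, 0])
    xmin = np.maximum(box[1], boxes[:, 1])
    ymax = np.minimum(box[2], boxes[:, 2])
    xmax = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ymax - ymin, 0, None) * np.clip(xmax - xmin, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms_with_fields(boxes, scores, additional_fields, iou_thresh=0.5):
    """Greedy NMS that filters every extra field with the same kept indices."""
    order = np.argsort(-scores)  # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        # drop every remaining box that overlaps the kept one too much
        order = rest[iou(boxes[best], boxes[rest]) <= iou_thresh]
    keep = np.array(keep)
    filtered = {name: value[keep] for name, value in additional_fields.items()}
    return boxes[keep], scores[keep], filtered

# Toy input: the second box overlaps the first heavily and is suppressed.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
features = np.arange(12).reshape(3, 4)  # one made-up feature row per box
kept_boxes, kept_scores, kept_fields = nms_with_fields(
    boxes, scores, {"features": features})
```

    In this toy run the rows of `kept_fields["features"]` line up one-to-one with `kept_boxes`, which is the property the modified graph relies on.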

    Then instead of loading a frozen graph I had to rebuild the network and load the variables from a checkpoint like this (note: I downloaded the checkpoint for faster rcnn from here: http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz)

    from object_detection.builders import model_builder
    from object_detection.protos import pipeline_pb2
    
    from google.protobuf import text_format
    import tensorflow as tf
    
    # load the pipeline structure from the config file
    with open('object_detection/samples/configs/faster_rcnn_resnet101_coco.config', 'r') as content_file:
        content = content_file.read()
    
    # build the model with model_builder
    pipeline_proto = pipeline_pb2.TrainEvalPipelineConfig()
    text_format.Merge(content, pipeline_proto)
    model = model_builder.build(pipeline_proto.model, is_training=False)
    
    # construct a network using the model
    image_placeholder = tf.placeholder(shape=(None,None,3), dtype=tf.uint8, name='input')
    original_image = tf.expand_dims(image_placeholder, 0)
    preprocessed_image, true_image_shapes = model.preprocess(tf.cast(original_image, tf.float32))
    prediction_dict = model.predict(preprocessed_image, true_image_shapes)
    detections = model.postprocess(prediction_dict, true_image_shapes)
    
    # create an input network to read a file
    filename_placeholder = tf.placeholder(name='file_name', dtype=tf.string)
    image_file = tf.read_file(filename_placeholder)
    image_data = tf.image.decode_image(image_file, channels=3)  # force 3 channels to match the (None, None, 3) placeholder
    
    # load the variables from a checkpoint
    init_saver = tf.train.Saver()
    sess = tf.Session()
    init_saver.restore(sess, 'object_detection/faster_rcnn_resnet101_coco_11_06_2017/model.ckpt')
    
    # get the image data
    blob = sess.run(image_data, feed_dict={filename_placeholder:'image.jpeg'})
    # process the inference
    output = sess.run(detections, feed_dict={image_placeholder:blob})
    
    # get the shape of the image_features
    print(output['image_features'].shape)
    

    Caveat: I didn't run the tensorflow unit tests against the changes I made, so consider them for demo purposes only, and more testing should be done to make sure they didn't break something else in the object detection api.
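
    Since the question asks for one pooled vector per box, a possible post-processing step on the session output is sketched below. It rests on assumptions: that `output['image_features']` (the key added by the diff above) has shape [batch, max_detections, height, width, channels], and that the standard `detection_boxes` and `num_detections` entries are present. The helper name `pooled_feature_vectors` is hypothetical, and the dict built at the end is fake data that only exercises the function.

```python
import numpy as np

def pooled_feature_vectors(output, feature_key="image_features"):
    """Return (boxes, vectors) for the valid detections of the first image."""
    num = int(output["num_detections"][0])      # number of valid detections
    boxes = output["detection_boxes"][0, :num]  # [num, 4]
    features = output[feature_key][0, :num]     # assumed [num, H, W, C]
    if features.ndim == 4:
        # global average pooling over the spatial axes gives [num, C]
        vectors = features.mean(axis=(1, 2))
    else:
        # already flat (or some other layout): one row per detection
        vectors = features.reshape(num, -1)
    return boxes, vectors

# Fake output dict standing in for the result of sess.run(detections, ...)
fake_features = np.ones((1, 5, 7, 7, 16), dtype=np.float32)
fake_features[0, 1] = 2.0
fake_output = {
    "num_detections": np.array([2.0]),
    "detection_boxes": np.zeros((1, 5, 4), dtype=np.float32),
    "image_features": fake_features,
}
boxes, vectors = pooled_feature_vectors(fake_output)
```

    If the real tensor turns out to have a different layout, check the printed shape first and adjust the pooling axes accordingly.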
