I'm trying to train a model to check images, identify specified objects, and tell me their coordinates (I don't even need to see a square drawn around the object).
For this I'm using TensorFlow's object detection API, and most of what I did was following this tutorial:
But some things have changed, probably because of updates, and I had to do some things on my own. I can actually train the model (I guess), but I don't understand the evaluation results. I'm used to seeing the loss and the current step, but this output is unusual to me. Also, I don't think the training is being saved.
Training command line:
model_main.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_coco.config
Config file:
model {
  faster_rcnn {
    num_classes: 9
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_v2'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}
train_config: {
  batch_size: 5
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  num_steps: 50000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "C:/tensorflow1/models/research/object_detection/images/train.record"
  }
  label_map_path: "C:/tensorflow1/models/research/object_detection/training/object-detection.pbtxt"
}
eval_config: {
  num_examples: 67
  max_evals: 10
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "C:/tensorflow1/models/research/object_detection/images/test.record"
  }
  label_map_path: "C:/tensorflow1/models/research/object_detection/training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}
Output:
2019-03-16 01:05:23.842424: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-03-16 01:05:23.842528: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-16 01:05:23.845561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-03-16 01:05:23.845777: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-03-16 01:05:23.847854: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6390 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
creating index...
index created!
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.05s).
Accumulating evaluation results...
DONE (t=0.04s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.681
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 1.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.670
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.542
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.825
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.682
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.689
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.689
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.556
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.825
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Also, the files inside faster_rcnn_inception_v2_coco_2018_01_28
have not changed since Jan 2018, which probably means that even if it's training, the progress isn't being saved.
My questions are:
- Am I doing something wrong with the config or something else?
- Is the training progress being saved?
- How can I understand this output? (IoU? maxDets? area? negative precision? is it for a single batch or what?)
- Should I wait for this to stop by itself eventually? I can't see which step I'm at, and just the piece of output I used as an example here took almost 15 minutes to appear.
Wow, a lot of questions to answer here.
1. I think your config file is correct. Usually, the fields that need to be carefully configured are:
- num_classes: the number of classes in your dataset.
- fine_tune_checkpoint: the checkpoint to start training from if you adopt transfer learning; this should be provided if from_detection_checkpoint is set to true.
- label_map_path: the path to your label file; the number of classes in it should be equal to num_classes (a minimal sketch follows this list).
- input_path in both train_input_reader and eval_input_reader.
- num_examples in eval_config: this is your validation dataset size, i.e. the number of examples in your validation dataset.
- num_steps: the total number of training steps to reach before the model stops training.
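A label map is just a small pbtxt file with one item per class. A minimal sketch could look like the following (the class names here are hypothetical placeholders; your object-detection.pbtxt should list your own 9 classes with ids 1 through 9 so that it matches num_classes: 9):

item {
  id: 1
  name: 'class_one'
}
item {
  id: 2
  name: 'class_two'
}
# ... and so on, up to id: 9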
2. Yes, your training progress is being saved. It is saved to train_dir (if you are using the older version of the API; it is model_dir if you are using the latest version); the official description is here. You can use tensorboard to visualize your training process.
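For example, assuming the checkpoints end up in the training/ directory from your command line, pointing TensorBoard at it should show the loss curves and the evaluation metrics:

tensorboard --logdir=training/

Also note that model_main.py expects --model_dir rather than the legacy --train_dir flag (worth double-checking against the version you have), so the invocation would look roughly like:

python model_main.py --model_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_coco.config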
3. The output is in COCO evaluation format, as this is the default evaluation metric option. You can try other evaluation metrics by setting metrics_set in eval_config in the config file; other options are available here (a sketch follows the list below). For the coco metrics specifically:
- IoU is Intersection over Union; it defines how much your detection bounding box overlaps with the groundtruth box. This answer provides more details on how precision is calculated at different IoUs.
- maxDets is the threshold on the maximum number of detections per image (see here for a better discussion).
- area: there are three categories of area depending on how many pixels the area takes; small, medium and large are all defined here.
- As for the negative precision for the 'large' category, I think this is the default value used when no detections are categorized as 'large' (but I cannot confirm this; you may refer to the official coco website http://cocodataset.org/#home).
- The evaluation is performed on the whole validation dataset, so all images in your validation set.
- This file provides more details on coco metrics.
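For example, switching to Pascal VOC metrics would look roughly like this; treat the exact metric name string as something to verify against the API's eval config documentation:

eval_config: {
  metrics_set: "pascal_voc_detection_metrics"
  num_examples: 67
}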
4. The training will stop once the total number of training steps reaches the num_steps value set in your config file. In your case, an evaluation session is performed roughly every 15 minutes; how often evaluation runs can also be configured in the config file (see the sketch below).
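If I remember correctly, with the older train.py/eval.py workflow this interval is controlled by eval_interval_secs in eval_config (with model_main.py, evaluation is instead tied to how often checkpoints are written), so something along these lines, to be verified against eval.proto:

eval_config: {
  num_examples: 67
  eval_interval_secs: 600  # run evaluation at most once every 10 minutes
}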
5. Although you followed the tutorial above, I suggest also following the official API documentation: https://github.com/tensorflow/models/tree/master/research/object_detection.
PS: Indeed, I can confirm that the negative precision score is due to the absence of the corresponding category. See the reference in the cocoapi.
Source: https://stackoverflow.com/questions/55193486/tensorflow-object-detection-next-steps