Best strategy to reduce false positives: Google's new Object Detection API on Satellite Imagery

情深已故 2021-01-31 09:44

I'm setting up the new TensorFlow Object Detection API to find small objects in large areas of satellite imagery. It works quite well - it finds all 10 objects I want, but I al

2 answers
  •  挽巷 (original poster)
    2021-01-31 10:19

    I think I went through the same or a very similar scenario, so it's worth sharing with you.

    I managed to solve it by passing images without annotations to the trainer.

    In my scenario I'm building a project to detect assembly failures in my client's products in real time. I achieved very robust results (for a production environment) by using detection plus classification for components that have an explicit negative pattern (e.g. a screw that is either in place or missing, leaving just the hole) and detection only for things that don't have a negative pattern (e.g. a tape that can be placed anywhere).

    In the system it's mandatory that the user records 2 videos, one containing the positive scenario and another containing the negative one (or n videos containing n patterns of positive and negative, so the algorithm can generalize).

    After testing for a while I found out that if I registered only tape for detection, the detector gave very confident (0.999) false positive detections of tape. It was learning the pattern of the spot where the tape was inserted instead of the tape itself. When I had another component (like a screw in its negative form) I was passing the negative pattern of tape without being explicitly aware of it, so the FPs didn't happen.

    So I found out that, in this scenario, I necessarily had to pass images without tape so the detector could differentiate between tape and no tape.

    I considered two alternatives to experiment with in order to solve this behavior:

    1. Train with a considerable number of images that don't have any annotation (10% of all my negative samples) along with all the images that have real annotations.
    2. For the images without annotations, create a dummy annotation with a dummy label so I can force the detector to train on that image (thus learning the no-tape pattern). Later on, when the dummy predictions come back, just ignore them.
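
    The data handling behind these two alternatives could be sketched roughly as follows. This is only an illustration, not the code I actually ran: the helper names `sample_negatives` and `drop_dummy_predictions`, the `DUMMY_LABEL` value and the dictionary format of a prediction are all assumptions made for the example.

```python
import random

DUMMY_LABEL = "dummy"  # hypothetical placeholder class name for alternative 2


def sample_negatives(negative_images, fraction=0.10, seed=0):
    """Alternative 1: choose a fraction of the unannotated images for training."""
    if not negative_images:
        return []
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    count = min(len(negative_images),
                max(1, round(len(negative_images) * fraction)))
    return sorted(rng.sample(negative_images, count))


def drop_dummy_predictions(predictions, dummy_label=DUMMY_LABEL):
    """Alternative 2: ignore detections of the placeholder class at inference."""
    return [p for p in predictions if p["label"] != dummy_label]
```

    With 20 negative images and the default fraction, `sample_negatives` returns 2 file names; `drop_dummy_predictions` keeps only the detections whose label is a real class.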

    I concluded that both alternatives worked perfectly in my scenario. The training loss got a little messy, but the predictions are robust for my very controlled scenario (the system's camera has its own box and illumination to reduce the number of variables).

    I had to make two little modifications for the first alternative to work:

    1. For every image that didn't have any annotation, I passed a dummy annotation (class=None, xmin/ymin/xmax/ymax=-1).
    2. When generating the tfrecord files, I use this information (xmin == -1, in this case) to add an empty list for the sample:
    import io
    import os

    import pandas as pd
    import tensorflow as tf
    from PIL import Image
    from object_detection.utils import dataset_util


    def create_tf_example(group, path, label_map):
        """Build a tf.train.Example for one image; negative images get empty box lists."""
        # group.filename is the image name, group.object a DataFrame with its annotation rows.
        # tf.io.gfile.GFile is available in tensorflow 1.15 and 2.x (tf.gfile is 1.x only).
        with tf.io.gfile.GFile(os.path.join(path, group.filename), 'rb') as fid:
            encoded_jpg = fid.read()
        image = Image.open(io.BytesIO(encoded_jpg))
        width, height = image.size

        filename = group.filename.encode('utf8')
        image_format = b'jpg'

        xmins = []
        xmaxs = []
        ymins = []
        ymaxs = []
        classes_text = []
        classes = []

        for _, row in group.object.iterrows():
            # Skip missing values and the -1 placeholder, so an image without
            # real annotations ends up with empty lists for boxes and classes.
            if pd.isnull(row['xmin']) or row['xmin'] == -1:
                continue
            # Box coordinates are stored normalized to the [0, 1] range.
            xmins.append(row['xmin'] / width)
            xmaxs.append(row['xmax'] / width)
            ymins.append(row['ymin'] / height)
            ymaxs.append(row['ymax'] / height)
            classes_text.append(row['class'].encode('utf8'))
            classes.append(label_map[row['class']])

        tf_example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
        }))
        return tf_example
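
    To see the placeholder rule in isolation, here is a small sketch without TensorFlow or pandas. It only mirrors the skip-and-normalize step of the loop above; `placeholder_row`, `normalized_boxes` and the plain-dict row format are hypothetical names chosen for this illustration, not part of the Object Detection API.

```python
NO_BOX = -1  # sentinel used for images that have no real annotation


def placeholder_row(filename):
    """Annotation row for a negative image: class=None and all coordinates -1."""
    return {"filename": filename, "class": None,
            "xmin": NO_BOX, "ymin": NO_BOX, "xmax": NO_BOX, "ymax": NO_BOX}


def normalized_boxes(rows, width, height):
    """Skip sentinel rows and scale the remaining boxes to the [0, 1] range."""
    boxes = []
    for row in rows:
        if row["xmin"] == NO_BOX:
            continue  # negative image: contributes no box at all
        boxes.append((row["xmin"] / width, row["ymin"] / height,
                      row["xmax"] / width, row["ymax"] / height))
    return boxes
```

    A negative image therefore yields an empty box list, which is what ends up in the tfrecord, while annotated rows are normalized as usual.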
    

    Part of the training progress: [image]

    Currently I'm using the TensorFlow Object Detection API with tensorflow==1.15 and faster_rcnn_resnet101_coco.config.

    I hope this solves someone's problem, as I didn't find any solution on the internet. I read a lot of people saying that faster_rcnn is not suited to negative training for FP reduction, but my tests proved the opposite.
