How can I download a specific part of the COCO dataset?

礼貌的吻别 2020-12-06 07:57

I am developing an object detection model to detect ships using YOLO. I want to use the COCO dataset. Is there a way to download only the images that have ships with the ann

2 answers
  • 2020-12-06 08:36

    As far as I know, the COCO dataset has no "ship" category. The closest category is "boat". You can check the available categories here: http://cocodataset.org/#overview

    BTW, there are ships inside the boat category too.

    If you want to just select images of a specific COCO category, you might want to do something like this (taken and edited from COCO's official demos):

    # Assumes `coco` is a pycocotools COCO object already built from an annotations JSON
    # Display the COCO category names
    cats = coco.loadCats(coco.getCatIds())
    nms = [cat['name'] for cat in cats]
    print('COCO categories: \n{}\n'.format(' '.join(nms)))
    
    # Get the ids of all images containing the given categories (here "bird"; use "boat" for ships)
    catIds = coco.getCatIds(catNms=['bird'])
    imgIds = coco.getImgIds(catIds=catIds)
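
Since the question also asks for the annotations, here is a minimal sketch of cutting an annotations file down to one category using only plain Python. It assumes the standard COCO instances layout (top-level "images", "annotations" and "categories" lists). The function name `filter_coco_by_category` and the toy data are hypothetical, made up for illustration, and are not part of the COCO API.

```python
def filter_coco_by_category(dataset, category_names):
    """Return a new COCO-style dict restricted to the given category names."""
    wanted = set(category_names)
    # Keep only the requested categories and collect their ids
    categories = [c for c in dataset["categories"] if c["name"] in wanted]
    cat_ids = {c["id"] for c in categories}
    # Keep only annotations that belong to those categories
    annotations = [a for a in dataset["annotations"] if a["category_id"] in cat_ids]
    # Keep only images that still have at least one annotation
    img_ids = {a["image_id"] for a in annotations}
    images = [im for im in dataset["images"] if im["id"] in img_ids]
    return {"images": images, "annotations": annotations, "categories": categories}


# Tiny hand-made dataset in the instances layout (not real COCO data)
toy = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
        {"id": 3, "file_name": "c.jpg"},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 9},
        {"id": 2, "image_id": 1, "category_id": 16},
        {"id": 3, "image_id": 2, "category_id": 16},
        {"id": 4, "image_id": 3, "category_id": 9},
    ],
    "categories": [{"id": 9, "name": "boat"}, {"id": 16, "name": "bird"}],
}

boats = filter_coco_by_category(toy, ["boat"])
print([im["file_name"] for im in boats["images"]])
```

For a real file, you would load the annotations JSON with `json.load`, pass the resulting dict to the function, and write the result back with `json.dump`. Note that annotations of other categories in the kept images are dropped, which is usually what you want when training a single-class detector.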
    
  • 2020-12-06 08:53

    To download images from a specific category, you can use the COCO API. Here's a demo notebook going through this and other usages. The overall process is as follows:

    • Install pycocotools
    • Download one of the annotation JSON files from the COCO dataset

    Here's an example of how to download the subset of images containing a person and save them to a local folder:

    from pycocotools.coco import COCO
    import requests
    
    # instantiate COCO specifying the annotations json path
    coco = COCO('...path_to_annotations/instances_train2014.json')
    # Specify a list of category names of interest
    catIds = coco.getCatIds(catNms=['person'])
    # Get the corresponding image ids and images using loadImgs
    imgIds = coco.getImgIds(catIds=catIds)
    images = coco.loadImgs(imgIds)
    

    This returns a list of dictionaries with basic information about each image, including its URL. We can now use requests to GET the images and write them into a local folder:

    # Save the images into a local folder
    for im in images:
        img_data = requests.get(im['coco_url']).content
        with open('...path_saved_ims/coco_person/' + im['file_name'], 'wb') as handler:
            handler.write(img_data)
    

    Note that this will save every image from the specified category, so you might want to slice the images list to the first n.
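
As a rough sketch of that slicing idea, the helper below saves at most n images and skips files that already exist, so an interrupted run can be resumed. The names `fetch_url` and `download_first_n` are hypothetical. The sketch uses the standard library's `urllib.request` in place of requests so that it has no third-party dependencies, and it assumes each entry has the `file_name` and `coco_url` keys used in the loop above.

```python
import os
import urllib.request


def fetch_url(url):
    """Return the raw bytes at `url` (standard-library stand-in for requests.get)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_first_n(images, out_dir, n, fetch=fetch_url):
    """Save up to `n` images from a loadImgs-style list and return their local paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for im in images[:n]:
        path = os.path.join(out_dir, im["file_name"])
        if not os.path.exists(path):  # skip files left over from an earlier run
            with open(path, "wb") as handler:
                handler.write(fetch(im["coco_url"]))
        saved.append(path)
    return saved
```

With the `images` list from above, a call such as `download_first_n(images, 'coco_person', 100)` would fetch only the first 100 entries. The folder name here is only an example.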
