I am currently facing the problem of dealing with a large dataset. I cannot download the dataset directly into Google Colab because of the limited space Google Colab provides (37 G
Another approach is to upload only the annotations file to Google Colab, so there is no need to download the image dataset. We will make use of the Python COCO API (pycocotools). Then, when preparing an image, instead of reading the image file from Drive or a local folder, you can read it from its URL, which each image entry in the annotations stores under the 'coco_url' key!
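import skimage.io as io  # assumed: io is skimage.io, as in the COCO API demo
# The usual method: read the image from a local folder or Drive
I = io.imread('%s/images/%s/%s' % (dataDir, dataType, img['file_name']))
# Instead, use this: load the image directly from its URL
I = io.imread(img['coco_url'])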
This method will save you plenty of space, download time and effort. However, you'll need a working internet connection during training to fetch the images (which of course you have, since you are using Colab).
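To make the idea concrete, here is a minimal sketch using only the standard library. It assumes the annotation file follows the usual COCO layout, with an "images" list whose entries carry "id" and "coco_url". The helper names `index_image_urls` and `fetch_image_bytes` are hypothetical, and the sample URL is a placeholder rather than a real COCO address.

```python
import json
import urllib.request


def index_image_urls(annotation_json):
    """Map image id -> coco_url from a COCO-style annotation dict."""
    return {img["id"]: img["coco_url"] for img in annotation_json["images"]}


def fetch_image_bytes(url, opener=urllib.request.urlopen, timeout=30):
    """Download the raw (still encoded) image bytes for one URL.

    `opener` is injectable so the function can be exercised offline.
    """
    with opener(url, timeout=timeout) as response:
        return response.read()


# Tiny stand-in for an uploaded annotation file (placeholder URL).
sample = json.loads("""
{"images": [{"id": 1, "file_name": "a.jpg",
             "coco_url": "http://example.invalid/a.jpg"}]}
""")

urls = index_image_urls(sample)
# With a real annotation file you would then call, for example:
#   raw = fetch_image_bytes(urls[some_image_id])
```

In a real notebook you would load the uploaded annotation file with `json.load` instead of the inline sample, or let pycocotools do the indexing for you. Note that this sketch fetches an image again each time it is requested, so every epoch pays the download cost.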
If you are interested in exploring the COCO dataset further, you can have a look at my post on Medium.