I already have a zip of a dataset (2K images) on Google Drive that I have to use in an ML training algorithm; my current code only extracts the content as a string.
First, create a new directory:
!mkdir file_destination
Now populate that directory with the unzipped files:
!unzip file_location -d file_destination
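For instance, assuming Drive is already mounted at /content/drive and the archive sits at a placeholder path, the two steps can go in a single cell:
import os

# Placeholder paths; adjust them to wherever your zip actually lives
zip_path = "/content/drive/My Drive/images.zip"
dest = "/content/file_destination"

os.makedirs(dest, exist_ok=True)      # same effect as !mkdir, but safe to re-run
!unzip -q "{zip_path}" -d "{dest}"    # -q keeps the output short for ~2K files
print(len(os.listdir(dest)), "entries extracted")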
After mounting the drive, use shutil.unpack_archive. It works with almost all archive formats (e.g., "zip", "tar", "gztar", "bztar", "xztar") and it's simple:
import shutil
shutil.unpack_archive("filename", "path_to_extract")
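A minimal sketch, assuming Drive is already mounted and the archive lives at a placeholder path:
import shutil

# unpack_archive infers the format ("zip" here) from the file extension
archive = "/content/drive/My Drive/images.zip"   # placeholder path
shutil.unpack_archive(archive, "/content/images")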
For Python, first connect to Drive:
from google.colab import drive
drive.mount('/content/drive')
Check the current directory:
!ls
!pwd
Then unzip:
!unzip drive/"My Drive"/images.zip
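The files land in the current working directory (normally /content); a quick sanity check, where the images folder name is only an assumption about what the zip contains:
import os

# "images" is assumed to be the top-level folder inside the archive
print(len(os.listdir("/content/images")), "files extracted")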
First, install unzip on Colab:
!apt install unzip
then use unzip to extract your files into a destination folder:
!unzip source.zip -d destination_folder
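unzip is often already available on Colab, so one option is to install it only when it is missing; this shell one-liner (a sketch, not required if the plain apt install above worked) does that:
!which unzip || apt-get install -y -qq unzip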
To unzip a file to a directory:
!unzip path_to_file.zip -d path_to_directory
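With a couple of thousand images the per-file listing gets noisy, so a quieter variant of the same command (the paths are still placeholders) is:
!unzip -q path_to_file.zip -d path_to_directory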
The Colab research team has a notebook to help you out.
Still, in short: if you are dealing with a zip file (in my case, mostly thousands of images) and you want to store the contents in a folder within Drive, do this:
!unzip -u "/content/drive/My Drive/folder/example.zip" -d "/content/drive/My Drive/folder/NewFolder"
The -u flag extracts only files that are new or newer than the copies already on disk, which matters if you suddenly lose the connection or the runtime shuts off partway through.
The -d flag creates the destination directory and stores the extracted files there.
Of course, before doing this you need to mount your Drive:
from google.colab import drive
drive.mount('/content/drive')
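Putting it together, a sketch of the whole cell (the folder and file names mirror the example above and are placeholders):
from google.colab import drive
drive.mount('/content/drive')

# -u: only extract new/updated files, so the cell is safe to re-run after a disconnect
# -d: create the target folder and extract into it
!unzip -u "/content/drive/My Drive/folder/example.zip" -d "/content/drive/My Drive/folder/NewFolder"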
I hope this helps! Cheers!!