Is there a way to access data from OneDrive using Google Colab?

Submitted by 柔情痞子 on 2020-02-28 06:58:22


I have started using Google Colab to train neural networks, but my data is quite large (4GB and 18GB). All of it is currently stored in OneDrive, and I don't have enough space on my Google Drive to transfer the files over.

Is there a way for me to directly access the data from OneDrive in Google Colab?

I have tried loading the data from my own machine, but this is too time-consuming and my machine doesn't have enough space to store these files. I have also tried adding download=1 after the ? in the file's hyperlink, but this does not download the file and only displays the hyperlink, while using wget produces an 'ERROR 403: Forbidden.' message.

I would like the Google Colab notebook to download this zipped file and unzip the data in order to perform training.


You can use OneDriveSDK, which is available for download from the PyPI index.

First, install it in Google Colab using:

!pip install onedrivesdk

The full process is too long to cover here. You first need to authenticate yourself, and then you can upload and download files easily.

You can authenticate using this code:

import onedrivesdk

redirect_uri = 'http://localhost:8080/'
client_secret = 'your_client_secret'
client_id = 'your_client_id'
# Base URL of the OneDrive API, as given in the onedrivesdk README
api_base_url = 'https://api.onedrive.com/v1.0/'
scopes = ['wl.signin', 'wl.offline_access', 'onedrive.readwrite']

http_provider = onedrivesdk.HttpProvider()
auth_provider = onedrivesdk.AuthProvider(
    http_provider=http_provider,
    client_id=client_id,
    scopes=scopes)
client = onedrivesdk.OneDriveClient(api_base_url, auth_provider, http_provider)
auth_url = client.auth_provider.get_auth_url(redirect_uri)

# Ask for the code
print('Paste this URL into your browser, approve the app\'s access.')
print('Copy everything in the address bar after "code=", and paste it below.')
print(auth_url)
code = input('Paste code here: ')

# Exchange the code for an access token
client.auth_provider.authenticate(code, redirect_uri, client_secret)

This prints a URL, which you need to open in your browser. After you approve access, copy the code that appears in the address bar after "code=" and paste it into the console to authenticate yourself.
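Copying the part after "code=" by hand is easy to get wrong, since the address bar may contain other parameters as well. As a minimal sketch, a small helper could pull the code out of the full redirected URL using only the standard library. The name extract_auth_code is hypothetical and not part of onedrivesdk, and the sketch assumes the code arrives as a "code" query parameter, as the printed instructions suggest.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirected_url):
    """Return the value of the 'code' query parameter from a redirected URL.

    Hypothetical helper, not part of onedrivesdk.
    """
    # parse_qs maps each query parameter name to a list of its values
    query = parse_qs(urlparse(redirected_url).query)
    codes = query.get('code')
    if not codes:
        raise ValueError('No "code" parameter found in the URL')
    return codes[0]
```

With this helper, you could paste the whole address-bar URL at the prompt and pass extract_auth_code(pasted_url) to authenticate in place of the raw code.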

You can download a file using:

root_folder = client.item(drive='me', id='root').children.get()
id_of_file = root_folder[0].id  # id of the first item in the root folder
client.item(drive='me', id=id_of_file).download('./path_to_file')
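Since the question also asks to unzip the downloaded archive before training, here is a minimal sketch of that last step using Python's standard zipfile module. The function name extract_archive and the paths in the usage note are hypothetical. It assumes the downloaded file is an ordinary .zip archive.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract every member of zip_path into target_dir and return the member names."""
    target = Path(target_dir)
    # Create the destination folder if it does not exist yet
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, extract_archive('./path_to_file', './data') would unpack the file downloaded above into a ./data folder in the Colab runtime, provided the runtime has enough disk space for both the archive and its contents.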

