How to import data into google colab from google drive?

前端 未结 2 1440
既然无缘
既然无缘 2021-01-02 02:52

I have some data files uploaded on my google drive. I want to import those files into google colab.

The REST API method and PyDrive method show how to create a new f

相关标签:
2条回答
  • 2021-01-02 02:59

    (Update April 15 2018: The gspread is frequently being updated, so to ensure stable workflow I specify the version)

    For spreadsheet file, the basic idea is using packages gspread and pandas to read spreadsheets in Drive and convert them to pandas dataframe format.

    In the Colab notebook:

    #install packages
    !pip install gspread==2.1.1
    !pip install gspread-dataframe==2.1.0
    !pip install pandas==0.22.0
    
    
    #import packages and authorize connection to Google account:
    import pandas as pd
    import gspread
    from gspread_dataframe import get_as_dataframe, set_with_dataframe
    from google.colab import auth
    auth.authenticate_user()  # verify your account to read files which you have access to. Make sure you have permission to read the file!
    from oauth2client.client import GoogleCredentials
    gc = gspread.authorize(GoogleCredentials.get_application_default()) 
    

    Then I know 3 ways to read Google spreadsheets.

    By file name:

    spreadsheet = gc.open("goal.csv") # Open file using its name. Use this if the file is already anywhere in your drive
    sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
    df2 = pd.DataFrame(sheet.get_all_records())
    df2.head()
    

    By url:

     spreadsheet = gc.open_by_url('https://docs.google.com/spreadsheets/d/1LCCzsUTqBEq5pemRNA9EGy62aaeIgye4XxwReYg1Pe4/edit#gid=509368585') # use this when you have the complete url (the edit#gid means permission)
        sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
        df2 = pd.DataFrame(sheet.get_all_records())
        df2.head()
    

    By file key/ID:

    spreadsheet = gc.open_by_key('1vpukIbGZfK1IhCLFalBI3JT3aobySanJysv0k5A4oMg') # use this when you have the key (the string in the url following spreadsheet/d/)
    sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
    df2 = pd.DataFrame(sheet.get_all_records())
    df2.head()
    

    I shared the code above in a Colab notebook: https://drive.google.com/file/d/1cvur-jpIpoEN3vAO8Fd_yVAT5Qgbr4GV/view?usp=sharing

    Source: https://github.com/burnash/gspread

    0 讨论(0)
  • 2021-01-02 03:20

    !) Set your data to be publicly available then for public spreadsheets:

    from StringIO import StringIO  # got moved to io in python3.
    
    import requests
    r = requests.get('https://docs.google.com/spreadsheet/ccc? 
    key=0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc&output=csv')
    data = r.content
    
    In [10]: df = pd.read_csv(StringIO(data), index_col=0,parse_dates= 
    ['Quradate'])
    
    In [11]: df.head()
    

    More here: Getting Google Spreadsheet CSV into A Pandas Dataframe

    If private data sort of the same but you will have to do some auth gymnastics...

    0 讨论(0)
提交回复
热议问题