Using Kaggle Datasets in Google Colab

南笙 2020-11-30 22:11

Is it possible to use any datasets available via the Kaggle API in Google Colab? I see the Kaggle API is used in this Colab notebook, but it's a bit unclear to

8 Answers
  • 2020-11-30 22:19

    I combined the top response into this GitHub gist as a Colab implementation. You can copy the code and use it directly.

    How to Import a Dataset from Kaggle in Colab

    Method:

    First a few things you have to do:

    1. Sign up for Kaggle
    2. Sign up for a competition you want to access data from (for example LANL-Earthquake-Prediction competition).
    3. Download your credentials to access the Kaggle API as kaggle.json

    # Install the official Kaggle API client
    !pip install -q kaggle
    
    # Colab's file access feature
    from google.colab import files
    
    # Upload `kaggle.json` file
    uploaded = files.upload()
    
    # Print the name and size of each uploaded file
    for fn in uploaded.keys():
      print('User uploaded file "{name}" with length {length} bytes'.format(
          name=fn, length=len(uploaded[fn])))
    
    # Then copy kaggle.json into the folder where the API expects to find it.
    !mkdir -p ~/.kaggle
    !cp kaggle.json ~/.kaggle/
    !chmod 600 ~/.kaggle/kaggle.json
    !ls ~/.kaggle
    

    Now check if it worked!

    # List competitions matching the search term
    !kaggle competitions list -s LANL-Earthquake-Prediction
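
The credential setup above (mkdir, cp, chmod) can also be done in plain Python, which may be handy if you want the steps in a reusable function. This is only a minimal sketch: the function name `install_kaggle_token` and its `config_dir` parameter are made up for illustration and are not part of the Kaggle API.

```python
import shutil
import stat
from pathlib import Path


def install_kaggle_token(token_path, config_dir=None):
    """Copy kaggle.json into the Kaggle config directory with owner-only access.

    `install_kaggle_token` is a hypothetical helper, not a Kaggle API function.
    """
    source = Path(token_path)
    if not source.is_file():
        raise FileNotFoundError(f"No token file at {source}")

    # Default to ~/.kaggle, the directory used by the shell commands above.
    target_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / "kaggle.json"
    shutil.copyfile(source, target)

    # Equivalent of `chmod 600`: read and write for the owner only.
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return target
```

After `files.upload()` has saved kaggle.json to the working directory, calling `install_kaggle_token("kaggle.json")` should leave the token where the `kaggle` command looks for it.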
    
  • 2020-11-30 22:20

    After completing the setup steps above, you can download the dataset for a particular competition in Colab with this command:

    !kaggle competitions download -c elo-merchant-category-recommendation

    (elo-merchant-category-recommendation is the name of the competition.)
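
If you would rather call the download from Python than from a `!` cell, something like the following sketch might work. It assumes the `kaggle` command is installed and authenticated; the helper names are hypothetical, and the `-p` option (download directory) is the only flag used beyond the command shown above.

```python
import subprocess


def competition_download_command(competition, dest_dir="."):
    """Build the argument list for `kaggle competitions download`.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if not competition:
        raise ValueError("competition name must not be empty")
    return [
        "kaggle", "competitions", "download",
        "-c", competition,      # competition name, as in the command above
        "-p", str(dest_dir),    # directory to download into
    ]


def download_competition(competition, dest_dir="."):
    """Run the download; raises CalledProcessError if the CLI exits non-zero."""
    subprocess.run(competition_download_command(competition, dest_dir), check=True)
```

For example, `download_competition("elo-merchant-category-recommendation", "data")` would run the same command as above, but place the files in a `data` folder.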

  • 2020-11-30 22:22

    You should be able to access any dataset on Kaggle via the API. In this example, only the datasets for competitions are being listed. You can see the datasets you can access with this command:

    kaggle datasets list
    

    You can also search for datasets by adding the -s flag followed by the search term you're interested in. So this would give you a list of datasets about dogs:

    kaggle datasets list -s dogs
    

    You can find more information on the API and how to use it in the documentation here.

    Hope that helps! :)
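
To work with the search results in Python rather than reading the printed table, one option is to ask the CLI for CSV output and parse it. The sketch below assumes the `--csv` option of `kaggle datasets list` and an authenticated `kaggle` command; the function names are made up for illustration, and some client versions may print extra lines that would need filtering.

```python
import csv
import io
import subprocess


def parse_csv_listing(csv_text):
    """Turn CSV text from the CLI into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def search_datasets(term):
    """Run `kaggle datasets list -s <term> --csv` and parse the result.

    Hypothetical helper; it needs the kaggle CLI and valid credentials.
    """
    result = subprocess.run(
        ["kaggle", "datasets", "list", "-s", term, "--csv"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_csv_listing(result.stdout)
```

Calling `search_datasets("dogs")` should then give one dict per dataset, with whatever columns the CLI reports.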

  • 2020-11-30 22:24

    Detailed approach:

    1. Go to My Account in your Kaggle profile.

    2. Scroll down until you find the option Create New API Token. This downloads a file called kaggle.json.

    3. Go to Colab and upload the file kaggle.json.

    4. Run pip install kaggle.

    5. Create a new folder named .kaggle in your home directory, copy kaggle.json into it, and set read-write permissions for your user only.

    6. Go to the Kaggle website. For the data you want to download, click the three dots on the right-hand side of the screen, then click Copy API command.

    7. Go to Colab and paste the API command.

    8. When you run !ls, you will see that the download is a zip file.

    9. Unzip the file.

    10. Now, when you run !ls, you'll find that the csv file has been extracted from the zip file.

    11. To read the file, import pandas and call pd.read_csv.

    12. As you can see, the file has been read into Colab successfully.

    This downloads the Kaggle dataset into Google Colab, where you can perform analysis, build machine learning models, or train neural networks.

    Happy Analysis!!!
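
The unzip and read steps above can be sketched in Python with the standard `zipfile` module. This is an illustration only: `extract_archive` is a hypothetical helper, and the archive name depends on whichever dataset you downloaded.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract a downloaded zip archive and return the paths of its CSV files.

    Hypothetical helper; the archive name is whatever the Kaggle download produced.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()

    # Keep only the CSV members, as paths inside the destination folder.
    return sorted(dest / name for name in names if name.lower().endswith(".csv"))
```

Each returned path can then be passed to `pd.read_csv`, for example `pd.read_csv(extract_archive("some-download.zip")[0])`, where `some-download.zip` stands for your own file.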

  • 2020-11-30 22:28

    Have a look at this.

    It uses the official Kaggle API behind the scenes, but automates the process so you don't have to re-download the data manually every time your VM is taken away. Another issue I faced when using the Kaggle API directly on Colab was the hassle of transferring the Kaggle API token via Google Drive. The method above automates that as well.

    Disclaimer: I am one of the creators of Clouderizer.

  • 2020-11-30 22:40

    To download competition data from Kaggle in Google Colab: I'm working in Google Colab and ran into the same problem, but I did two things.

    First, you have to register your mobile number along with your country code. Second, you have to click on last submission on the Kaggle dataset page. Then download the kaggle.json file from Kaggle and upload it to Google Colab. After that, run the code below in Google Colab.

    !pip install -q kaggle
    !mkdir -p ~/.kaggle
    !cp kaggle.json ~/.kaggle/ 
    !chmod 600 ~/.kaggle/kaggle.json 
    !kaggle competitions download -c web-traffic-time-series-forecasting
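
As an alternative to copying kaggle.json into ~/.kaggle, the Kaggle client can also pick up credentials from the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables. The sketch below reads the uploaded file and sets both; `load_kaggle_credentials` is a made-up name, and it assumes the file has the usual `username` and `key` fields.

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(token_path="kaggle.json"):
    """Read kaggle.json and export its values as environment variables.

    Hypothetical helper; assumes the token file has `username` and `key` fields.
    """
    data = json.loads(Path(token_path).read_text())

    missing = [field for field in ("username", "key") if not data.get(field)]
    if missing:
        raise ValueError(f"kaggle.json is missing: {', '.join(missing)}")

    # Commands started later from this process inherit these variables.
    os.environ["KAGGLE_USERNAME"] = data["username"]
    os.environ["KAGGLE_KEY"] = data["key"]
    return data["username"]
```

Run it once in a cell before any `!kaggle ...` command, so the variables are in place when the command starts.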
    