GCP: can't write to BigQuery using to_gbq

孤街浪徒 2021-01-26 14:46

I can't write to BigQuery; I get the following error.

Python 3.5.6 
pandas-gbq 0.13.1 
google-cloud-bigquery 1.24.0

ImportError: pandas-gbq

2 Answers
  • 2021-01-26 15:29

    While you did not post the packages you imported and installed in your environment, this error is generally related to missing required packages.
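
    A quick way to see which of the required modules are missing is to ask Python whether it can locate them. This is only a sketch: the helper name is_importable and the REQUIRED list are illustrative choices, not part of pandas or the BigQuery client.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate `module_name` in this environment.

    For dotted names, parent packages are imported as a side effect.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises when a parent package (e.g. `google`) is missing
        return False


# Hypothetical list: the modules the to_gbq call in this thread relies on
REQUIRED = ["pandas", "pandas_gbq", "google.cloud.bigquery"]

for name in REQUIRED:
    status = "ok" if is_importable(name) else "MISSING"
    print("{:<24} {}".format(name, status))
```

    Any line reported as MISSING points at a package that still has to be installed in the interpreter the notebook is using.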

    I ran your scenario using pandas and the to_gbq method, and it completed without any error. For my attempt I used a Jupyter Notebook in a Cloud AI instance running Python 3.7.

    First, I installed the following packages in my environment:

    !pip install --upgrade google-cloud-bigquery[pandas] --quiet
    !pip install --upgrade pandas_gbq
    

    The second package (pandas_gbq) must be installed separately because the google-cloud-bigquery[pandas] extra does not include it; you can check the documentation here.

    Next, within the Python script it's necessary to import pandas and bigquery. I also created a dummy DataFrame in order to reproduce the case, as follows:

    import pandas as pd
    from google.cloud import bigquery
    
    records = [
        {
            "Name": "Alex",
            "Age": 25,
            "City": "New York"
        },
        {
            "Name": "Bryan",
            "Age": 27,
            "City": "San Francisco"
        }
    ]
    
    dataframe = pd.DataFrame(
        records, columns=["Name", "Age", "City"])
    
    print(dataframe)
    

    And the output:

        Name  Age           City
    0   Alex   25       New York
    1  Bryan   27  San Francisco
    

    Finally, I used the to_gbq method:

    # Append the DataFrame to the table pandas_bq_test in the dataset sample
    dataframe.to_gbq('sample.pandas_bq_test', project_id="test-proj-261014", if_exists='append')
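
    For reference, DataFrame.to_gbq hands the work to the pandas-gbq package, which is why a missing pandas-gbq surfaces as an ImportError. A minimal sketch of a wrapper that makes this explicit is below; write_dataframe and VALID_IF_EXISTS are hypothetical names introduced only for illustration, and the function assumes credentials are already configured.

```python
VALID_IF_EXISTS = ("fail", "replace", "append")


def write_dataframe(df, destination_table, project_id, if_exists="fail"):
    """Write `df` to BigQuery via pandas-gbq, with clearer failure messages."""
    # Reject unsupported modes before doing any other work
    if if_exists not in VALID_IF_EXISTS:
        raise ValueError(
            "if_exists must be one of {}, got {!r}".format(VALID_IF_EXISTS, if_exists)
        )
    try:
        import pandas_gbq  # imported lazily so the error can be explained
    except ImportError as exc:
        raise ImportError(
            "pandas-gbq is not installed in this environment; "
            "try: pip install pandas-gbq"
        ) from exc
    pandas_gbq.to_gbq(
        df, destination_table, project_id=project_id, if_exists=if_exists
    )


# Usage, with the table and project names from this answer:
# write_dataframe(dataframe, "sample.pandas_bq_test", "test-proj-261014", if_exists="append")
```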
    

    It executed successfully and the DataFrame was written to BigQuery. I therefore encourage you to check that you have installed all the packages I used above. In addition, you can use pip show <name_of_the_package> to check whether a package is installed and which version it is.
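
    The same check can be done from inside the notebook. The sketch below assumes Python 3.8 or newer, where importlib.metadata is in the standard library (on an older interpreter such as 3.5, pip show remains the simpler option); installed_version is an illustrative helper name.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as they appear on PyPI
for dist in ("pandas", "pandas-gbq", "google-cloud-bigquery"):
    print("{}: {}".format(dist, installed_version(dist) or "not installed"))
```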

    Update

    To use BigQuery and pandas with DataLab, you can use a virtual environment (read more about it here). This helps ensure that all the necessary Python dependencies are installed and that no incompatibilities arise.

    I took the following steps to run the above code in DataLab.

    1. Create a DataLab instance and connect to it via http://localhost:8081/, following the documentation.
    2. Open a new notebook and select the Python 3 kernel.

    Run the commands below, where <your-env> is the name of your virtual environment.

    !pip install virtualenv
    !virtualenv <your-env>
    !source <your-env>/bin/activate
    !<your-env>/bin/pip install google-cloud-bigquery
    

    Now you will be able to use import pandas as pd and from google.cloud import bigquery. I have tested with the code I provided above and it worked. Let me know if you have any issues.
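
    If the imports still fail, it may be worth confirming which interpreter the notebook kernel is actually running. Each ! command runs in its own shell, so activating an environment on one line does not carry over to the kernel itself. A small sketch of that check (in_virtualenv is an illustrative name):

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, "real_prefix", None) or getattr(
        sys, "base_prefix", sys.prefix
    )
    return base_prefix != sys.prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

    If the interpreter path does not point inside <your-env>, packages installed with <your-env>/bin/pip will not be visible to the kernel.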

  • 2021-01-26 15:52

    I tried !pip install google-cloud-bigquery==1.10.1 and this solved the error; that version seems to be the right one for my Python version.
