Create/Insert Json in Postgres with requests and psycopg2

前端未结

关注

 1  1303

Just started a project with PostgreSQL. I would like to make the leap from Excel to a database and I am stuck on create and insert. Once I run this I will have to s

相关标签:

1条回答

故里飘歌

2021-02-04 23:22

It seems like you want to create a table with one column named "data". The type of this column is JSON. (I would recommend creating one column per field, but it's up to you.)

In this case the variable data (that is read from the request) is a list of dicts. As I mentioned in my comment, you can loop over data and do the inserts one at a time as executemany() is not faster than multiple calls to execute().

What I did was the following:

Create a list of fields that you care about.
Loop over the elements of data
For each item in data, extract the fields into my_data
Call execute() and pass in json.dumps(my_data) (Converts my_data from a dict into a JSON-string)

Try this:

#!/usr/bin/env python
import requests
import psycopg2
import json

conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')

req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018') 

# data here is a list of dicts
data = req.json()['data']

cur = conn.cursor()
# create a table with one column of type JSON
cur.execute("CREATE TABLE t_skaters (data json);")

fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]

for item in data:
    my_data = {field: item[field] for field in fields}
    cur.execute("INSERT INTO t_skaters VALUES (%s)", (json.dumps(my_data),))


# commit changes
conn.commit()
# Close the connection
conn.close()

I am not 100% sure if all of the postgres syntax is correct here (I don't have access to a PG database to test), but I believe that this logic should work for what you are trying to do.

Update For Separate Columns

You can modify your create statement to handle multiple columns, but it would require knowing the data type of each column. Here's some psuedocode you can follow:

# same boilerplate code from above
cur = conn.cursor()
# create a table with one column per field
cur.execute(
"""CREATE TABLE t_skaters (seasonId INTEGER, playerName VARCHAR, ...);"""
)

fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]

for item in data:
    my_data = [item[field] for field in fields]
    # need a placeholder (%s) for each variable 
    # refer to postgres docs on INSERT statement on how to specify order
    cur.execute("INSERT INTO t_skaters VALUES (%s, %s, ...)", tuple(my_data))


# commit changes
conn.commit()
# Close the connection
conn.close()

Replace the ... with the appropriate values for your data.

0 讨论(0)