Question
I want to generate a DataFrame with pandas read_sql from my SQLAlchemy query, unpacking a PostgreSQL jsonb attribute into columns.
Actually, this already gives me my answer:
query = session.query(
    cls.id,
    cls._my_jsonb_column
).all()
pd.DataFrame.from_dict([dict(id=id_, **i) for id_, i in query])
But I would prefer to unpack the jsonb in PostgreSQL instead of in the application.
My attempt gives
query = session.query(
    cls.id,
    func.jsonb_to_record(cls._my_jsonb_column)
)
pd.read_sql(query.statement, query.session.bind)
(psycopg2.NotSupportedError) function returning record called in context that cannot accept type record
Answer 1:
jsonb_to_record (and jsonb_to_recordset) returns a record, as if it were the result of a SELECT query. In a SQLAlchemy context, it provides a selectable which can be used like a table.
So you should consider the result of func.jsonb_to_record(cls._my_jsonb_column) as a sort of table you can join to your original table. That is, your query should look like:
jsonb_data = func.jsonb_to_record(cls._my_jsonb_column)
query = session.query(
    select(
        [cls.id, <other columns>]
    ).select_from(
        cls.join(jsonb_data, <on_clause>)
    )
)
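To make that sketch concrete, here is a hedged, self-contained version. The Thing model and the keys a/b are hypothetical stand-ins for your own schema; it relies on SQLAlchemy 1.4's table_valued()/render_derived() to supply the column definition list that jsonb_to_record requires, and only compiles the query so no database is needed:

```python
# Hedged sketch: "Thing" and the keys "a"/"b" are hypothetical stand-ins
# for your model and jsonb structure. jsonb_to_record needs an explicit
# column definition list (AS t(a integer, b text)); SQLAlchemy 1.4+
# expresses that with .table_valued(...).render_derived(with_types=True).
from sqlalchemy import Column, Integer, Text, column, func, select, true
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Thing(Base):  # hypothetical model
    __tablename__ = "thing"
    id = Column(Integer, primary_key=True)
    _my_jsonb_column = Column(JSONB)

# The record-returning function becomes a table-valued selectable.
rec = (
    func.jsonb_to_record(Thing._my_jsonb_column)
    .table_valued(column("a", Integer), column("b", Text))
    .render_derived(with_types=True)
)

# Join each row to its unpacked record (ON true: PostgreSQL lets a
# function call in FROM reference earlier FROM items, lateral-style).
stmt = select(Thing.id, rec.c.a, rec.c.b).join(rec, true())

# Compile only -- no database connection required to inspect the SQL.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

Against a real engine, pd.read_sql(stmt, engine) would then return the flattened DataFrame directly.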
You could even flatten your JSON data using JSON processing functions, but it is not possible to be more precise without knowing the structure of your JSON data.
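For instance, when only a handful of known keys matter, the jsonb ->> operator (spelled .astext in SQLAlchemy) already flattens them, sidestepping jsonb_to_record and its record-typing problem. A minimal hedged sketch (model and keys hypothetical), again compiled without a database:

```python
# Hedged sketch: "Thing" and the keys "a"/"b" are hypothetical. The
# ->> operator extracts a jsonb key as text, so each key becomes a
# plain labelled column in the SELECT list.
from sqlalchemy import Column, Integer, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Thing(Base):  # hypothetical model
    __tablename__ = "thing"
    id = Column(Integer, primary_key=True)
    _my_jsonb_column = Column(JSONB)

stmt = select(
    Thing.id,
    Thing._my_jsonb_column["a"].astext.label("a"),
    Thing._my_jsonb_column["b"].astext.label("b"),
)

# Compile only -- pd.read_sql(stmt, engine) on a real engine would
# yield a DataFrame with columns id, a, b.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```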
Alternatively, I recently released a package to easily flatten JSONB fields from a description of the JSON data; I would be happy to get some feedback: pg_jsonb_flattener
Source: https://stackoverflow.com/questions/48582897/sqlalchemy-pandas-read-sql-from-jsonb