问题
I have a TEXT column in my PostgreSQL (9.6) database containing a list of one or more dictionnaries, like those ones.
[{"line_total_excl_vat": "583.3300", "account": "", "subtitle": "", "product_id": 5532548, "price_per_unit": "583.3333", "line_total_incl_vat": "700.0000", "text": "PROD0008", "amount": "1.0000", "vat_rate": "20"}]
or
[{"line_total_excl_vat": "500.0000", "account": "", "subtitle": "", "product_id": "", "price_per_unit": "250.0000", "line_total_incl_vat": "600.0000", "text": "PROD003", "amount": "2.0000", "vat_rate": "20"}, {"line_total_excl_vat": "250.0000", "account": "", "subtitle": "", "product_id": 5532632, "price_per_unit": "250.0000", "line_total_incl_vat": "300.0000", "text": "PROD005", "amount": "1.0000", "vat_rate": "20"}]
I would like to retrieve each dictionnary from the column and parse them in different columns.
For this example:
id | customer | blurb
---+----------+------
1 | Joe | [{"line_total_excl_vat": "583.3300", "account": "", "subtitle": "", "product_id": 5532548, "price_per_unit": "583.3333", "line_total_incl_vat": "700.0000", "text": "PROD0008", "amount": "1.0000", "vat_rate": "20"}]
2 | Sally | [{"line_total_excl_vat": "500.0000", "account": "", "subtitle": "", "product_id": "", "price_per_unit": "250.0000", "line_total_incl_vat": "600.0000", "text": "PROD003", "amount": "2.0000", "vat_rate": "20"}, {"line_total_excl_vat": "250.0000", "account": "", "subtitle": "", "product_id": 5532632, "price_per_unit": "250.0000", "line_total_incl_vat": "300.0000", "text": "PROD005", "amount": "1.0000", "vat_rate": "20"}]
would become:
id | customer | line_total_excl_vat | account | product_id | ...
---+----------+----------------------+---------+------------
1 | Joe | 583.3300 | null| 5532548
2 | Sally | 500.0000 | null| null
3 | Sally | 250.0000 | null| 5532632
回答1:
if you know beforehand what fields you want to extract, cast the text to json / jsonb & use json_to_recordset
/ jsonb_to_recordset
. Note that this method requires the fields names / types to be explicitly be specified. Unspecified fields that are in the json dictionaries will not be extracted.
See official postgesql documentation on json-functions
self contained example:
WITH tbl (id, customer, dat) as ( values
(1, 'Joe',
'[{ "line_total_excl_vat": "583.3300"
, "account": ""
, "subtitle": ""
, "product_id": 5532548
, "price_per_unit": "583.3333"
, "line_total_incl_vat": "700.0000"
, "text": "PROD0008"
, "amount": "1.0000"
, "vat_rate": "20"}]')
,(2, 'Sally',
'[{ "line_total_excl_vat": "500.0000"
, "account": ""
, "subtitle": ""
, "product_id": ""
, "price_per_unit": "250.0000"
, "line_total_incl_vat": "600.0000"
, "text": "PROD003"
, "amount": "2.0000"
, "vat_rate": "20"}
, { "line_total_excl_vat": "250.0000"
, "account": ""
, "subtitle": ""
, "product_id": 5532632
, "price_per_unit": "250.0000"
, "line_total_incl_vat": "300.0000"
, "text": "PROD005"
, "amount": "1.0000"
, "vat_rate": "20"}]')
)
SELECT id, customer, x.*
FROM tbl
, json_to_recordset(dat::json) x
( line_total_excl_vat numeric
, acount text
, subtitle text
, product_id text
, price_per_unit numeric
, line_total_incl_vat numeric
, "text" text
, amount numeric
, vat_rate numeric
)
produces the following output:
id customer line_total_excl_vat acount subtitle product_id price_per_unit line_total_incl_vat text amount vat_rate
1 Joe 583.33 5532548 583.3333 700 PROD0008 1 20
2 Sally 500 250 600 PROD003 2 20
2 Sally 250 5532632 250 300 PROD005 1 20
This format is often referred to as the wide format.
It is also possible to extract the data in a long format, which has the additional benefit that it keeps all the data without explicitly mentioning the field names. In this case, the query may be written as (the test data is elided for brevity)
SELECT id, customer, y.key, y.value, x.record_number
FROM tbl
, lateral json_array_elements(dat::json) WITH ORDINALITY AS x (val, record_number)
, lateral json_each_text(x.val) y
The with ordinality
in the above statement adds a sequence number for each element in the unnested array, and is be used to disambiguate fields from different arrays for each customer.
This produced the output:
id customer key value record_number
1 Joe line_total_excl_vat 583.3300 1
1 Joe account 1
1 Joe subtitle 1
1 Joe product_id 5532548 1
1 Joe price_per_unit 583.3333 1
1 Joe line_total_incl_vat 700.0000 1
1 Joe text PROD0008 1
1 Joe amount 1.0000 1
1 Joe vat_rate 20 1
2 Sally line_total_excl_vat 500.0000 1
2 Sally account 1
2 Sally subtitle 1
2 Sally product_id 1
2 Sally price_per_unit 250.0000 1
2 Sally line_total_incl_vat 600.0000 1
2 Sally text PROD003 1
2 Sally amount 2.0000 1
2 Sally vat_rate 20 1
2 Sally line_total_excl_vat 250.0000 2
2 Sally account 2
2 Sally subtitle 2
2 Sally product_id 5532632 2
2 Sally price_per_unit 250.0000 2
2 Sally line_total_incl_vat 300.0000 2
2 Sally text PROD005 2
2 Sally amount 1.0000 2
2 Sally vat_rate 20 2
回答2:
Tidying up the json field would help a little bit. And that's something it could be done before inserting data into the table.
However, following your example, the code below should work:
create table public.yourtable (id integer, name varchar, others varchar);
insert into public.yourtable select 1,'Joe','[{"line_total_excl_vat": "583.3300", "account": "", "subtitle": "", "product_id": 5532548, "price_per_unit": "583.3333", "line_total_incl_vat": "700.0000", "text": "PROD0008", "amount": "1.0000", "vat_rate": "20"}]';
insert into public.yourtable select 2,'Sally','[{"line_total_excl_vat": "500.0000", "account": "", "subtitle": "", "product_id": "", "price_per_unit": "250.0000", "line_total_incl_vat": "600.0000", "text": "PROD003", "amount": "2.0000", "vat_rate": "20"}, {"line_total_excl_vat": "250.0000", "account": "", "subtitle": "", "product_id": 5532632, "price_per_unit": "250.0000", "line_total_incl_vat": "300.0000", "text": "PROD005", "amount": "1.0000", "vat_rate": "20"}]';
with jsonb_table as (
select id, name,
('{'||regexp_replace(
unnest(string_to_array(others, '}, {')),
'\[|\]|\{|\}','','g')::varchar||'}')::jsonb as jsonb_data
from yourtable
)
select id,name, * from jsonb_table,
jsonb_to_record(jsonb_data)
as (line_total_excl_vat numeric,account varchar, subtitle varchar, product_id varchar, price_per_unit numeric, line_total_incl_vat numeric);
First we create the jsonb_table where we transform your dictionary field into a postgres jsonb field by:
1) converting the string into an array by splitting in the '}, {' character sequence
2) unnesting the array elements to rows
3) cleanning up '[]{}' characters and converting the string to jsonb
And then we make use of the jsonb_to_record function to convert the jsonb records into columns. There we have to specify as many fields as needed for the column definitions.
来源:https://stackoverflow.com/questions/51045754/unnesting-a-list-of-json-objects-in-postgresql