Join tables using a value inside a JSONB column

Submitted by 心不动则不痛 on 2019-12-06 12:04:05

Question


There are two tables:

Authorized Contacts (auth_contacts):

CREATE TABLE auth_contacts (
  userid   varchar,
  contacts jsonb
);

contacts holds an object whose "contact" key contains an array of contacts, each with the attributes {contact_id, type}.

discussion:

CREATE TABLE discussion (
  contact_id         varchar,
  discussion_id      varchar,
  discussion_details jsonb
);

The table auth_contacts has at least 100k records, so making it a non-JSONB type is not appropriate, as it would double or triple the number of records.

Sample data for auth_contacts:

userid  | contacts
'11111' | '{"contact": [{"type": "type_a", "contact_id": "1-A-12"}
                      , {"type": "type_b", "contact_id": "1-A-13"}]}'

The discussion table has about 5 million records.

I want to join discussion.contact_id (a relational column) with the contact_id that sits in a JSON object inside the array of JSON objects in auth_contacts.contacts.

One very crude way is:

SELECT *
FROM discussion d 
JOIN (SELECT userid, JSONB_OBJECT_KEYS(a.contacts) AS auth_contact
      FROM auth_contacts a) AS contacts
      ON (d.contact_id = contacts.auth_contact::text)

What this does is create, at runtime (in the inner SQL), a userid vs. contact id table, which is what I was avoiding and why I went for the JSONB data type. For a user with many records this query takes 26+ seconds, which is not good at all. I tried a few other ways: PostgreSQL 9.4: Aggregate / Join table on JSON field id inside array

But there should be a cleaner and better way which would be as simple as JOIN d.contact_id = contacts -> contact -> contact_id? When I try this, it doesn't yield any results.

When searching the net this seems to be a pretty cumbersome task?


Answer 1:


Proof of concept

Your "crude way" doesn't actually work. Here is another crude way that does:

SELECT *
FROM  auth_contacts a
    , jsonb_to_recordset(a.contacts->'contact') AS c(contact_id text)
JOIN  discussion d USING (contact_id);

As has been commented, you can also formulate a join condition with the contains operator @>:

SELECT *
FROM   auth_contacts a
JOIN   discussion d ON a.contacts->'contact'
                    @> json_build_array(json_build_object('contact_id', d.contact_id))::jsonb

Build the right operand with JSON creation functions rather than string concatenation. It looks cumbersome, but will actually be very fast if supported with a functional jsonb_path_ops GIN index:

CREATE INDEX auth_contacts_contacts_gin_idx ON auth_contacts
USING  gin ((contacts->'contact') jsonb_path_ops);

Details:

  • Index for finding an element in a JSON array
  • Postgres 9.4 jsonb array as table
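
To see what the containment condition tests, the operator can be tried on literals copied from the sample row in the question. This is only an illustrative sketch, separate from the join itself: a contact array contains the one-element probe array if any of its objects has a matching contact_id.

```sql
-- Literal values taken from the sample data in the question.
SELECT '[{"type": "type_a", "contact_id": "1-A-12"},
         {"type": "type_b", "contact_id": "1-A-13"}]'::jsonb
    @> '[{"contact_id": "1-A-13"}]'::jsonb AS has_contact;  -- true

-- "1-A-99" is a made-up id that is not in the array.
SELECT '[{"type": "type_a", "contact_id": "1-A-12"}]'::jsonb
    @> '[{"contact_id": "1-A-99"}]'::jsonb AS has_contact;  -- false
```

The extra "type" key in the stored objects does not matter, because object containment only requires the probe's keys and values to be present.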

Proper solution

This is all fascinating to play with, but the problem here is the relational model. Your claim:

making it a non-JSONB type is not appropriate, as it would double or triple the number of records.

is the opposite of what's right. It's nonsense to wrap IDs you need for joining tables into a JSON document type. Normalize your table with a many-to-many relationship and implement all IDs you are working with inside the DB as separate columns with appropriate data type. Basics:

  • How to perform update operations on columns of type JSONB in Postgres 9.4
  • How to implement a many-to-many relationship in PostgreSQL?
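
A normalized layout might look roughly like the sketch below. The table name auth_contact and the index names are hypothetical, chosen only for illustration; the column names follow the question. It assumes each (userid, contact_id) pair should appear once and that the discussion table stays as it is.

```sql
-- Hypothetical junction table: one row per user/contact pair.
CREATE TABLE auth_contact (
  userid     varchar NOT NULL,
  contact_id varchar NOT NULL,
  type       text,
  PRIMARY KEY (userid, contact_id)
);

-- Plain btree indexes support the join from either side.
CREATE INDEX auth_contact_contact_id_idx ON auth_contact (contact_id);
CREATE INDEX discussion_contact_id_idx   ON discussion (contact_id);

-- One-time migration out of the JSONB column.
-- DISTINCT ON keeps a single row per pair if the array has duplicates.
INSERT INTO auth_contact (userid, contact_id, type)
SELECT DISTINCT ON (a.userid, c.contact_id)
       a.userid, c.contact_id, c.type
FROM   auth_contacts a
     , jsonb_to_recordset(a.contacts->'contact') AS c(contact_id text, type text)
WHERE  c.contact_id IS NOT NULL
ORDER  BY a.userid, c.contact_id;

-- The join is then an ordinary equality join on indexed columns.
SELECT ac.userid, d.*
FROM   auth_contact ac
JOIN   discussion   d USING (contact_id)
WHERE  ac.userid = '11111';
```

The row count of auth_contact does grow to one row per contact, but each row is small and the join no longer has to unpack a JSON document per user.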


Source: https://stackoverflow.com/questions/31309362/join-tables-using-a-value-inside-a-jsonb-column
