Adjacency List to JSON graph with Postgres

Question

I have the following schema for the tags table:

CREATE TABLE tags (
    id integer NOT NULL,
    name character varying(255) NOT NULL,
    parent_id integer
);

I need to build a query to return the following structure (here represented as yaml for readability):

- name: Ciencia
  parent_id: 
  id: 7
  children:
  - name: Química
    parent_id: 7
    id: 9
    children: []
  - name: Biología
    parent_id: 7
    id: 8
    children:
    - name: Botánica
      parent_id: 8
      id: 19
      children: []
    - name: Etología
      parent_id: 8
      id: 18
      children: []

After some trial and error and looking at similar questions on SO, I came up with this query:

    WITH RECURSIVE tagtree AS (
      SELECT tags.name, tags.parent_id, tags.id, json '[]' children
      FROM tags
      WHERE NOT EXISTS (SELECT 1 FROM tags tt WHERE tt.parent_id = tags.id)

      UNION ALL

      SELECT (tags).name, (tags).parent_id, (tags).id, array_to_json(array_agg(tagtree)) children FROM (
        SELECT tags, tagtree
        FROM tagtree
        JOIN tags ON tagtree.parent_id = tags.id
      ) v
      GROUP BY v.tags
    )

    SELECT array_to_json(array_agg(tagtree)) json
    FROM tagtree
    WHERE parent_id IS NULL

But it returns the following results when converted to yaml:

- name: Ciencia
  parent_id: 
  id: 7
  children:
  - name: Química
    parent_id: 7
    id: 9
    children: []
- name: Ciencia
  parent_id: 
  id: 7
  children:
  - name: Biología
    parent_id: 7
    id: 8
    children:
    - name: Botánica
      parent_id: 8
      id: 19
      children: []
    - name: Etología
      parent_id: 8
      id: 18
      children: []

The root node is duplicated. I could merge the results into the expected structure in my app code, but I feel I am close and that it could all be done in PG.
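The duplication seems to follow from how a recursive CTE is evaluated: each iteration sees only the rows produced by the previous iteration, so children found at different depths are aggregated into separate parent rows. The Python sketch below simulates that level-by-level evaluation on the sample data. The `TAGS` list and the `simulate_tagtree` helper are illustrative names, not part of the original query.

```python
from collections import defaultdict

# Sample rows (id, name, parent_id) taken from the expected output above.
TAGS = [
    (7, "Ciencia", None),
    (8, "Biología", 7),
    (9, "Química", 7),
    (18, "Etología", 8),
    (19, "Botánica", 8),
]


def simulate_tagtree(tags):
    """Mimic the level-by-level evaluation of the recursive CTE."""
    by_id = {tag_id: (tag_id, name, parent_id) for tag_id, name, parent_id in tags}
    parent_ids = {parent_id for _, _, parent_id in tags}

    # Non-recursive term: leaf tags, each with an empty children list.
    working = [
        {"name": name, "parent_id": parent_id, "id": tag_id, "children": []}
        for tag_id, name, parent_id in tags
        if tag_id not in parent_ids
    ]
    result = list(working)

    # Recursive term: join only the previous iteration's rows to their parents.
    while working:
        grouped = defaultdict(list)
        for row in working:
            if row["parent_id"] is not None:
                grouped[row["parent_id"]].append(row)
        working = []
        for parent_key, children in grouped.items():
            tag_id, name, parent_id = by_id[parent_key]
            working.append(
                {"name": name, "parent_id": parent_id, "id": tag_id, "children": children}
            )
        result.extend(working)

    # Final SELECT: keep only root rows.
    return [row for row in result if row["parent_id"] is None]


roots = simulate_tagtree(TAGS)
for root in roots:
    print(root["name"], [child["name"] for child in root["children"]])
```

On the sample data this produces two `Ciencia` rows: one holding `Química` (a leaf picked up in the first iteration) and one holding `Biología` (assembled in the second iteration), which matches the actual output shown above.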

Here's an example with SQL Fiddle: http://sqlfiddle.com/#!15/1846e/1/0

Expected output: https://gist.github.com/maca/e7002eb10f36fcdbc51b

Actual output: https://gist.github.com/maca/78e84fb7c05ff23f07f4

Answer 1:

Here's a solution using PL/v8 for your schema.

First, build a materialized path using a PL/pgSQL function and a recursive CTE.

CREATE OR REPLACE FUNCTION get_children(tag_id integer)
RETURNS json AS $$
DECLARE
  result json;
BEGIN
  SELECT array_to_json(array_agg(row_to_json(t))) INTO result
  FROM (
    WITH RECURSIVE tree AS (
      -- Roots start with an empty ancestor path.
      SELECT id, name, ARRAY[]::INTEGER[] AS ancestors
      FROM tags WHERE parent_id IS NULL

      UNION ALL

      -- Each child extends its parent's path with the parent's id.
      SELECT tags.id, tags.name, tree.ancestors || tags.parent_id
      FROM tags, tree
      WHERE tags.parent_id = tree.id
    )
    -- Keep rows whose last ancestor (the direct parent) is the requested tag.
    SELECT id, name, ARRAY[]::INTEGER[] AS children
    FROM tree
    WHERE $1 = tree.ancestors[array_upper(tree.ancestors, 1)]
  ) t;
  RETURN result;
END;
$$ LANGUAGE plpgsql;
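
To illustrate the materialized-path idea used by the function above, here is a small Python sketch that computes each tag's ancestor list and then selects the direct children of a tag, i.e. the rows whose last ancestor is that tag. The `TAGS` sample rows and the helpers `ancestors_by_id` and `get_children` (a Python stand-in, not the SQL function) are assumptions made for illustration, and the sketch assumes the data has no cycles and no dangling parent ids.

```python
# Sample rows (id, name, parent_id), assumed from the question's example data.
TAGS = [
    (7, "Ciencia", None),
    (8, "Biología", 7),
    (9, "Química", 7),
    (18, "Etología", 8),
    (19, "Botánica", 8),
]


def ancestors_by_id(tags):
    """Return {id: [root, ..., parent]}, mirroring the CTE's ancestors column."""
    parent_of = {tag_id: parent_id for tag_id, _, parent_id in tags}
    paths = {}
    for tag_id in parent_of:
        path = []
        current = parent_of[tag_id]
        # Walk up to the root, collecting ancestors nearest-first.
        while current is not None:
            path.append(current)
            current = parent_of[current]
        paths[tag_id] = path[::-1]  # root first, direct parent last
    return paths


def get_children(tags, tag_id):
    """Direct children of tag_id: rows whose last ancestor equals tag_id."""
    paths = ancestors_by_id(tags)
    return [
        {"id": child_id, "name": name, "children": []}
        for child_id, name, _ in tags
        if paths[child_id] and paths[child_id][-1] == tag_id
    ]


print(ancestors_by_id(TAGS)[19])  # ancestor path for Botánica
print(get_children(TAGS, 7))      # direct children of Ciencia
```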

Then, build the tree from the output of the above function.

CREATE OR REPLACE FUNCTION get_tree(data json) RETURNS json AS $$

var root = [];

for(var i in data) {
  build_tree(data[i]['id'], data[i]['name'], data[i]['children']);
}

function build_tree(id, name, children) {
  var exists = getObject(root, id);
  if(exists) {
       exists['children'] = children;
  }
  else {
    root.push({'id': id, 'name': name, 'children': children});
  }
}


function getObject(theObject, id) {
    var result = null;
    if(theObject instanceof Array) {
        for(var i = 0; i < theObject.length; i++) {
            result = getObject(theObject[i], id);
            if (result) {
                break;
            }   
        }
    }
    else
    {
        for(var prop in theObject) {
            if(prop == 'id') {
                if(theObject[prop] === id) {
                    return theObject;
                }
            }
            if(theObject[prop] instanceof Object || theObject[prop] instanceof Array) {
                result = getObject(theObject[prop], id);
                if (result) {
                    break;
                }
            } 
        }
    }
    return result;
}

    return JSON.stringify(root);
$$ LANGUAGE plv8 IMMUTABLE STRICT;

This will yield the required JSON mentioned in your question. Hope that helps.

I've written a detailed post/breakdown of how this solution works here.

Answer 2:

Try PL/Python and networkx.

Admittedly, using the following doesn't yield JSON in exactly the requested format, but the information seems to be all there and, if PL/Python is acceptable, this might be adapted into a complete answer.

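CREATE OR REPLACE FUNCTION get_adjacency_data(
    names text[],
    ids integer[],
    parent_ids integer[])
  RETURNS jsonb AS
$BODY$

    import json
    import networkx as nx
    from networkx.readwrite import json_graph

    # Map each node id to its name.
    name_dict = dict(zip(ids, names))
    # Edges point from parent to child; root rows (NULL parent) add no edge.
    pairs = [(p, c) for p, c in zip(parent_ids, ids) if p is not None]

    G = nx.DiGraph()
    G.add_nodes_from(ids)
    # Keyword arguments avoid the positional-order change between networkx 1.x and 2.x.
    nx.set_node_attributes(G, name='name', values=name_dict)
    G.add_edges_from(pairs)
    return json.dumps(json_graph.adjacency_data(G))

$BODY$ LANGUAGE plpythonu;

-- Pass the arrays in the same order as the function signature.
-- Root rows are kept so that the root node also gets its name.
WITH raw_data AS (
    SELECT array_agg(name) AS names,
        array_agg(id) AS ids,
        array_agg(parent_id) AS parent_ids
    FROM tags)
SELECT get_adjacency_data(names, ids, parent_ids)
FROM raw_data;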

Answer 3:

I was looking for the same solution, and this example may be useful to others.

Tested on Postgres 10 with a table of the same structure.

The table (company) has the columns id, name, and pid, where pid plays the role of parent_id.


create or replace function get_c_tree(p_parent int8) returns setof jsonb as $$

  select
    case 
      when count(x) > 0 then jsonb_build_object('id', c.id, 'name', c.name,  'children', jsonb_agg(f.x))
      else jsonb_build_object('id', c.id, 'name', c.name, 'children', null)
    end
  from company c left join get_c_tree(c.id) as f(x) on true
  where c.pid = p_parent or (p_parent is null and c.pid is null)
  group by c.id, c.name;

$$ language sql;


select jsonb_agg(get_c_tree) from get_c_tree(null::int8);
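
The recursion in `get_c_tree` can also be mirrored outside the database. The following Python sketch, assuming rows shaped as `(id, name, pid)` tuples and a hypothetical helper named `build_tree`, returns the children of a given parent with each child's subtree nested under `children`. Unlike the SQL version, it uses an empty list rather than null for leaves, which matches the structure requested in the question.

```python
import json
from collections import defaultdict

# Sample rows (id, name, pid), assumed from the question's example data.
ROWS = [
    (7, "Ciencia", None),
    (8, "Biología", 7),
    (9, "Química", 7),
    (18, "Etología", 8),
    (19, "Botánica", 8),
]


def build_tree(rows, parent=None):
    """Return the nested subtrees of all rows whose pid equals parent."""
    # Index rows by parent id once so each lookup is a dictionary access.
    children_of = defaultdict(list)
    for row_id, name, pid in rows:
        children_of[pid].append((row_id, name))

    def subtree(parent_id):
        # Recurse into each child, as get_c_tree(c.id) does in the SQL version.
        return [
            {"id": row_id, "name": name, "children": subtree(row_id)}
            for row_id, name in children_of.get(parent_id, [])
        ]

    return subtree(parent)


tree = build_tree(ROWS)
print(json.dumps(tree, ensure_ascii=False, indent=2))
```

Calling `build_tree(ROWS, parent=8)` would return only the subtrees below `Biología`, in the same way that `get_c_tree(8)` is scoped to one parent.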

Source: https://stackoverflow.com/questions/27438704/adjacency-list-to-json-graph-with-postgres

Tags

json

postgresql

adjacency-list