Multiple relations parent/child with multiple levels

Question

I have a MySQL table named companies like this:

+---------+-----------+-----------+
| id_comp | comp_name | id_parent |
+---------+-----------+-----------+
|       1 | comp1     |      NULL |
|       2 | comp2     |         1 |
|       3 | comp3     |         2 |
|       4 | comp4     |         2 |
|       5 | comp5     |         2 |
|       6 | comp6     |         1 |
|       3 | comp3     |         6 |
|       5 | comp5     |         6 |
|       7 | comp7     |         6 |
|       4 | comp4     |         6 |
|       8 | comp8     |         4 |
+---------+-----------+-----------+

Each company may have multiple parents (e.g. comp3, which is a child of both comp2 and comp6), each parent may have multiple children, and each child can itself be a parent of multiple children, and so on. So the hierarchy can have an unlimited number of levels (relations).

I researched several solutions (http://www.codeproject.com/Articles/818694/SQL-queries-to-manage-hierarchical-or-parent-child, http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/), but I don't think they fit my problem, since the same company (based on the id_comp column) can have multiple parents.

I have two questions regarding this:

Is this the right approach, and will it scale, if I have thousands of relations?
Given a company name (which is unique, like id_comp), how do I query for its brothers (same id_parent), its direct parent(s), and its direct children?

Answer 1:

MySQL isn't the best choice if you need to work with hierarchical data (getting all ancestors/descendants can be tricky). But if all you care about is finding direct parents/children, your table should be fine (although I might break it out into separate Company and CompanyParent tables so that the company name isn't entered multiple times).
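The split suggested above could look roughly like this. It is only a minimal sketch: Python's built-in sqlite3 module stands in for MySQL so that it runs anywhere, the column definitions and the helper direct_parent_names are assumptions made for illustration, and the rows are the ones from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (
        id_comp   INTEGER PRIMARY KEY,
        comp_name TEXT NOT NULL UNIQUE      -- each name is stored exactly once
    );
    CREATE TABLE CompanyParent (
        id_comp   INTEGER NOT NULL REFERENCES Company(id_comp),
        id_parent INTEGER NOT NULL REFERENCES Company(id_comp),
        PRIMARY KEY (id_comp, id_parent)    -- one row per child/parent pair
    );
""")

# Data from the question: names once, relations as (child, parent) pairs.
conn.executemany("INSERT INTO Company VALUES (?, ?)",
                 [(i, f"comp{i}") for i in range(1, 9)])
conn.executemany("INSERT INTO CompanyParent VALUES (?, ?)",
                 [(2, 1), (3, 2), (4, 2), (5, 2), (6, 1),
                  (3, 6), (5, 6), (7, 6), (4, 6), (8, 4)])

def direct_parent_names(name):
    """Return the names of the direct parents of the company called `name`."""
    rows = conn.execute("""
        SELECT p.comp_name
        FROM Company AS c
        JOIN CompanyParent AS r ON r.id_comp = c.id_comp
        JOIN Company AS p       ON p.id_comp = r.id_parent
        WHERE c.comp_name = ?
        ORDER BY p.comp_name
    """, (name,)).fetchall()
    return [row[0] for row in rows]

print(direct_parent_names("comp3"))  # ['comp2', 'comp6']
```

The root company (comp1) simply has no row in CompanyParent, so no NULL parent is needed.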

This would give you brothers:

select comp_name
from companies 
where id_parent in (select id_parent from companies where id_comp = @company_id)
and id_comp <> @company_id
group by comp_name;  -- a company that shares several parents is listed once

This would give you direct parents:

select p.comp_name
from companies p
join companies c on p.id_comp = c.id_parent
where c.id_comp = @company_id
group by p.comp_name;

This would give you direct children:

select c.comp_name
from companies p
join companies c on p.id_comp = c.id_parent
where p.id_comp = @company_id
group by c.comp_name;

Answer 2:

You have a simple many-to-many ("many:many") relationship. The one extra restriction, that there are no loops, is not really relevant here (nor is it checkable).

CREATE TABLE Relations (
    id_comp ...,
    id_parent ...,
    PRIMARY KEY(id_comp, id_parent),  -- for reaching "up"
    INDEX(id_parent, id_comp)         -- for reaching "down"
) ENGINE=InnoDB;

This will scale to millions, probably billions, of relations. Since a PRIMARY KEY is, by definition, UNIQUE and an INDEX, it prevents duplicate relations (1 is a parent of 2 only once) and provides an efficient way to traverse one direction. The secondary INDEX(id_parent, id_comp) covers the other direction.
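The duplicate protection can be tried out with a small sketch. Assumptions: Python's built-in sqlite3 module stands in for MySQL (so there is no ENGINE clause), the INTEGER column types and the index name idx_parent_comp are made up because the definition above leaves them elided, and add_relation is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the table definition above leaves them out.
conn.execute("""
    CREATE TABLE Relations (
        id_comp   INTEGER NOT NULL,
        id_parent INTEGER NOT NULL,
        PRIMARY KEY (id_comp, id_parent)
    )
""")
# Secondary index for walking from a parent down to its children.
conn.execute("CREATE INDEX idx_parent_comp ON Relations (id_parent, id_comp)")

def add_relation(child, parent):
    """Insert one child/parent pair; return False if that pair already exists."""
    try:
        conn.execute("INSERT INTO Relations VALUES (?, ?)", (child, parent))
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejected a duplicate pair.
        return False

first = add_relation(2, 1)
second = add_relation(2, 1)   # same pair again
other = add_relation(2, 6)    # same child, different parent is allowed
print(first, second, other)   # True False True
```

In MySQL the rejected insert would surface as a duplicate-key error instead of sqlite3.IntegrityError, but the effect on the data is the same: the pair is stored once.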

Use DISTINCT instead of GROUP BY when necessary. Do not use IN ( SELECT ...); it tends to be slow.

My Siblings:

SELECT DISTINCT their_kids.id_comp
    FROM Relations AS me
    -- match on the shared parent directly; joining through the parents' own
    -- rows would miss parents that have no parent themselves
    JOIN Relations AS their_kids  ON their_kids.id_parent = me.id_parent
    WHERE         me.id_comp = @me
      AND their_kids.id_comp != @me;

My (immediate) Parents:

SELECT me.id_parent
    FROM Relations AS me
    WHERE me.id_comp = @me;   -- each of my rows names one of my parents

My (immediate) Children:

SELECT my_kids.id_comp
    FROM Relations AS my_kids
    WHERE my_kids.id_parent = @me;   -- each row naming me as parent is one of my kids
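To see what the three lookups return for the data in the question, here is a small sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, INTEGER columns, and a single-table or single-join form of each lookup; the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Relations (
        id_comp   INTEGER NOT NULL,
        id_parent INTEGER NOT NULL,
        PRIMARY KEY (id_comp, id_parent)
    )
""")
# (child, parent) pairs from the question; the root (1) has no row of its own.
conn.executemany("INSERT INTO Relations VALUES (?, ?)",
                 [(2, 1), (3, 2), (4, 2), (5, 2), (6, 1),
                  (3, 6), (5, 6), (7, 6), (4, 6), (8, 4)])

def ids(sql, params):
    """Run a one-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql, params)]

def parents(me):
    return ids("SELECT id_parent FROM Relations "
               "WHERE id_comp = ? ORDER BY id_parent", (me,))

def children(me):
    return ids("SELECT id_comp FROM Relations "
               "WHERE id_parent = ? ORDER BY id_comp", (me,))

def siblings(me):
    # Everyone who shares at least one parent with me, excluding me.
    return ids("""
        SELECT DISTINCT kids.id_comp
        FROM Relations AS me
        JOIN Relations AS kids ON kids.id_parent = me.id_parent
        WHERE me.id_comp = ? AND kids.id_comp != ?
        ORDER BY kids.id_comp
    """, (me, me))

print(parents(3), children(6), siblings(3))  # [2, 6] [3, 4, 5, 7] [4, 5, 7]
```

comp3 has two parents (2 and 6), so its siblings are the union of both parents' other children; DISTINCT keeps 4 and 5 from being listed twice.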

Aunts, uncles, first cousins would be a bit messier. All ancestors or descendants would be much messier, and should be done with a loop in application code or a Stored Procedure.
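The application-code loop for "all descendants" might be sketched as follows. Assumptions: SQLite through Python's sqlite3 module stands in for MySQL, the columns are INTEGER, and descendants is a hypothetical helper. It issues one query per visited company, which is simple but not tuned.

```python
import sqlite3
from collections import deque

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Relations (
        id_comp   INTEGER NOT NULL,
        id_parent INTEGER NOT NULL,
        PRIMARY KEY (id_comp, id_parent)
    )
""")
# (child, parent) pairs from the question.
conn.executemany("INSERT INTO Relations VALUES (?, ?)",
                 [(2, 1), (3, 2), (4, 2), (5, 2), (6, 1),
                  (3, 6), (5, 6), (7, 6), (4, 6), (8, 4)])

def descendants(start):
    """Collect every company reachable downward from `start`, breadth first."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        rows = conn.execute(
            "SELECT id_comp FROM Relations WHERE id_parent = ?", (current,))
        for (child,) in rows.fetchall():
            if child not in seen:   # children shared by several parents are visited once
                seen.add(child)
                queue.append(child)
    return sorted(seen)

print(descendants(6))  # [3, 4, 5, 7, 8]
```

The seen set also keeps the loop from running forever if a cycle ever slips into the data. Swapping the roles of id_comp and id_parent in the query gives all ancestors instead. On MySQL 8.0 or later, a recursive common table expression (WITH RECURSIVE) is another way to do the same walk inside the database.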

Source: https://stackoverflow.com/questions/43892661/multiple-relations-parent-child-with-multiple-levels

Tags: mysql, sql, performance, relational-database, relationships