I have a SQL table like so:
Update: I'm changing the example table as the existing hierarchical nature of the original data (State, Cities, Schools) is overshadowing the fact that a simple relationship is needed between the items.
entities
id name
1 Apple
2 Orange
3 Banana
4 Carrot
5 Mushroom
I want to define two-way relationships between these entities so a user viewing one entity can see a list of all related entities.
The relationships are defined by an end user.
What is the best way to represent these relationships in the database and subsequently query and update them?
One way as I see it...
My instinct says a relationship table like so:
entity_entity
entity_id_a entity_id_b
1 2
5 1
4 1
5 4
1 3
That being the case, given a supplied entity_id of 4, how would one get all related records, which would be 1 and 5?
Likewise a query of entity_id = 1 should return 2, 3, 4, and 5.
Thanks for your time and let me know if I can clarify the question at all.
Define a constraint: entity_id_a < entity_id_b
.
Create indexes:
CREATE UNIQUE INDEX ix_a_b ON entity_entity(entity_id_a, entity_id_b);
CREATE INDEX ix_b ON entity_entity(entity_id_b);
Second index doesn't need to include entity_id_a
as you will use it only to select all a
's within one b
. RANGE SCAN
on ix_b
will be faster than a SKIP SCAN
on ix_a_b
.
Populate the table with your entities as follows:
INSERT
INTO entity_entity (entity_id_a, entity_id_b)
VALUES (LEAST(@id1, @id2), GREATEST(@id1, @id2))
Then select:
SELECT entity_id_b
FROM entity_entity
WHERE entity_id_a = @id
UNION ALL
SELECT entity_id_a
FROM entity_entity
WHERE entity_id_b = @id
UNION ALL
here lets you use above indexes and avoid extra sorting for uniqueness.
All above is valid for a symmetric and anti-reflexive relationship. That means that:
If a is related to b, then b is related to a
a is never related to a
I think the structure you have suggested is fine.
To get the related records do something like
SELECT related.* FROM entities AS search
LEFT JOIN entity_entity map ON map.entity_id_a = search.id
LEFT JOIN entities AS related ON map.entity_id_b = related.id
WHERE search.name = 'Search term'
Hope that helps.
The link table approach seems fine, except that you might want a 'relationship type' so that you know WHY they are related.
For example, the relation between Raleigh and North Carolina is not the same as a relation between Raleigh and Durham. Additionally, you may want to know who is the 'parent' in the relationship, in case you were driving conditional drop-downs. (i.e. You select a State, you get to see the cities that are in the state).
Depending on the complexity of your requirements, the simple setup you have right now may not be sufficient. If you simply need to show that two records are related in some way, the link table should be sufficient.
I already posted a way to do it in your design, but I also wanted to offer this separate design insight if you have some flexibility in your design and this more closely fits your needs.
If the items are in (non-overlapping) equivalence classes, you might want to make equivalence classes the basis for the table design, where everything in class is considered equivalent. The classes themselves can be anonymous:
CREATE TABLE equivalence_class (
class_id int -- surrogate, IDENTITY, autonumber, etc.
,entity_id int
)
entity_id
should be unique for a non-overlapping partition of your space.
This avoids the problem of ensuring proper left- or right-handed-ness or forcing an upper-right relationship matrix.
Then your query is a little different:
SELECT c2.entity_id
FROM equivalence_class c1
INNER JOIN equivalence_class c2
ON c1.entity_id = @entity_id
AND c1.class_id = c2.class_id
AND c2.entity_id <> @entity_id
or, equivalently:
SELECT c2.entity_id
FROM equivalence_class c1
INNER JOIN equivalence_class c2
ON c1.entity_id = @entity_id
AND c1.class_id = c2.class_id
AND c2.entity_id <> c1.entity_id
I can think of a few ways.
A single pass with a CASE:
SELECT DISTINCT
CASE
WHEN entity_id_a <> @entity_id THEN entity_id_a
WHEN entity_id_b <> @entity_id THEN entity_id_b
END AS equivalent_entity
FROM entity_entity
WHERE entity_id_a = @entity_id OR entity_id_b = @entity_id
Or two filtered queries UNIONed thus:
SELECT entity_id_b AS equivalent_entity
FROM entity_entity
WHERE entity_id_a = @entity_id
UNION
SELECT entity_id_a AS equivalent_entity
FROM entity_entity
WHERE entity_id_b = @entity_id
select * from entities
where entity_id in
(
select entity_id_b
from entity_entity
where entity_id_a = @lookup_value
)
Based on your updated schema this query should work:
select if(entity_id_a=:entity_id,entity_id_b,entity_id_a) as related_entity_id where :entity_id in (entity_id_a, entity_id_b)
where :entity_id is bound to the entity you are querying
My advice is that your intial table design is bad. Do not store different types of things in the same table. (First rule of database design, right up there with do not store multiple pieces of information in the same field). This is much harder to query and will cause significant performance problems down the road. Plus it would be a problem entering the data into the realtionship table - how do you know what entities would need to be realted when you do a new entry? It would be much better to design properly relational tables. Entity tables are almost always a bad idea. I see no reason at all from the example to have this type of information in one table. Frankly I'd have a university table and a related address table. It would easy to query and perform far better.
来源:https://stackoverflow.com/questions/474119/what-is-the-best-way-to-represent-a-many-to-many-relationship-between-records-in