I have a PHP web application which uses a MySQL database for object tagging, in which I\'ve used the tag structure accepted as the answer to this SO question.
I\'d l
I would use some kind of array to store the children tags, this should be a lot faster than joining a table on itself (especially if you have a large number of tags). I had a look, and I can't tell if mysql has a native array data type, but you can emulate this by using a text column and storing a serialized array in it. If you want to speed things up further, you should be able to put a text search index on that column to find out which tags are related.
[Edit] After reading Ali's article, I did some more hunting and found this presentation on a bunch of approaches for implementing hierarchies in postgres. Might still be helpful for explanatory purposes.
I implemented it using two columns. I simplify it here a little, because I had to keep the tag name in a separate field/table because I had to localize it for different languages:
Look at these rows for example:
tag path
--- ----
database database/
mysql database/mysql/
mysql4 database/mysql/mysql4/
mysql4-1 database/mysql/mysql4-1/
oracle database/oracle/
sqlserver database/sqlserver/
sqlserver2005 database/sqlserver/sqlserver2005/
sqlserver2005 database/sqlserver/sqlserver2008/
etc.
Using the like
operator on the path field you can easily get all needed tag rows:
SELECT * FROM tags WHERE path LIKE 'database/%'
There are some implementation details like when you move a node in the hierarchy you have to change all children too etc., but it's not hard.
Also make sure that the length of your path is long enough - in my case I used not the tag name for the path, but another field to make sure that I don't get too long paths.
You could build what Kimball calls a Hierarchy Helper Table.
Say you hierarchy looks like this: A -> B | B -> C | C -> D
you'd insert records into a table that looks like this
ParentID, ChildID, Depth, Highest Flag, Lowest Flag
A, A, 0, Y, N
A, B, 1, N, N
A, C, 2, N, N
A, D, 3, N, Y
B, B, 0, N, N
B, C, 1, N, N
B, D, 2, N, Y
C, C, 0, N, N
C, D, 1, N, Y
D, D, 0. N, Y
I think I have that correct.... anyways. The point is you still store you hierarchy correctly, you just build this table FROM your proper table. THIS table queries like a Banshee. Say you want to know what all the first level below B are.
WHERE parentID = 'B' and Depth = 1
A few ways here
Ali's answer has a link to Joe Celko's Trees and Hierarchies in SQL for Smarties, which confirms my suspicion - there isn't a simple database structure that offers the best of all worlds. The best for my purpose seems to be the "Frequent Insertion Tree" detailed in this book, which is like the "Nested Set Model" of Ali's link, but with non-consecutive indexing. This allows O(1) insertion (a la unstructured BASIC line numbering), with occasional index reorganisation as and when needed.