Related
Using SO as an example, what is the most sensible way to manage tags if you anticipate they will change often?
I would argue that there is a fourth solution which is a variation on your third solution:
Create Table Posts
(
id ...
, title ...
)
Create Table Tags
(
name varchar(30) not null primary key
, ...
)
Create Table PostTags
(
PostId ...
, TagName varchar(30) not null
, Constraint FK_PostTags_Posts
Foreign Key ( PostId )
References Posts( Id )
, Constraint FK_PostTags_Tags
Foreign Key ( TagName )
References Tags( Name )
On Update Cascade
On Delete Cascade
)
Notice that I'm using the tag name as the primary key of the Tags table. In this way, you can filter on certain tags without the extra join to the Tags table itself. In addition, if you change a tag name, it will update the names in the PostTags table. If changing a tag name is a rare occurrence, then this shouldn't be a problem. If changing a tag name is a common occurrence, then I would go with your third solution where you use a surrogate key to reference the tag.
I personally favour solution #3.
I don't agree that solution #1 is easier to mantain. Think of the situation where you have to change the name of a tag.
Solution #1:
UPDATE posts SET tag = REPLACE(tag, "oldname", "newname") WHERE tag LIKE("%oldname%")
Solution #3:
UPDATE tags SET tag = "newname" WHERE tag = "oldname"
The first one is way heavier.
Also you have to deal with the commas when deleting tags (OK, it's easily done but still, more difficult that just deleting one line in the taggings
table)
As for solution #2... is neither fish nor fowl
I think that SO uses solution #1. I'd go with either #1 or #3.
One thing to consider is if you have several thing that you can tag (e.g. adding tags to both post and products, for example). This may affect database solution.
Well I have the same doubt I adopted the third solution for my website. I know there is another way for dealing with this problem of variable-length tuples which consists in using columns as rows in this way you will have some information identifying the tuple redudant and the varying ones organized one for each row.
+--------+-------+-------------------------------------+
| postId | label | value |
+--------+-------+-------------------------------------+
| 1 | tag |C++ |
+--------+-------+-------------------------------------+
| 1 | tag |search |
+--------+-------+-------------------------------------+
| 1 | tag |code |
+--------+-------+-------------------------------------+
| 1 | title | Why is a raven like a writing desk? |
+--------+-------+-------------------------------------+
This is really bad but sometimes it's the only feasible solution, and it's very far from the relational approach.
These solutions are called mysqlicious
, scuttle
and toxi
.
This article compares benefits and drawbacks of each.