Normally I would agree with Yaakov Ellis but in this special case there is another viable solution:
Use two tables:
Table: Item
Columns: ItemID, Title, Content
Indexes: ItemID
Table: Tag
Columns: ItemID, Title
Indexes: ItemId, Title
This has some major advantages:
First it makes development much simpler: in the three-table solution for insert and update of item
you have to lookup the Tag
table to see if there are already entries. Then you have to join them with new ones. This is no trivial task.
Then it makes queries simpler (and perhaps faster). There are three major database queries which you will do: Output all Tags
for one Item
, draw a Tag-Cloud and select all items for one Tag Title.
All Tags for one Item:
3-Table:
SELECT Tag.Title
FROM Tag
JOIN ItemTag ON Tag.TagID = ItemTag.TagID
WHERE ItemTag.ItemID = :id
2-Table:
SELECT Tag.Title
FROM Tag
WHERE Tag.ItemID = :id
Tag-Cloud:
3-Table:
SELECT Tag.Title, count(*)
FROM Tag
JOIN ItemTag ON Tag.TagID = ItemTag.TagID
GROUP BY Tag.Title
2-Table:
SELECT Tag.Title, count(*)
FROM Tag
GROUP BY Tag.Title
Items for one Tag:
3-Table:
SELECT Item.*
FROM Item
JOIN ItemTag ON Item.ItemID = ItemTag.ItemID
JOIN Tag ON ItemTag.TagID = Tag.TagID
WHERE Tag.Title = :title
2-Table:
SELECT Item.*
FROM Item
JOIN Tag ON Item.ItemID = Tag.ItemID
WHERE Tag.Title = :title
But there are some drawbacks, too: It could take more space in the database (which could lead to more disk operations which is slower) and it's not normalized which could lead to inconsistencies.
The size argument is not that strong because the very nature of tags is that they are normally pretty small so the size increase is not a large one. One could argue that the query for the tag title is much faster in a small table which contains each tag only once and this certainly is true. But taking in regard the savings for not having to join and the fact that you can build a good index on them could easily compensate for this. This of course depends heavily on the size of the database you are using.
The inconsistency argument is a little moot too. Tags are free text fields and there is no expected operation like 'rename all tags "foo" to "bar"'.
So tldr: I would go for the two-table solution. (In fact I'm going to. I found this article to see if there are valid arguments against it.)