Most efficient database design for a blog (posts and comments)

你的背包 2021-01-31 12:15

What would be the best way of designing a database to store blog posts and comments? I am currently thinking one table for posts, and another for comments, each with a post ID.

5 answers
  • 2021-01-31 12:34

    It seems to me, however, trawling through a large table of comments

    All the database vendors agree with you.

    They offer "indexes" to limit this.

  • 2021-01-31 12:42

    Every database system you would be using to implement your blog will use indexing. What this means is that, rather than "trawling through a large table", your database system maintains a separate list of comments and which posts they are associated with, much like the index at the back of a book. This allows the database system to load the comments associated with a post extremely quickly, and I don't see any problems with your proposed design for a blog of any size.

    Indexes are routinely used to associate tables with millions of rows with other tables with millions of rows. You would have to have an exceptionally large blog to require denormalization of comments, and even then, caching would probably serve you far better than denormalizing the database.

    You will need to define an index on your comments table, and associate it with whatever column holds the Post ID. How that's done is dependent on what database system you are using.
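As a minimal sketch of that last step, assuming SQLite (through Python's built-in `sqlite3` module) as the database system: the table and column names `posts`, `comments` and `post_id`, and the helper `query_plan`, are hypothetical and not taken from the question. The sketch defines the index and then asks SQLite how it would run a per-post lookup.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE comments (
        id      INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        body    TEXT NOT NULL
    );
    -- Index on the column that holds the post ID (hypothetical name).
    CREATE INDEX idx_comments_post_id ON comments(post_id);
""")


def query_plan(sql, params=()):
    """Return SQLite's description of how it would execute a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan = query_plan("SELECT body FROM comments WHERE post_id = ?", (1,))
print(plan)
```

If the index is being used, the printed plan should mention `idx_comments_post_id` rather than a full scan of `comments`. Other database systems have their own syntax and their own way of showing the plan.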

  • 2021-01-31 12:45

    Okay, let's see.

    trawling through a large table of comments to find those for the relevant post would be expensive

    Why do you think it'd be expensive? Presumably because you assume a linear search is performed every time, taking O(n) time. For a billion comments, that would mean a billion iterations.

    Now suppose a balanced binary search tree is constructed for comment_ID. To look up any comment, you need log(n) time [base 2]. So even for 1 billion comments, only around 30 iterations will be needed.

    Now consider a slightly modified BST, where each node holds k sorted elements instead of 1 and has up to k+1 child nodes. The ordering property of a BST still holds in this data structure. What we've got here is called a B-tree. Further reading: GeeksForGeeks - B Tree Introduction

    For a B-tree, the lookup time is roughly log(n) [base k]. Hence, if k=10, for 1 billion entries, only about 9 iterations will be needed.
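The arithmetic above can be checked with a short sketch. The helper `lookup_steps` is hypothetical: it counts how many levels a tree with a given fan-out needs before it can cover n entries, which is the rounded-up logarithm computed in integer arithmetic to avoid floating-point rounding.

```python
def lookup_steps(n_entries, fanout):
    """Smallest number of levels h such that fanout**h >= n_entries."""
    steps, reach = 0, 1
    while reach < n_entries:
        reach *= fanout  # each extra level multiplies the reachable entries
        steps += 1
    return steps


n = 1_000_000_000                 # one billion comments
linear = n                        # worst case for a linear scan
bst = lookup_steps(n, 2)          # balanced binary search tree
btree = lookup_steps(n, 10)       # B-tree, taking k = 10 as in the text

print(linear, bst, btree)
```

For a billion entries this gives roughly 30 steps at fan-out 2 and 9 at fan-out 10. Real B-tree nodes are usually sized to a disk page and hold far more than 10 keys, so the tree tends to be shallower still.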

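    Most relational databases store primary-key indexes in B-trees (or a close variant such as B+ trees), and the same kind of index can be built on the post ID column of the comments table. Hence, the stated task would not be expensive, and you should go ahead and design the database the way that seemed obvious.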

    PS: You can build an index on any column of a table. Primary-key indexes are created by default. But be careful not to create unnecessary indexes, as they take up disk space and slow down writes.

  • 2021-01-31 12:46

    trawling through a large table of comments to find those for the relevant post would be expensive,

    An index is always there to rescue you! Create one index on postId first, and another on commentdate (desc).
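One way to read this suggestion is as a single composite index with postId first and commentdate descending second. That reading is an assumption, as are SQLite, the `comments` table, and the `commentId` and `body` columns. A minimal sketch under those assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        commentId   INTEGER PRIMARY KEY,
        postId      INTEGER NOT NULL,
        commentdate TEXT NOT NULL,   -- ISO-style text sorts chronologically
        body        TEXT NOT NULL
    );
    -- Composite index: filter by post, then read newest comments first.
    CREATE INDEX idx_comments_post_date ON comments(postId, commentdate DESC);
""")
conn.executemany(
    "INSERT INTO comments (postId, commentdate, body) VALUES (?, ?, ?)",
    [
        (1, "2021-01-31 12:34", "first"),
        (2, "2021-01-31 12:40", "other post"),
        (1, "2021-01-31 12:50", "latest"),
    ],
)

query = "SELECT body FROM comments WHERE postId = ? ORDER BY commentdate DESC"
newest_first = [row[0] for row in conn.execute(query, (1,))]
print(newest_first)

# Ask SQLite how it would run the query; the last column is the plan text.
plan = " | ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1,))
)
print(plan)
```

With the composite index, the database can locate one post's comments and return them in date order from the same structure. Two separate single-column indexes, the other possible reading, would generally help the filter or the sort but not both at once.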

  • 2021-01-31 12:50

    Try something like this:

    Blog
    BlogID     int auto number PK
    BlogName   string
    ...
    
    BlogPost
    BlogPostID   int auto number PK
    BlogID       int FK to Blog.BlogID, index
    BlogContent  string
    ....
    
    Comment
    CommentID         int auto number PK
    BlogPostID        int FK to BlogPost.BlogPostID, index
    ReplyToCommentID  int FK to Comment.CommentID (for comments on comments)
    ...
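The outline above could be written out roughly as follows. This is a sketch that assumes SQLite through Python's `sqlite3` module; the `CommentText` column and the index names are hypothetical stand-ins for the elided parts, and `ReplyToCommentID` is assumed to be NULL for top-level comments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Blog (
        BlogID   INTEGER PRIMARY KEY AUTOINCREMENT,
        BlogName TEXT NOT NULL
    );
    CREATE TABLE BlogPost (
        BlogPostID  INTEGER PRIMARY KEY AUTOINCREMENT,
        BlogID      INTEGER NOT NULL REFERENCES Blog(BlogID),
        BlogContent TEXT NOT NULL
    );
    CREATE INDEX idx_BlogPost_BlogID ON BlogPost(BlogID);

    CREATE TABLE Comment (
        CommentID        INTEGER PRIMARY KEY AUTOINCREMENT,
        BlogPostID       INTEGER NOT NULL REFERENCES BlogPost(BlogPostID),
        ReplyToCommentID INTEGER REFERENCES Comment(CommentID),  -- NULL = top level
        CommentText      TEXT NOT NULL                           -- hypothetical column
    );
    CREATE INDEX idx_Comment_BlogPostID ON Comment(BlogPostID);
""")

# One blog, one post, one top-level comment and one reply to it.
blog_id = conn.execute(
    "INSERT INTO Blog (BlogName) VALUES (?)", ("Example blog",)
).lastrowid
post_id = conn.execute(
    "INSERT INTO BlogPost (BlogID, BlogContent) VALUES (?, ?)", (blog_id, "Hello")
).lastrowid
top_id = conn.execute(
    "INSERT INTO Comment (BlogPostID, ReplyToCommentID, CommentText) "
    "VALUES (?, NULL, ?)",
    (post_id, "Nice post"),
).lastrowid
conn.execute(
    "INSERT INTO Comment (BlogPostID, ReplyToCommentID, CommentText) "
    "VALUES (?, ?, ?)",
    (post_id, top_id, "Agreed"),
)

# All comments for one post, in insertion order, with their parent (if any).
rows = conn.execute(
    "SELECT CommentText, ReplyToCommentID FROM Comment "
    "WHERE BlogPostID = ? ORDER BY CommentID",
    (post_id,),
).fetchall()
print(rows)
```

The self-referencing `ReplyToCommentID` keeps replies in the same table as top-level comments, so one indexed lookup on `BlogPostID` fetches the whole thread and the application can nest the rows afterwards.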
    