Storing “votes” in a database

天大地大妈咪最大 提交于 2019-12-02 20:56:19

Personally as long as you have good indexes in place, you are going about it the right way. Depending on your usage, for performance you might try to avoid hitting the votes table by storing secondary count information, but overall if you must track WHO has voted something, you need to do it in the way you have listed.

I wouldn't bother moving to another database, if you are REALLY concerned in SQL Server you could create a separate filegroup to hold it.....but most likely not necessary.

Well, yes but you need to look at the bigger picture. With a million pieces of CONTENT:

(Size of Content) >> (Size of Votes) : where ">>" means "much greater."

If you have a million pieces of content then that might be a terabyte of data where as the votes are 400MB. Big deal right?

I would also add, if you are worried about scalability, check out this blog:

http://highscalability.com/

If you need to track whether a user has voted for a particular item, and if there are different values of vote (so 1 star to 5 stars, for example), then this is about as compact as it gets.

Don't forget that for sensible access speeds, you'll need to index the data (two indexes, probably - one with ContentID as the leading column, one with userID as the leading column).

You'll need to decide whether there is a reason not to store the table separately from other tables. What this means depends on the DBMS you use - with Informix, the table would be in the same database but stored in a different dbspace, and you might have the indexes stored in two other different dbspaces.

You will probably also want the ID of the author of the content in the table, for easier detection of voting abuse. (Yes, this is presumably redundant information. An alternative is regularly building a summary table to see who is voting on whom.)

For what it's worth, the perlmonks vote table looks like this:

 `vote_id` int(11) NOT NULL default '0',
 `voter_user` int(11) NOT NULL default '0',
 `voted_user` int(11) default NULL,
 `weight` int(11) NOT NULL default '0',
 `votetime` datetime NOT NULL default '0000-00-00 00:00:00',
 `ip` varchar(16) default NULL,
 PRIMARY KEY  (`vote_id`,`voter_user`),
 KEY `voter_user_idx` (`voter_user`,`votetime`),
 KEY `voted_user_idx` (`voted_user`,`votetime`)

(vote_id is the content id, ip is an IP address.)

I'd say you need to figure out how these votes will be used and design specific queries for your data model first. This is not necessarily SQL model. If you're coming from SQL world, passing through official MongoDB tutorial helps to clear the mind for the start.

For example, if you only need to store and display votes for a single issue page, it may be convenient to store votes in a single string field of the issue, which will look like id1:id2:id3:. Assuming all ids are of the same length, there are some interesting properties:

  1. Count all votes for the issue:

    len(issue.votes)/len(id)

  2. Find I voted on the issue

    myid in issue.votes

  3. Find all issues you voted on:

    select issue.id from issues where issue.votes contains(myid)

  4. Find the most voted issues

    select issue.id from issues order by len(issue.votes) desc limit 10

This architecture allows to avoid expensive calculations on read in these specific cases, but updating issue.votes on vote might be more expensive than adding a row in a table. In this case 100 votes with 4 bytes per id + separator is 500 bytes string. In your proposed variant 100 votes are 800 bytes.

Disclaimer: I never implemented anything like this, it is just an idea.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!