Is storing counts of database records redundant?

Submitted by 坚强是说给别人听的谎言 on 2019-12-21 11:21:13

Question


I'm using Rails and MySQL, and have an efficiency question based on row counting.

I have a Project model that has_many :donations.

I want to count the number of unique donors for a project.

Is having a field in the projects table called num_donors, and incrementing it when a new donor is created a good idea?

Or is something like @num_donors = Donor.count(:select => 'DISTINCT user_id') going to be similar or the same in terms of efficiency thanks to database optimization? Will this require me to create indexes for user_id and any other fields I want to count?

Does the same answer hold for summing the total amount donated?


Answer 1:


To answer the title question: yes, it is redundant, but whether you should do it depends on your situation.

Unless you have known performance problems, calculate the counts and totals on the fly in your application and don't store them. That is, don't store calculated values unless you have no other choice.

In most situations you won't have to resort to this, and you shouldn't.
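
As a rough illustration of calculating on the fly, the sketch below uses plain in-memory Ruby objects as a stand-in for the donations table, so it runs without Rails or a database. The `Donation` struct, its `amount` field, and the two helper methods are assumptions made for illustration, not part of the question's schema. In a Rails app the same two figures would normally come from SQL aggregates (`COUNT(DISTINCT user_id)` and `SUM(amount)`) issued through ActiveRecord's `count` and `sum` calculations.

```ruby
# Hypothetical stand-in for a row of the donations table.
Donation = Struct.new(:project_id, :user_id, :amount)

DONATIONS = [
  Donation.new(1, 10, 25),
  Donation.new(1, 11, 50),
  Donation.new(1, 10, 5),    # repeat donor
  Donation.new(2, 12, 100)
]

# Count distinct donors for a project straight from the source rows.
def unique_donor_count(donations, project_id)
  donations.select { |d| d.project_id == project_id }
           .map(&:user_id)
           .uniq
           .size
end

# Sum the amounts donated to a project straight from the source rows.
def total_donated(donations, project_id)
  donations.select { |d| d.project_id == project_id }
           .map(&:amount)
           .inject(0, :+)
end

puts unique_donor_count(DONATIONS, 1)  # → 2
puts total_donated(DONATIONS, 1)       # → 80
```

Nothing is stored here, so there is no second number that could drift out of step with the donations themselves.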

If you must store calculated values, do the following:

  • Don't keep it up to date by incrementing it. Recalculate the count/total from all the data each time you update it.
  • If you don't have a lot of updates, put the code in an update trigger to keep the count/totals up to date.
  • The trouble with redundancy in databases is that when the numbers disagree, you are unsure which is authoritative. Add a note to the documentation that the source data is authoritative if they disagree, and that the stored values can be overwritten.
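
The first bullet can be sketched roughly as follows, with in-memory Ruby standing in for the tables. The `Project` class, its `add_donation` and `refresh_stats!` methods, and the `:amount` key are hypothetical names chosen for illustration. The stored values are always rebuilt from the full set of donations, so a missed or repeated update cannot leave them permanently wrong.

```ruby
# Hypothetical in-memory model; in Rails this would be an ActiveRecord class.
class Project
  attr_reader :donations, :num_donors, :total_amount

  def initialize
    @donations = []      # stand-in for the donations table
    @num_donors = 0      # stored (redundant) value
    @total_amount = 0    # stored (redundant) value
  end

  def add_donation(user_id, amount)
    @donations << { :user_id => user_id, :amount => amount }
    refresh_stats!
  end

  # Rebuild the stored values from all source rows instead of incrementing.
  def refresh_stats!
    @num_donors = @donations.map { |d| d[:user_id] }.uniq.size
    @total_amount = @donations.map { |d| d[:amount] }.inject(0, :+)
  end
end

project = Project.new
project.add_donation(10, 25)
project.add_donation(10, 5)    # same donor again: the donor count stays at 1
project.add_donation(11, 50)
puts project.num_donors    # → 2
puts project.total_amount  # → 80
```

A naive `num_donors += 1` on every donation would have reported 3 donors here, which is the kind of drift recalculation avoids.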



Answer 2:


While it depends on the size of your database, these are the kinds of operations that databases specialize in, so they should be fast. It's probably a case of premature optimization here - you should start by not storing the totals, thus making it simpler - and optimize later if necessary.




Answer 3:


Remember the maxim "A man with one watch always knows the time. A man with two watches is never sure." I would only store the derived number if:

Performance issues stop you from getting the derived numbers when you need them (which should not be a problem in this case since the answer is likely to be available from the indexes)

or

You have reason to believe that you are losing records from the main table through programmer error or deliberate or accidental user action. In that case, you can use the stored derived number to audit the currently calculated number.
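
The audit idea in the second condition might look like the sketch below. `audit_donor_count` is a hypothetical helper, and plain hashes stand in for database rows; it only reports when the stored figure and the freshly calculated one disagree.

```ruby
# Compare a stored count with the count recalculated from the source rows.
# Returns nil when they agree, or a hash describing the mismatch.
def audit_donor_count(stored_count, donation_rows)
  calculated = donation_rows.map { |row| row[:user_id] }.uniq.size
  return nil if stored_count == calculated

  { :stored => stored_count, :calculated => calculated }
end

# Two donors remain in the table, but the stored count still says three.
rows = [{ :user_id => 10 }, { :user_id => 11 }]

mismatch = audit_donor_count(3, rows)
puts "stored #{mismatch[:stored]}, calculated #{mismatch[:calculated]}" if mismatch
# → stored 3, calculated 2
```

A mismatch does not say which side is wrong, only that a record has gone missing or an update was skipped and the data needs a closer look.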




Answer 4:


Peter's and JohnFx's answers are sound. What you're proposing is a denormalization of your database schema, which can improve read performance at the expense of writes, while also putting the onus on the developer (or on additional DBMS cleverness) to prevent inconsistencies in your dataset.

ActiveRecord has some built-in functionality to automatically manage counts on has_many relationships. Check out the Railscast on counter caches.




Answer 5:


Do you know that a simple flag does the ActiveRecord magic?

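class ThingOwner < ActiveRecord::Base
  # The thing_owners table needs a counter column, e.g. in a migration:
  #   t.integer :things_count, :default => 0
  has_many :things
end

class Thing < ActiveRecord::Base
  # The :counter_cache flag is declared on the belongs_to side;
  # Rails then keeps thing_owners.things_count up to date.
  belongs_to :thing_owner, :counter_cache => true
end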

As for the question: yes, it is redundant. I would add such a counter if and only if things.count's share of the running time is too large.

Otherwise it's premature optimization.



Source: https://stackoverflow.com/questions/1512402/is-storing-counts-of-database-record-redundant
