Best practice: store total values or sum them up at runtime?

悲哀的现实 2021-01-13 15:40

I have an application where there are a number of calculations for each 'job revision'; each one involves summing multiple rows linked to that revision.

What I'm wond

2 answers
  • 2021-01-13 15:51

    This is highly dependent on several factors, such as:

    • How important is accuracy? Does the figure need to be exactly accurate as of the moment the figure is displayed?
    • How important is speed of retrieval? Is it more important that the figure be displayed promptly, than that it be accurate?
    • How important is speed of writing? Is it more important that each "row" be written-to quickly?
    • How much data is involved? Are we talking about millions of rows, or hundreds?
    • How often are multiple rows updated at the same time? Are many rows of a "job revision" usually updated all at once, or is it more likely that only a single row is involved at a time?

    Each of these has different trade-offs, and there are multiple options for satisfying the balance of each.

    • You can create triggers which update the "cached" total every time a row is inserted, updated, or deleted. This ensures both accuracy and fast retrieval, but slows down writes, because every write involves an extra update of the "cache" table, which in turn takes a row-level lock to keep the number accurate. How much the lock slows down other operations depends on how often other operations on the same "job revision" happen at the same time.
    • You can just calculate the value every time. This ensures the value is accurate, and that inserts/updates are fast, but calculating the value may be inefficient and slow.
    • You can store the value in a cached table, but only update it periodically. This ensures that both reads and writes are fast, at the expense of accuracy.
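
    As a rough illustration of the first two options, here is a minimal sketch using Python's built-in sqlite3 module. The schema (job_revision, line_item, cached_total, amount) is made up for the example and is not taken from the question; your database's trigger syntax and locking behaviour will differ.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a trigger-maintained total.

    Table and column names are hypothetical, chosen for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE job_revision (
        id INTEGER PRIMARY KEY,
        cached_total INTEGER NOT NULL DEFAULT 0  -- maintained by triggers
    );
    CREATE TABLE line_item (
        id INTEGER PRIMARY KEY,
        revision_id INTEGER NOT NULL REFERENCES job_revision(id),
        amount INTEGER NOT NULL                  -- whole units, e.g. cents
    );

    CREATE TRIGGER line_item_ai AFTER INSERT ON line_item BEGIN
        UPDATE job_revision SET cached_total = cached_total + NEW.amount
        WHERE id = NEW.revision_id;
    END;

    -- Subtract from the old revision and add to the new one, so that
    -- moving a row between revisions is handled as well.
    CREATE TRIGGER line_item_au AFTER UPDATE ON line_item BEGIN
        UPDATE job_revision SET cached_total = cached_total - OLD.amount
        WHERE id = OLD.revision_id;
        UPDATE job_revision SET cached_total = cached_total + NEW.amount
        WHERE id = NEW.revision_id;
    END;

    CREATE TRIGGER line_item_ad AFTER DELETE ON line_item BEGIN
        UPDATE job_revision SET cached_total = cached_total - OLD.amount
        WHERE id = OLD.revision_id;
    END;
    """)
    return conn


def cached_total(conn, revision_id):
    """Option 1: read the stored total (a single-row lookup)."""
    row = conn.execute(
        "SELECT cached_total FROM job_revision WHERE id = ?", (revision_id,)
    ).fetchone()
    return row[0]


def computed_total(conn, revision_id):
    """Option 2: sum the rows every time the figure is needed."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM line_item WHERE revision_id = ?",
        (revision_id,),
    ).fetchone()
    return row[0]
```

    With this in place, ordinary INSERT, UPDATE and DELETE statements on line_item keep cached_total in step, and the two functions should always agree. The third option (periodic refresh) would drop the triggers and run an UPDATE based on the same SUM query on a schedule instead.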

    If you are only dealing with a handful of rows, there is not much difference between any of these, and "just calculate it every time" is fine. If there is no actual need to make things faster, calculating every time is preferred, because it keeps the data "more normal" (nothing derived is stored, so nothing can drift).

    As always: don't just ask for what's good in the general case. Profile your own code, see where the bottlenecks are, and tend to follow the rule "Don't, yet".
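
    One way to put a number on "is calculating every time actually slow?" is to time the SUM query against a realistic row count before adding any cache. The sketch below again assumes a hypothetical line_item table and uses SQLite only because it ships with Python; measure against your real database and data volumes.

```python
import sqlite3
import time


def build_db(rows_per_revision, revisions=10):
    """Fill a throwaway database with rows_per_revision rows per revision."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE line_item ("
        "id INTEGER PRIMARY KEY, revision_id INTEGER NOT NULL, "
        "amount INTEGER NOT NULL)"
    )
    # An index on the grouping column is usually the first thing to try.
    conn.execute("CREATE INDEX idx_line_item_rev ON line_item (revision_id)")
    conn.executemany(
        "INSERT INTO line_item (revision_id, amount) VALUES (?, 1)",
        ((rev,) for rev in range(1, revisions + 1)
         for _ in range(rows_per_revision)),
    )
    return conn


def time_sum(conn, revision_id, repeats=100):
    """Return (total, average seconds per query) for the runtime SUM."""
    total = 0
    start = time.perf_counter()
    for _ in range(repeats):
        total = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM line_item "
            "WHERE revision_id = ?",
            (revision_id,),
        ).fetchone()[0]
    elapsed = time.perf_counter() - start
    return total, elapsed / repeats


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        total, avg = time_sum(build_db(n), revision_id=1)
        print(f"{n} rows per revision: {avg * 1e6:.1f} microseconds per SUM")
```

    If the per-query time is already far below what a user would notice, that is a good sign that "Don't, yet" applies.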

  • 2021-01-13 16:06

    It depends.

    If there's a convenient way to work the total out as you go along, then you don't need a potentially expensive query to calculate it. The big problem with pre-calculating, though, is the extra complexity you can be lumbered with in maintaining both the totals and the data.

    It's a design compromise, so there's no right answer about where you'll end up. All I would say is that you should treat pre-calculation of totals as an optimisation, so if you don't yet know that you need it, it's a premature optimisation.

    If things get out of step, you can end up with a significant amount of work squaring up your data. If they are getting out of step through unfixed bugs or unplugged gaps, it's a significant and ongoing amount of work.
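
    If you do store totals, a periodic reconciliation check is one way to find out when they have drifted. This is only a sketch: the job_revision.cached_total and line_item.amount names are invented for the example, and SQLite stands in for whatever database you actually use.

```python
import sqlite3


def make_db():
    """Hypothetical schema with a stored total and no triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE job_revision (
        id INTEGER PRIMARY KEY,
        cached_total INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE line_item (
        id INTEGER PRIMARY KEY,
        revision_id INTEGER NOT NULL REFERENCES job_revision(id),
        amount INTEGER NOT NULL
    );
    """)
    return conn


def find_drift(conn):
    """Return (revision id, stored total, actual total) for every mismatch."""
    return conn.execute("""
        SELECT r.id, r.cached_total, COALESCE(SUM(i.amount), 0) AS actual
        FROM job_revision AS r
        LEFT JOIN line_item AS i ON i.revision_id = r.id
        GROUP BY r.id, r.cached_total
        HAVING r.cached_total <> COALESCE(SUM(i.amount), 0)
        ORDER BY r.id
    """).fetchall()


def repair_drift(conn):
    """Overwrite mismatched stored totals; return how many rows were fixed."""
    cur = conn.execute("""
        UPDATE job_revision
        SET cached_total = (
            SELECT COALESCE(SUM(amount), 0) FROM line_item
            WHERE revision_id = job_revision.id
        )
        WHERE cached_total <> (
            SELECT COALESCE(SUM(amount), 0) FROM line_item
            WHERE revision_id = job_revision.id
        )
    """)
    return cur.rowcount
```

    Repairing the numbers only squares up the data; if find_drift keeps returning rows, the bug or gap that lets the totals drift is still there.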

    It's not something you should need, or want, to rush into.
