Optimizing queries for the next and previous element

前端 未结 11 910
耶瑟儿~
耶瑟儿~ 2021-01-30 11:33

I am looking for the best way to retrieve the next and previous records of a record without running a full query. I have a fully implemented solution in place, and would like to

11条回答
  •  鱼传尺愫
    2021-01-30 11:44

    In general, I denormalize the data from the indexes. They may be stored in the same rows, but I almost always retrieve my result IDs, then make a separate trip for the data. This makes caching the data very simple. It's not so important in PHP where the latency is low and the bandwidth high, but such a strategy is very useful when you have a high latency, low bandwidth application, such as an AJAX website where much of the site is rendered in JavaScript.

    I always cache the lists of results, and the results themselves separately. If anything affects the results of a list query, the cache of the list results is refreshed. If anything affects the results themselves, those particular results are refreshed. This allows me to update either one without having to regenerate everything, resulting in effective caching.

    Since my lists of results rarely change, I generate all the lists at the same time. This may make the initial response slightly slower, but it simplifies cache refreshing (all the lists get stored in a single cache entry).

    Because I have the entire list cached, it's trivial to find neighbouring items without revisiting the database. With luck, the data for those items will also be cached. This is especially handy when sorting data in JavaScript. If I already have a copy cached on the client, I can resort instantly.

    To answer your questions specifically:

    • Yes, it's a fantastic idea to find out the neighbours ahead of time, or whatever information the client is likely to access next, especially if the cost is low now and the cost to recalculate is high. Then it's simply a trade off of extra pre-calculation and storage versus speed.
    • In terms of performance and simplicity, avoid tying things together that are logically different things. Indexes and data are different, are likely to be changed at different times (e.g. adding a new datum will affect the indexes, but not the existing data), and thus should be accessed separately. This may be slightly less efficient from a single-threaded standpoint, but every time you tie something together, you lose caching effectiveness and asychronosity (the key to scaling is asychronosity).
    • The term for getting data ahead of time is pre-fetching. Pre-fetching can happen at the time of access or in the background, but before the pre-fetched data is actually needed. Likewise with pre-calculation. It's a trade-off of cost now, storage cost, and cost to get when needed.
    • "Sorting cache" is an apt name.
    • I don't know.

    Also, when you cache things, cache them at the most generic level possible. Some stuff might be user specific (such as results for a search query), where others might be user agnostic, such as browsing a catalog. Both can benefit from caching. The catalog query might be frequent and save a little each time, and the search query may be expensive and save a lot a few times.

提交回复
热议问题