I've created a forum, and we're implementing an APC and memcache caching solution to save the database some work.
I started implementing the cache layer with keys like
One possible solution is not to paginate the cache of threads in a forum, but rather to put the thread information into a single key such as Forum::getThreads|$iForumId. Then, in your PHP code, pull out only the ones you want for the given page, e.g.
$page = 2;
$threads_per_page = 25;
// Pages are 1-based, so page 1 starts at offset 0
$start_thread = ($page - 1) * $threads_per_page;

// Pull all of the forum's threads from cache
// (assuming $cache class for memcache interface..)
$threads = $cache->get("Forum::getThreads|$iForumId");

// Only take the ones we need for this page
for ($i = $start_thread; $i < $start_thread + $threads_per_page; $i++)
{
    if (!isset($threads[$i])) {
        break; // ran past the last thread in this forum
    }
    // Thread display logic here...
    showThread($threads[$i]);
}
This means you do a bit more work pulling them out on each page, but you now only have to worry about invalidating the cache in one place, when a thread is updated or added.
Be very careful about doing this kind of optimisation without having hard facts to measure against.
Most databases have several levels of caches. If these are tuned correctly, the database will probably do a much better job at caching than you can do yourself.
You're essentially trying to cache a view, which is always going to get tricky. You should instead try to cache data only, because data rarely changes. Don't cache a forum; cache the thread rows. Then your DB call should just return a list of IDs, which you already have in your cache. The DB call will be lightning fast on any MyISAM table, and then you don't have to do a big join, which eats DB memory.
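A rough sketch of that approach (the $db helpers and the Thread|<id> key scheme here are just illustrative assumptions, and $cache is assumed to wrap Memcache, whose get() accepts an array of keys):

// The DB only returns the thread IDs for the requested page
$ids = $db->getCol(
    "SELECT id FROM threads WHERE forum_id = ?
     ORDER BY last_post_time DESC LIMIT 25 OFFSET 25",
    array($iForumId)
); // $db->getCol() is a placeholder helper returning a flat array of IDs

// Build the per-thread cache keys and fetch them all in one multi-get
$keys = array();
foreach ($ids as $id) {
    $keys[] = "Thread|$id";
}
$cached = $cache->get($keys);

// Fill any misses from the database and re-cache the individual rows
foreach ($ids as $id) {
    $key = "Thread|$id";
    if (!isset($cached[$key])) {
        $row = $db->getRow("SELECT * FROM threads WHERE id = ?", array($id)); // placeholder helper
        $cache->set($key, $row);
        $cached[$key] = $row;
    }
    showThread($cached[$key]);
}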
Just an update: I decided that Josh's point on data usage was a very good one. People are unlikely to keep viewing page 50 of a forum.
Based on this model, I decided to cache the 90 latest threads in each forum. In the fetching function I check the limit and offset to see if the specified slice of threads is within cache or not. If it is within the cache limit, I use array_slice() to retrieve the right part and return it.
This way, I can use a single cache key per forum, and it takes very little effort to clear/update the cache :-)
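Roughly, the fetching function looks something like this (the DB helpers are placeholders, not my actual code):

define('FORUM_CACHE_DEPTH', 90); // number of latest threads kept per forum

function getThreads($iForumId, $iOffset, $iLimit)
{
    global $cache;

    // Serve from the per-forum cache when the slice falls inside the cached window
    if ($iOffset + $iLimit <= FORUM_CACHE_DEPTH) {
        $threads = $cache->get("Forum::getThreads|$iForumId");
        if ($threads === false) {
            // Cache miss: load the newest threads once, store them, then slice
            $threads = fetchLatestThreadsFromDb($iForumId, FORUM_CACHE_DEPTH); // placeholder helper
            $cache->set("Forum::getThreads|$iForumId", $threads);
        }
        return array_slice($threads, $iOffset, $iLimit);
    }

    // Deep pages (beyond the cached window) go straight to the database
    return fetchThreadsFromDb($iForumId, $iOffset, $iLimit); // placeholder helper
}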
I'd also like to point out that in other more resource heavy queries, I went with flungabunga's model, storing the relations between keys. Unfortunately Stack Overflow won't let me accept two answers.
Thanks!
You might also want to look at the cost of storing the cache data, in terms of your effort and CPU cost, against what the cache will actually buy you.
If you find that 80% of your forum views are looking at the first page of threads, then you could decide to cache that page only. That would mean both cache reads and writes are much simpler to implement (there's a minimal sketch of this below).
Likewise with the list of a user's favourite threads. If this is something that each person visits only rarely, then caching it might not improve performance much.
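A minimal sketch of the first-page-only idea, assuming a Memcache-style $cache and a placeholder DB helper:

function getFirstPageOfThreads($iForumId)
{
    global $cache;

    $key = "Forum::firstPage|$iForumId";
    $threads = $cache->get($key);
    if ($threads === false) {
        $threads = fetchThreadsFromDb($iForumId, 0, 25); // placeholder DB helper
        $cache->set($key, $threads, 0, 300);             // expire after 5 minutes
    }
    return $threads;
}

// Requests for page 2 and beyond simply query the database directly.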
In response to flungabunga:
Another way to implement grouping is to put the group name plus a sequence number into the keys themselves, and increment the sequence number to "clear" the group. You store the current valid sequence number for each group in its own key (a rough PHP sketch follows the example below).
e.g.
get seqno_mygroup
23
get mygroup23_mykey
<mykeydata...>
get mygroup23_mykey2
<mykey2data...>
Then to "delete" the group simply:
incr seqno_mygroup
Voila:
get seqno_mygroup
24
get mygroup24_mykey
...empty
etc..
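In PHP, that trick might look roughly like this (assuming $cache is a Memcache-style object; the names are illustrative):

// Build a namespaced key from the group's current sequence number
function groupKey($cache, $group, $key)
{
    $seq = $cache->get("seqno_$group");
    if ($seq === false) {
        $seq = 1;
        $cache->set("seqno_$group", $seq);
    }
    return "{$group}{$seq}_{$key}";
}

// Normal reads and writes go through the namespaced key
$cache->set(groupKey($cache, 'mygroup', 'mykey'), $data);
$value = $cache->get(groupKey($cache, 'mygroup', 'mykey'));

// "Delete" the whole group by bumping the sequence number;
// the old mygroup23_* entries become unreachable and eventually expire.
$cache->increment("seqno_mygroup");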