Cache consistency when using memcached and a rdbms like MySQL

放肆的年华 提交于 2019-12-03 12:07:52

This article gives you an interesting note on how Facebook (tries to) maintain cache consistency : http://www.25hoursaday.com/weblog/2008/08/21/HowFacebookKeepsMemcachedConsistentAcrossGeoDistributedDataCenters.aspx

Here's a gist from the article.

  1. I update my first name from "Jason" to "Monkey"
  2. We write "Monkey" in to the master database in California and delete my first name from memcache in California but not Virginia
  3. Someone goes to my profile in Virginia
  4. We find my first name in memcache and return "Jason"
  5. Replication catches up and we update the slave database with my first name as "Monkey." We also delete my first name from Virginia memcache because that cache object showed up in the replication stream
  6. Someone else goes to my profile in Virginia
  7. We don't find my first name in memcache so we read from the slave and get "Monkey"

How about using a variable save in memcache as a lock signal?

every single memcache command is atomic

after you retrieved data from db, toggle lock on

after you put data to memcache, toggle lock off

before delete from db, check lock state

The code below gives some idea of how to use Memcached's operations add, gets and cas to implement optimistic locking to ensure consistency of cache with the database.
Disclaimer: i do not guarantee that it's perfectly correct and handles all race conditions. Also consistency requirements may vary between applications.

def read(k):
  loop:
    get(k)
    if cache_value == 'updating':
      handle_too_many_retries()
      sleep()
      continue
    if cache_value == None:
      add(k, 'updating')
      gets(k)
      get_from_db(k)
      if cache_value == 'updating':
        cas(k, 'value:' + version_index(db_value) + ':' + extract_value(db_value))
      return db_value
    return extract_value(cache_value)

def write(k, v):
  set_to_db(k, v)
  loop:
    gets(k)
    if cache_value != 'updated' and cache_value != None and version_index(cache_value) >= version_index(db_value):
      break
    if cas(k, v):
      break
    handle_too_many_retries()

# for deleting we can use some 'tumbstone' as a cache value
Quinton Hsu

When you read, the following happens:

if(Key is not in cache){
  fetch data from db
  put(key,value);
}else{
  return get(key)
}

When you write, the following happens:

1 delete/update data from db
2 clear cache
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!