Redis Python - how to delete all keys matching a specific pattern, without iterating in Python

Backend · Unresolved · 9 answers · 515 views

暗喜 2020-12-25 10:42

I'm writing a django management command to handle some of our redis caching. Basically, I need to choose all keys that conform to a certain pattern (for example: "prefix:*") and delete them.

9 Answers
  • 2020-12-25 10:57

    From the Documentation

    delete(*names)
        Delete one or more keys specified by names
    

    This just wants an argument per key to delete and then it will tell you how many of them were found and deleted.

    In the case of your code above I believe you can just do:

        redis.delete(*x)
    

    But I will admit I am new to python and I just do:

        deleted_count = redis.delete('key1', 'key2')
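
    The point of `redis.delete(*x)` is Python's argument unpacking: it turns a list of keys into the vararg call `delete` expects. A minimal sketch (the `delete` below is a hypothetical stub that only mimics redis-py's signature, not a real client) shows the equivalence:

```python
def delete(*names):
    """Stub mimicking redis-py's delete(*names) signature; a real client
    would return the number of keys actually removed."""
    return len(names)

keys = ['prefix:1', 'prefix:2', 'prefix:3']

# delete(*keys) unpacks the list into positional arguments,
# exactly like calling delete('prefix:1', 'prefix:2', 'prefix:3')
assert delete(*keys) == delete('prefix:1', 'prefix:2', 'prefix:3') == 3
```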
    
  • 2020-12-25 10:59

    I think the

     for key in x: cache.delete(key)
    

    is pretty good and concise. Django's cache.delete really wants one key at a time, so you have to loop.

    Otherwise, this previous question and answer point you to a Lua-based solution.

  • 2020-12-25 11:01

    Use SCAN iterators: https://pypi.python.org/pypi/redis

    for key in r.scan_iter("prefix:*"):
        r.delete(key)
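
    scan_iter yields one key at a time, so the loop above issues one DEL per key. A middle ground is to group the iterator into chunks and delete each chunk in a single command; the `batched` helper below is a hypothetical sketch of that idea (not part of redis-py):

```python
from itertools import islice

def batched(iterable, n):
    """Yield successive lists of up to n items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk

# With a live client (not shown here) the deletion loop would become:
#   for chunk in batched(r.scan_iter("prefix:*"), 500):
#       r.delete(*chunk)
```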
    
  • 2020-12-25 11:05

    Here is a full working example using py-redis:

    from redis import StrictRedis
    cache = StrictRedis()
    
    def clear_ns(ns):
        """
        Clears a namespace
        :param ns: str, namespace e.g. your:prefix
        :return: int, cleared keys
        """
        count = 0
        ns_keys = ns + '*'
        for key in cache.scan_iter(ns_keys):
            cache.delete(key)
            count += 1
        return count
    

    You can also use scan_iter to collect all the keys into memory first and then pass them all to delete for a bulk delete, but that may take a good chunk of memory for larger namespaces. So it is probably best to run a delete for each key.

    Cheers!

    UPDATE:

    Since writing the answer, I have started using the pipelining feature of Redis to send all the commands in one request and avoid network latency:

    from redis import StrictRedis
    cache = StrictRedis()
    
    def clear_cache_ns(ns):
        """
        Clears a namespace in redis cache.
        This may be very time consuming.
        :param ns: str, namespace e.g. your:prefix*
        :return: int, num cleared keys
        """
        count = 0
        pipe = cache.pipeline()
        for key in cache.scan_iter(ns):
            pipe.delete(key)
            count += 1
        pipe.execute()
        return count
    

    UPDATE2 (Best Performing):

    If you use scan instead of scan_iter, you can control the chunk size and iterate through the cursor with your own logic. This also seems to be a lot faster, especially when dealing with many keys. If you add pipelining on top, you get a further 10-25% boost depending on chunk size, at the cost of memory usage, since you will not send the execute command to Redis until everything is queued. So I stuck with scan:

    from redis import StrictRedis
    cache = StrictRedis()
    CHUNK_SIZE = 5000
    
    def clear_ns(ns):
        """
        Clears a namespace
        :param ns: str, namespace e.g. your:prefix
        :return: int, number of cleared keys
        """
        count = 0
        ns_keys = ns + '*'
        cursor = 0
        while True:
            cursor, keys = cache.scan(cursor=cursor, match=ns_keys, count=CHUNK_SIZE)
            if keys:
                count += cache.delete(*keys)
            if cursor == 0:
                return count
    

    Here are some benchmarks:

    5k chunks using a busy Redis cluster:

    Done removing using scan in 4.49929285049
    Done removing using scan_iter in 98.4856731892
    Done removing using scan_iter & pipe in 66.8833789825
    Done removing using scan & pipe in 3.20298910141
    

    5k chunks and a small idle dev redis (localhost):

    Done removing using scan in 1.26654982567
    Done removing using scan_iter in 13.5976779461
    Done removing using scan_iter & pipe in 4.66061878204
    Done removing using scan & pipe in 1.13942599297
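
    The "scan & pipe" variant from the benchmark can be sketched as below. FakeRedis and FakePipeline are minimal, hypothetical in-memory stand-ins that exist only so the control flow runs without a server; with a real StrictRedis client, only clear_ns_scan_pipe would be needed.

```python
class FakePipeline:
    """Queues delete calls and flushes them in one execute(), like a pipeline."""
    def __init__(self, client):
        self.client, self.calls = client, []

    def delete(self, *names):
        self.calls.append(names)

    def execute(self):
        return [self.client.delete(*names) for names in self.calls]


class FakeRedis:
    """Tiny in-memory stand-in for a Redis client (hypothetical, for the demo)."""
    def __init__(self, keys):
        self.store = set(keys)

    def scan(self, cursor=0, match=None, count=10):
        # Real SCAN pages through the keyspace; here we return everything at once.
        prefix = match.rstrip('*') if match else ''
        return 0, [k for k in self.store if k.startswith(prefix)]

    def delete(self, *names):
        hits = [n for n in names if n in self.store]
        self.store -= set(hits)
        return len(hits)

    def pipeline(self):
        return FakePipeline(self)


def clear_ns_scan_pipe(cache, ns, chunk_size=5000):
    """SCAN in chunks, queue DELs on a pipeline, flush once per chunk."""
    cursor = 0
    removed = 0
    while True:
        cursor, keys = cache.scan(cursor=cursor, match=ns + '*', count=chunk_size)
        if keys:
            pipe = cache.pipeline()
            for key in keys:
                pipe.delete(key)
            removed += sum(pipe.execute())
        if cursor == 0:
            return removed


cache = FakeRedis({'your:prefix:1', 'your:prefix:2', 'other:1'})
removed = clear_ns_scan_pipe(cache, 'your:prefix')  # removes the two your:prefix keys
```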
    
  • 2020-12-25 11:05

    Use delete_pattern: https://niwinz.github.io/django-redis/latest/

    from django.core.cache import cache
    cache.delete_pattern("prefix:*")
    
  • 2020-12-25 11:09

    According to my tests, the scan_iter solution (as Alex Toderita wrote) takes too much time.

    Therefore, I prefer to use:

    from redis.connection import ResponseError
    
    try:
        redis_obj.eval('''return redis.call('del', unpack(redis.call('keys', ARGV[1])))''', 0, 'prefix:*')
    except ResponseError:
        pass
    

    The prefix:* is the pattern. The except is needed because DEL raises an error when the pattern matches no keys (unpack then yields zero arguments). Note also that KEYS scans the whole keyspace and blocks the server while it runs, so this is best avoided on large production datasets.


    refers to: https://stackoverflow.com/a/16974060
