How to set/get pandas.DataFrame to/from Redis?

孤城傲影 2020-12-23 14:38

After setting a DataFrame in Redis and then getting it back, Redis returns a string, and I can't figure out how to convert that string back into a DataFrame.

How can I do this?

4 answers
  • 2020-12-23 15:03

    I couldn't use msgpack because of Decimal objects in my dataframe. Instead I combined pickle and zlib together like this, assuming a dataframe df and a local instance of Redis:

    import pickle
    import redis
    import zlib
    
    EXPIRATION_SECONDS = 600
    
    r = redis.StrictRedis(host='localhost', port=6379, db=0)
    
    # Set
    r.setex("key", EXPIRATION_SECONDS, zlib.compress(pickle.dumps(df)))
    
    # Get
    rehydrated_df = pickle.loads(zlib.decompress(r.get("key")))
    

    Nothing here is DataFrame-specific: the same approach works for any picklable object.

    Caveats

    • the other answer using msgpack is better -- use it if it works for you
    • pickling can be dangerous -- your Redis server needs to be secure or you're asking for trouble
  • 2020-12-23 15:06

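    set (to_msgpack and read_msgpack were deprecated in pandas 0.25 and removed in 1.0, so this needs an older pandas):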

    redisConn.set("key", df.to_msgpack(compress='zlib'))
    

    get:

    pd.read_msgpack(redisConn.get("key"))
    
  • 2020-12-23 15:18
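    import io
    import pandas as pd
    import redis

    # Create a client; calling setex/get on the redis module itself does not work.
    r = redis.Redis(host='localhost', port=6379, db=0)
    df = pd.DataFrame([1, 2])
    r.setex('df', 100, df.to_json())  # expires after 100 seconds
    raw = r.get('df')                 # redis-py returns bytes by default
    df = pd.read_json(io.StringIO(raw.decode('utf-8')))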
    
  • 2020-12-23 15:24

    For caching a dataframe use this.

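    import pandas as pd
    import pyarrow as pa
    import redis

    # Note: default_serialization_context() has been deprecated since pyarrow 2.0
    # and may be missing from newer releases.
    def cache_df(alias, df):
        # Placeholder connection settings; replace with your own host, port and db.
        pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
        cur = redis.Redis(connection_pool=pool)
        context = pa.default_serialization_context()
        # Despite the name, these bytes are serialized, not compressed.
        df_compressed = context.serialize(df).to_buffer().to_pybytes()

        res = cur.set(alias, df_compressed)
        if res:
            print('df cached')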
    

    For fetching the cached dataframe use this.

    def get_cached_df(alias):
        # Placeholder connection settings; replace with your own host, port and db.
        pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
        cur = redis.Redis(connection_pool=pool)
        context = pa.default_serialization_context()

        # GET returns None for a missing key, so listing every key first is unnecessary.
        result = cur.get(alias)
        if result is None:
            return None

        dataframe = pd.DataFrame.from_dict(context.deserialize(result))

        return dataframe
    