After setting a DataFrame in Redis and then getting it back, Redis returns a string, and I can't figure out a way to convert this str back to a DataFrame.
How can I do this?
I couldn't use msgpack because of Decimal objects in my dataframe. Instead, I combined pickle and zlib like this, assuming a dataframe df and a local instance of Redis:
import pickle
import redis
import zlib
EXPIRATION_SECONDS = 600
r = redis.StrictRedis(host='localhost', port=6379, db=0)
# Set
r.setex("key", EXPIRATION_SECONDS, zlib.compress(pickle.dumps(df)))
# Get
rehydrated_df = pickle.loads(zlib.decompress(r.get("key")))
There isn't anything dataframe-specific about this.
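As a rough illustration of that point, the pickle and zlib steps can be wrapped in two small helpers and tried without Redis at all. The names `pack` and `unpack` are hypothetical, and the list of dicts holding Decimal values is only a stand-in for a dataframe; a DataFrame should pass through the same two calls. Note that `pickle.loads` should only be used on data you trust.

```python
import pickle
import zlib
from decimal import Decimal


def pack(obj):
    """Pickle an object, then compress the resulting bytes (hypothetical helper)."""
    return zlib.compress(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def unpack(blob):
    """Reverse pack(): decompress, then unpickle. Only use on trusted data."""
    return pickle.loads(zlib.decompress(blob))


# Stand-in payload with Decimal values, which pickle handles natively.
rows = [{"price": Decimal("19.99")}, {"price": Decimal("0.10")}]

blob = pack(rows)        # bytes, suitable as a Redis value
restored = unpack(blob)  # same structure, Decimal objects intact
```

With a Redis client `r`, the value passed to `r.setex(...)` would be `pack(df)` and the result of `r.get(...)` would go through `unpack`.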
Caveats:
msgpack is better; use it if it works for you.
set:
redisConn.set("key", df.to_msgpack(compress='zlib'))
get:
pd.read_msgpack(redisConn.get("key"))
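import io
import pandas as pd
import redis
# Create a client; calling setex/get on the redis module itself does not work
r = redis.StrictRedis(host='localhost', port=6379, db=0)
df = pd.DataFrame([1, 2])
r.setex('df', 100, df.to_json())
# Redis returns bytes, so decode before parsing the JSON
df = pd.read_json(io.StringIO(r.get('df').decode('utf-8')))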
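The JSON route can be checked without a Redis server by doing the encode and decode steps on their own. This is a minimal sketch: the helper names `df_to_json_bytes` and `df_from_json_bytes` are hypothetical, and `orient="split"` is an assumption chosen here so that the columns and index are written out explicitly. The bytes step mirrors what a redis-py client returns when `decode_responses` is not set.

```python
import io

import pandas as pd


def df_to_json_bytes(df):
    """Serialize a DataFrame to UTF-8 encoded JSON (hypothetical helper)."""
    return df.to_json(orient="split").encode("utf-8")


def df_from_json_bytes(blob):
    """Rebuild a DataFrame from the bytes produced by df_to_json_bytes."""
    return pd.read_json(io.StringIO(blob.decode("utf-8")), orient="split")


df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
blob = df_to_json_bytes(df)          # what would be stored in Redis
restored = df_from_json_bytes(blob)  # what would be rebuilt after a get
```

JSON does not carry every pandas dtype, so it is worth comparing `restored.dtypes` with `df.dtypes` for your own data before relying on this.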
For caching a dataframe use this.
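import pandas as pd
import pyarrow as pa
import redis

def cache_df(alias, df):
    # 'host', 'port' and 'db' are placeholders for your own connection settings
    pool = redis.ConnectionPool(host='host', port='port', db='db')
    cur = redis.Redis(connection_pool=pool)
    context = pa.default_serialization_context()
    # Serialize the dataframe to bytes so it can be stored as a Redis value
    df_compressed = context.serialize(df).to_buffer().to_pybytes()
    res = cur.set(alias, df_compressed)
    if res:
        print('df cached')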
For fetching the cached dataframe use this.
def get_cached_df(alias):
    pool = redis.ConnectionPool(host='host', port='port', db='db')
    cur = redis.Redis(connection_pool=pool)
    context = pa.default_serialization_context()
    all_keys = [key.decode("utf-8") for key in cur.keys()]
    if alias in all_keys:
        result = cur.get(alias)
        dataframe = pd.DataFrame.from_dict(context.deserialize(result))
        return dataframe
    return None
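Listing every key just to test for one alias gets expensive as the database grows. Since redis-py's `get` returns None for a missing key, the lookup can probably be reduced to a single call. The sketch below assumes that behaviour and uses a hypothetical `InMemoryClient` in place of a real connection so it runs anywhere; pickle is swapped in for the pyarrow serialization purely to keep the example dependency-free, and the helper names are illustrative.

```python
import pickle


class InMemoryClient:
    """Hypothetical stand-in exposing only the set/get calls used below."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value
        return True

    def get(self, key):
        # Mirrors redis-py: a missing key yields None.
        return self._store.get(key)


def cache_obj(client, alias, obj):
    """Store a pickled object under alias; return True on success."""
    return bool(client.set(alias, pickle.dumps(obj)))


def get_cached_obj(client, alias):
    """Return the cached object, or None if alias is not present."""
    blob = client.get(alias)
    if blob is None:
        return None
    return pickle.loads(blob)  # only unpickle data you trust


client = InMemoryClient()
cache_obj(client, "prices", {"a": [1, 2]})
hit = get_cached_obj(client, "prices")    # the stored object
miss = get_cached_obj(client, "missing")  # None, no key scan needed
```

Passing a real `redis.Redis(...)` instance as `client` should work the same way, since only `set` and `get` are used.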