If I have an existing pandas dataframe, is there a way to generate Python code that, when executed in another Python script, will reproduce that dataframe?
e.g.
You can first save the dataframe you have, and then load it in another Python script when necessary. You can do this with two standard-library modules: pickle and shelve.
pickle:
import pandas as pd
import pickle
df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'],
                   'income': [40000, 50000, 42000]})
with open('dataframe', 'wb') as pfile:
    pickle.dump(df, pfile)  # save df in a file named "dataframe"
To read the dataframe in another file:
import pickle
with open('dataframe', 'rb') as pfile:
    df2 = pickle.load(pfile)  # read the dataframe stored in file "dataframe"
print(df2)
Output:
   income   user
0   40000    Bob
1   50000   Jane
2   42000  Alice
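As a shorter variant of the pickle approach, pandas also provides its own wrappers, DataFrame.to_pickle and pandas.read_pickle. The following is only a minimal sketch; the temporary directory and the file name dataframe.pkl are assumptions chosen for illustration, not part of the original answer.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'],
                   'income': [40000, 50000, 42000]})

# Write into a temporary directory so the sketch leaves no file behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'dataframe.pkl')  # hypothetical file name
    df.to_pickle(path)          # pandas wrapper that pickles the dataframe to disk
    df2 = pd.read_pickle(path)  # pandas wrapper that loads it back

# DataFrame.equals checks that shape, values, and dtypes all match.
roundtrip_ok = df.equals(df2)
```

In a real workflow the first script would call df.to_pickle with a permanent path and the second script would call pd.read_pickle with the same path.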
shelve:
import pandas as pd
import shelve
df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'],
                   'income': [40000, 50000, 42000]})
with shelve.open('dataframe2') as shelf:
    shelf['df'] = df  # store the dataframe in file "dataframe2"
To read the dataframe in another file:
import shelve
with shelve.open('dataframe2') as shelf:
    print(shelf['df'])  # read the dataframe
Output:
   income   user
0   40000    Bob
1   50000   Jane
2   42000  Alice
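Both approaches above store the dataframe in a file rather than producing Python source. If literal code is what you want, one possible sketch is to build it from DataFrame.to_dict. The helper name dataframe_to_code is hypothetical, and this only works when every value has a repr that is valid Python (plain ints, strings, and similar); NaN values, timestamps, and other special types would need extra handling.

```python
import pandas as pd

df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'],
                   'income': [40000, 50000, 42000]})


def dataframe_to_code(frame, name='df'):
    """Return Python source that rebuilds `frame` (hypothetical helper).

    Assumes every cell and index label has a repr that is valid Python.
    """
    data = frame.to_dict(orient='list')  # {column: [values, ...]}
    index = frame.index.tolist()         # row labels as a plain list
    return (
        "import pandas as pd\n"
        f"{name} = pd.DataFrame({data!r}, index={index!r})\n"
    )


code = dataframe_to_code(df)

# Execute the generated source in a fresh namespace to check the round trip.
namespace = {}
exec(code, namespace)
rebuilt = namespace['df']
```

The string in code can be pasted into another script or written to a .py file; running it there recreates the dataframe without needing any data file.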