Question
I have finally figured out how to use _metadata on a DataFrame. Everything works, except that I am unable to persist it, for example to HDF5 or JSON. I know it works because when I copy the frame, the attributes listed in _metadata are copied over, while attributes not listed in _metadata are not.
Example:
import pandas
df = pandas.DataFrame()  # make up a frame to your liking
pandas.DataFrame._metadata = ["testmeta"]
df.testmeta = "testmetaval"
df.badmeta = "badmetaval"
newframe = df.copy()
newframe.testmeta  # outputs "testmetaval"
newframe.badmeta  # raises AttributeError
# json test
df.to_json(Path)
revivedjsonframe = pandas.read_json(Path)
revivedjsonframe.testmeta  # raises AttributeError
# hdf5 test
revivedhdf5frame.testmeta  # returns None
The author of this answer, https://stackoverflow.com/a/25715719/4473236, says it worked for him, but I'm new to this site (and to pandas) and can't post in that thread or ask him directly.
Answer 1:
_metadata is prefixed with an underscore, which means it's not part of the public API. It's not intended for user code; we might break it in any future version of pandas without warning.
I would strongly recommend against using this "feature". For now, the best option for persisting metadata with a DataFrame is probably to write your own wrapper class and handle the persistence yourself.
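A minimal sketch of what such a wrapper could look like, assuming JSON as the storage format. The class name FrameWithMeta and its methods are hypothetical, made up for illustration; only DataFrame.to_json and pandas.read_json are real pandas calls.

```python
import io
import json
from dataclasses import dataclass, field

import pandas as pd


@dataclass
class FrameWithMeta:
    """Hypothetical wrapper: a DataFrame plus a plain dict of metadata."""

    frame: pd.DataFrame
    meta: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Keep the frame's own JSON and the metadata side by side in one document.
        payload = {
            "meta": self.meta,
            "frame": json.loads(self.frame.to_json(orient="split")),
        }
        return json.dumps(payload)

    @classmethod
    def from_json(cls, text: str) -> "FrameWithMeta":
        payload = json.loads(text)
        # Hand the frame part back to pandas; the metadata never touches pandas.
        frame_json = io.StringIO(json.dumps(payload["frame"]))
        frame = pd.read_json(frame_json, orient="split")
        return cls(frame=frame, meta=payload["meta"])


wrapped = FrameWithMeta(pd.DataFrame({"a": [1, 2]}), {"testmeta": "testmetaval"})
restored = FrameWithMeta.from_json(wrapped.to_json())
print(restored.meta["testmeta"])
```

The metadata here has to be JSON-serializable (strings, numbers, lists, dicts); anything richer would need its own encoding.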
Answer 2:
This is my code, which works using Python 3.3.3.2 64-bit:
In [69]:
import numpy as np
import pandas as pd
df = pd.DataFrame()  # make up a frame to your liking
pd.DataFrame._metadata = ["testmeta"]
print(pd.DataFrame._metadata)
df.testmeta = "testmetaval"
df.badmeta = "badmetaval"
newframe = df.copy()
print(newframe.testmeta)
print("newframe", newframe.badmeta)
df.to_json(r'c:\data\test.json')
read_json = pd.read_json(r'c:\data\test.json')
read_json.testmeta
print(pd.version.version)
print(np.version.full_version)
Out[69]:
['testmeta']
testmetaval
newframe badmetaval
0.15.2
1.9.1
JSON contents read back as a DataFrame:
In [70]:
read_json
Out[70]:
Empty DataFrame
Columns: []
Index: []
In [71]:
read_json.info()
<class 'pandas.core.frame.DataFrame'>
Float64Index: 0 entries
Empty DataFrame
In [72]:
read_json.testmeta
Out[72]:
'testmetaval'
Strangely, the JSON that is written is just an empty pair of braces:
{}
which would indicate that the metadata is not coming from the file at all, but is being propagated by the statement pd.DataFrame._metadata = ["testmeta"]
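One way to check this is to look at the serialized text directly. The sketch below uses a small made-up frame (the column "a" is an assumption, not from the code above) and only tests whether the attribute name appears anywhere in the JSON that to_json produces.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})  # hypothetical sample data
df.testmeta = "testmetaval"       # plain Python attribute, not a column

text = df.to_json()
# to_json serializes columns, index and values; the attribute is not part of it.
print("testmeta" in text)  # prints False
```

So whatever value shows up on a frame read back from that file cannot have come from the file itself.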
It still seems to work if you overwrite the attribute and also set it on a second frame:
In [75]:
df.testmeta = 'foo'
df2 = pd.DataFrame()
df2.testmeta = 'bar'
read_json = pd.read_json(r'c:\data\test.json')
print(read_json.testmeta)
print(df2.testmeta)
testmetaval
bar
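Since assigning to pd.DataFrame._metadata changes the class for every frame in the process, a narrower variant is to declare the attribute on a subclass instead. This is only a sketch, assuming the subclassing hooks described in the pandas documentation (_metadata and _constructor); the class name MetaFrame is hypothetical. It shows the value surviving copy() but still not surviving a JSON round trip.

```python
import io

import pandas as pd


class MetaFrame(pd.DataFrame):
    """Hypothetical subclass that carries one extra attribute."""

    # Names listed here are copied to results of operations such as copy().
    _metadata = ["testmeta"]

    @property
    def _constructor(self):
        # Make pandas operations return MetaFrame rather than DataFrame.
        return MetaFrame


df = MetaFrame({"a": [1, 2]})
df.testmeta = "testmetaval"

copied = df.copy()
print(copied.testmeta)

# Reading the JSON back yields a plain DataFrame without the attribute.
revived = pd.read_json(io.StringIO(df.to_json()))
print(hasattr(revived, "testmeta"))
```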
Source: https://stackoverflow.com/questions/28041762/pandas-metadata-of-dataframe-persistence-error