How to write/read a Pandas DataFrame with MultiIndex from/to an ASCII file?

I want to create a pandas DataFrame with a MultiIndex on both the rows and the columns, and read it from an ASCII text file. My data looks like:

2 Answers

忘掉有多难

2021-02-14 10:47

You can change the print options using set_option:

display.multi_sparse: boolean
    Default True, "sparsify" MultiIndex display
    (don't display repeated elements in outer levels within groups)

Now the DataFrame will be printed as desired:

In [11]: pd.set_option('multi_sparse', False)

In [12]: df
Out[12]: 
one             A   A   A   A   A   A   A   A   A  A2  A2  A2  A2  A2  A2  A2  A2  A2
two             B   B   B  B2  B2  B2  B3  B3  B3   B   B   B  B2  B2  B2  B3  B3  B3
three           C  C2  C3   C  C2  C3   C  C2  C3   C  C2  C3   C  C2  C3   C  C2  C3
n location sex                                                                       
0 North    M    2   1   6   4   6   4   7   1   1   0   4   3   9   2   0   0   6   4
1 East     F    3   5   5   6   4   8   0   3   2   3   9   8   1   6   7   4   7   2
2 West     M    7   9   3   5   0   1   2   8   1   6   0   7   9   9   3   2   2   4
3 South    M    1   0   0   3   5   7   7   0   9   3   0   3   3   6   8   3   6   1
4 South    F    8   0   0   7   3   8   0   8   0   5   5   6   0   0   0   1   8   7
5 West     F    6   5   9   4   7   2   5   6   1   2   9   4   7   5   5   4   3   6
6 North    M    3   3   0   1   1   3   6   3   8   6   4   1   0   5   5   5   4   9
7 North    M    0   4   9   8   5   7   7   0   5   8   4   1   5   7   6   3   6   8
8 East     F    5   6   2   7   0   6   2   7   1   2   0   5   6   1   4   8   0   3
9 South    M    1   2   0   6   9   7   5   3   3   8   7   6   0   5   4   3   5   9

Note: in older pandas versions this was pd.set_printoptions(multi_sparse=False).
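
If you only want the dense header for a single printout, a context manager avoids changing the global setting. This is a minimal sketch, assuming a reasonably recent pandas; the tiny `small` frame is made up for illustration and is not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x4 frame with two column levels, for illustration only.
columns = pd.MultiIndex.from_product(
    [["A", "A2"], ["B", "B2"]], names=["one", "two"]
)
small = pd.DataFrame(np.arange(8).reshape(2, 4), columns=columns)

# option_context restores the previous setting when the block exits.
with pd.option_context("display.multi_sparse", False):
    dense_text = small.to_string()

# Outside the block the default (sparsified) display applies again.
sparse_text = small.to_string()

print(dense_text)
print(sparse_text)
```

The first printout repeats the outer labels for every column, while the second shows each outer label only once per group.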

别跟我提以往

2021-02-14 11:06
Not sure which version of pandas you are using, but with 0.7.3 you can export your DataFrame to a TSV file and retain the indices by doing this:
```
df.to_csv('mydf.tsv', sep='\t')
```
You need to export to TSV rather than CSV because the column headers contain `,` characters. This should solve the first part of your question.

The second part is trickier because, as far as I can tell, you need to know in advance what your DataFrame should contain. In particular, you need to know:
1. which columns of your TSV represent the row MultiIndex, and
2. that the rest of the columns should also be converted to a MultiIndex.

To illustrate this, let's read the TSV file we saved above back into a new DataFrame:
```
In [1]: t_df = pd.read_table('mydf.tsv', index_col=[0,1,2])
In [2]: all(t_df.index == df.index)
Out[2]: True
```
So we managed to read mydf.tsv into a DataFrame that has the same row index as the original df. But:
```
In [3]: all(t_df.columns == df.columns)
Out[3]: False
```
The reason is that pandas (as far as I can tell) has no way of parsing the header row correctly into a MultiIndex. As I mentioned above, if you know beforehand that your TSV file header represents a MultiIndex, you can do the following to fix this:
```
In [4]: from ast import literal_eval
In [5]: t_df.columns = pd.MultiIndex.from_tuples(t_df.columns.map(literal_eval).tolist(),
                                                 names=['one','two','three'])
In [6]: all(t_df.columns == df.columns)
Out[6]: True
```
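
For readers on a recent pandas release: as far as I know, `read_csv` accepts a list of row numbers for `header`, which builds the column MultiIndex directly, so the `literal_eval` step should not be needed there. A minimal sketch, assuming a modern pandas; the small `demo` frame and the in-memory buffer are made up for illustration and stand in for `df` and `mydf.tsv`.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical miniature of the frame discussed above.
index = pd.MultiIndex.from_tuples(
    [(0, "North", "M"), (1, "East", "F")], names=["n", "location", "sex"]
)
columns = pd.MultiIndex.from_product(
    [["A", "A2"], ["B", "B2"], ["C", "C2"]], names=["one", "two", "three"]
)
data = np.arange(16, dtype="int64").reshape(2, 8)
demo = pd.DataFrame(data, index=index, columns=columns)

# Write tab-separated text to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
demo.to_csv(buffer, sep="\t")
buffer.seek(0)

# Three header rows become the column levels; the first three columns
# become the row levels.
restored = pd.read_csv(buffer, sep="\t", header=[0, 1, 2], index_col=[0, 1, 2])

print(restored.columns.names)
print(restored.index.names)
```

You still have to know how many header rows and index columns the file contains, so the point above about knowing the layout beforehand still applies.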