Question
Indexing into a pandas DataFrame raises an error when both of the following hold:
- The list of labels passed to .loc contains the same label more than once.
- The index being queried is a single-level MultiIndex.
The code below shows a minimal example:
import pandas as pd
# columns as str class -- works
D = pd.DataFrame([[1,2]], columns = ['A','B'], index = ['R1'])
print(D.loc[:,['A','A']], '\n') # OK
# columns as Index class -- works
D = pd.DataFrame([[1,2]], columns = pd.Index(['A','B']), index = ['R1'])
print(D.loc[:,['A','A']], '\n') # OK
# columns as single-level MultiIndex -- fails
D = pd.DataFrame([[1,2]], columns = pd.MultiIndex.from_arrays([['A','B']]), index = ['R1'])
print(D.loc[:,['A','A']]) # error "Index._join_level on non-unique index is not implemented"
In the last case, D.columns is
MultiIndex([('A',),
            ('B',)],
           )
Why does this occur? What is the easiest way to fix it?
In my program, I do not construct the DataFrame columns directly (as I do in the simple example above); instead, they are obtained via a drop_level command, which produces the single-level MultiIndex. So simply creating the DataFrame with a plain Index is not an option.
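For reference, here is a minimal sketch of one possible way to flatten such columns after the fact, assuming the single-level MultiIndex already exists on the DataFrame. It uses get_level_values(0) to pull the only level out as a plain Index. The names flat and result are hypothetical, and this is an illustration of the idea rather than a confirmed fix for the underlying error.

```python
import pandas as pd

# Same setup as the failing case above: columns are a single-level MultiIndex.
D = pd.DataFrame(
    [[1, 2]],
    columns=pd.MultiIndex.from_arrays([['A', 'B']]),
    index=['R1'],
)

# Work on a copy (hypothetical name `flat`) and replace the columns with
# the values of level 0, which come back as a plain Index.
flat = D.copy()
flat.columns = D.columns.get_level_values(0)

# With a plain Index, selecting the same label twice behaves as in the
# first two cases of the example.
result = flat.loc[:, ['A', 'A']]
print(result)
```

Whether this fits depends on whether the rest of the program relies on the columns staying a MultiIndex.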
Source: https://stackoverflow.com/questions/65379153/flatten-a-single-leveled-multiindex