Pandas KeyError: value not in index

伪装坚强ぢ 2020-12-08 14:06

I have the following code:

df = pd.read_csv(CsvFileName)

p = df.pivot_table(index=['Hour'], columns='DOW', values='Changes', aggfunc=np.mean).round(0)


        
4 Answers
  • 2020-12-08 14:52

    Use reindex to get all the columns you need. It preserves the ones that are already there and adds empty (NaN-filled) columns for the ones that are missing.

    p = p.reindex(columns=['1Sun', '2Mon', '3Tue', '4Wed', '5Thu', '6Fri', '7Sat'])
    

    So, your entire code example should look like this:

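    import numpy as np
    import pandas as pd

    df = pd.read_csv(CsvFileName)
    
    p = df.pivot_table(index=['Hour'], columns='DOW', values='Changes', aggfunc=np.mean).round(0)
    
    # Reindex first so that missing days become NaN columns,
    # then fill every NaN with 0 before casting to int.
    columns = ["1Sun", "2Mon", "3Tue", "4Wed", "5Thu", "6Fri", "7Sat"]
    p = p.reindex(columns=columns)
    p.fillna(0, inplace=True)
    p[columns] = p[columns].astype(int)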
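
To make the reindex approach concrete, here is a minimal sketch on made-up data. The sample frame and its values are assumptions for illustration only, not the asker's CSV. It shows that selecting day labels absent from the pivot raises KeyError, while reindex creates them.

```python
import pandas as pd

# Hypothetical sample data: only three of the seven day labels occur.
df = pd.DataFrame({
    "Hour": [0, 0, 1, 1],
    "DOW": ["2Mon", "3Tue", "2Mon", "5Thu"],
    "Changes": [4.0, 6.0, 2.0, 8.0],
})
p = df.pivot_table(index="Hour", columns="DOW", values="Changes", aggfunc="mean").round(0)

columns = ["1Sun", "2Mon", "3Tue", "4Wed", "5Thu", "6Fri", "7Sat"]

# Selecting labels that are not in the pivot's columns raises KeyError.
try:
    p[columns]
    raised = False
except KeyError:
    raised = True

# reindex adds the missing labels as NaN columns; fillna(0) then covers both
# the new columns and the gaps in existing ones, so the int cast is safe.
fixed = p.reindex(columns=columns).fillna(0).astype(int)
print(fixed)
```

The order matters here: calling fillna before reindex would leave NaN in the newly added columns, and astype(int) would then fail on them.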
    
  • 2020-12-08 14:59

    I had a very similar issue. I got the same error because the CSV header contained trailing spaces. The header in my CSV was "Gender " (with a trailing space), but I was selecting it as:

    [['Gender']]
    

    If it's easy enough for you to edit the CSV, you can use Excel's TRIM() function to strip the extra spaces from the cells.

    Or strip the column names in pandas like this:

    df.columns = df.columns.to_series().apply(lambda x: x.strip())
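
A short sketch of this failure mode, using an in-memory CSV whose contents are invented for illustration. Printing the column list makes the stray whitespace visible, and `df.columns.str.strip()` is used here as an equivalent, slightly shorter alternative to the `to_series().apply(...)` form above.

```python
import io
import pandas as pd

# Hypothetical CSV whose header has a trailing space after "Gender".
csv_text = "Name,Gender \nAnn,F\nBob,M\n"
df = pd.read_csv(io.StringIO(csv_text))

# The list repr shows quotes around each name, so stray spaces stand out.
print(list(df.columns))

# Looking up the clean name fails because the real label is "Gender ".
try:
    df[["Gender"]]
    raised = False
except KeyError:
    raised = True

# Strip leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()
gender = df[["Gender"]]
```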

  • 2020-12-08 15:03

    I had the same issue.

    During initial development I used a .csv file (comma-separated) that I had modified slightly before saving. After saving, the commas had become semicolons.

    On Windows this depends on the "Regional and Language Options" customize screen, where you will find a "List separator" setting. That is the character Windows applications expect as the CSV separator.

    I ran into the issue when testing with a brand-new file.

    I removed the 'sep' argument from the read_csv call. Before:

    df1 = pd.read_csv('myfile.csv', sep=',')
    

    After:

    df1 = pd.read_csv('myfile.csv')
    

    That way, the issue disappeared.
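
One caveat worth checking: `','` is already the default value of `sep` in `read_csv`, so if a file really is semicolon-delimited, dropping the argument alone may not be enough. The sketch below, on an invented in-memory file, shows what a wrong separator looks like (everything lands in one column, so later column lookups raise KeyError) and two ways to handle it: passing `sep=';'` explicitly, or `sep=None` with `engine='python'` so the delimiter is sniffed.

```python
import io
import pandas as pd

# Hypothetical file content, as a semicolon-using locale might save it.
csv_text = "Hour;DOW;Changes\n0;2Mon;4\n1;3Tue;6\n"

# Default comma separator: each row is read as a single wide column.
wrong = pd.read_csv(io.StringIO(csv_text))

# Explicit separator.
explicit = pd.read_csv(io.StringIO(csv_text), sep=";")

# sep=None with the Python engine asks pandas to detect the delimiter.
sniffed = pd.read_csv(io.StringIO(csv_text), sep=None, engine="python")

print(list(wrong.columns))
print(list(explicit.columns))
```

Inspecting `df.columns` right after loading is a quick way to tell whether the separator was interpreted as expected.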

  • 2020-12-08 15:07

    Please try this to clean and normalize your column names:

    # regex=False treats '(' and ')' as literal characters
    df.columns = (df.columns.str.strip().str.upper()
                  .str.replace(' ', '_', regex=False)
                  .str.replace('(', '', regex=False)
                  .str.replace(')', '', regex=False))
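
As an illustration of what this cleanup produces, here is a small sketch on invented header names. It passes `regex=False` explicitly so the parentheses are matched literally regardless of the pandas version's default.

```python
import pandas as pd

# Hypothetical messy headers: padding, internal spaces, and parentheses.
df = pd.DataFrame(columns=[" first name ", "Age (years)", "Gender "])

df.columns = (df.columns.str.strip().str.upper()
              .str.replace(" ", "_", regex=False)
              .str.replace("(", "", regex=False)
              .str.replace(")", "", regex=False))

cleaned = list(df.columns)
print(cleaned)
```

Keep in mind that after upper-casing, every later lookup has to use the upper-case names, for example `df["GENDER"]` rather than `df["Gender"]`.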
    