Preserving column order in the pandas to_csv method

走了就别回头了 2021-02-13 09:16

The pandas to_csv method does not preserve the order of columns; it arranges the columns alphabetically in the CSV. This is a bug and has been reported and is supposed

3 Answers
  •  陌清茗 (OP)
    2021-02-13 09:39

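    I think the problem is in the DataFrame constructor: to get a custom column order you need to pass the columns parameter. If you don't set columns, the columns are ordered alphanumerically (this is the behaviour of older pandas versions; from pandas 0.23 on Python 3.6+, a dict's insertion order is kept).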

    import pandas as pd

    # Build the frame from a plain dict, without specifying a column order
    df = pd.DataFrame({'V_pod_error' : [0,2],
                       'V_pod_used' : [6,4],
                       'U_sol_type' : [7,8]})
    print(df)
       U_sol_type  V_pod_error  V_pod_used
    0           7            0           6
    1           8            2           4
    
    print(df.to_csv())
    ,U_sol_type,V_pod_error,V_pod_used
    0,7,0,6
    1,8,2,4
    
    
    # Pass columns= to fix the column order explicitly
    df1 = pd.DataFrame({'V_pod_error' : [0,2],
                        'V_pod_used' : [6,4],
                        'U_sol_type' : [7,8]},
                       columns=['V_pod_error','V_pod_used','U_sol_type'])
    
    print(df1)
       V_pod_error  V_pod_used  U_sol_type
    0            0           6           7
    1            2           4           8
    
    print(df1.to_csv())
    ,V_pod_error,V_pod_used,U_sol_type
    0,0,6,7
    1,2,4,8
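
If the DataFrame already exists and you would rather not rebuild it, another option is the columns parameter of to_csv itself, which selects the columns to write. The sketch below reuses the example data from above; the names wanted, csv_text and header are only illustrative, and it assumes that to_csv writes the columns in the order they are listed.

```python
import pandas as pd

df = pd.DataFrame({'V_pod_error': [0, 2],
                   'V_pod_used': [6, 4],
                   'U_sol_type': [7, 8]})

# The desired output order, stated explicitly (illustrative name).
wanted = ['V_pod_error', 'V_pod_used', 'U_sol_type']

# With no path given, to_csv returns the CSV as a string.
csv_text = df.to_csv(columns=wanted)

# The first line is the header; the leading comma belongs to the index column.
header = csv_text.splitlines()[0]
print(header)
```

This leaves df itself unchanged and only affects what is written.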
    

    EDIT:

    Another solution is to set the column order by selecting the columns as a subset before calling to_csv (thanks Mathias711):

    import pandas as pd

    df = pd.DataFrame({'V_pod_error' : [0,2],
                       'V_pod_used' : [6,4],
                       'U_sol_type' : [7,8]})
    print(df)
       U_sol_type  V_pod_error  V_pod_used
    0           7            0           6
    1           8            2           4
    
    # Selecting with a list of names returns the columns in that order
    df = df[['V_pod_error','V_pod_used','U_sol_type']]
    print(df)
       V_pod_error  V_pod_used  U_sol_type
    0            0           6           7
    1            2           4           8
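
When only a few columns need to come first, listing every column by hand gets tedious. A minimal sketch of the same subset idea, assuming a hypothetical helper called columns_first (it is not part of pandas) that moves the named columns to the front and keeps the remaining ones in their existing order:

```python
import pandas as pd

def columns_first(df, first):
    """Return df with the columns in `first` leading; the rest keep their order."""
    rest = [c for c in df.columns if c not in first]
    # Plain list-of-names selection, as in the subset example above.
    return df[list(first) + rest]

df = pd.DataFrame({'U_sol_type': [7, 8],
                   'V_pod_error': [0, 2],
                   'V_pod_used': [6, 4]})

reordered = columns_first(df, ['V_pod_used'])
result_columns = list(reordered.columns)

# index=False drops the index column, so the header has no leading comma.
csv_header = reordered.to_csv(index=False).splitlines()[0]
print(csv_header)
```

The helper returns a new DataFrame, so the original df is left as it was.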
    

    EDIT1: It may help to first convert the dict to an OrderedDict and then create the DataFrame:

    import collections
    import pandas as pd
    
    d = {'V_pod_error' : [0,2],'V_pod_used' : [6,4], 'U_sol_type' : [7,8]}
    print(d)
    {'V_pod_error': [0, 2], 'V_pod_used': [6, 4], 'U_sol_type': [7, 8]}
    
    # A plain dict gives alphabetically sorted columns here
    print(pd.DataFrame(d))
       U_sol_type  V_pod_error  V_pod_used
    0           7            0           6
    1           8            2           4
    
    # An OrderedDict keeps its key order when passed to the constructor
    d1 = collections.OrderedDict(d)
    print(d1)
    OrderedDict([('V_pod_error', [0, 2]), ('V_pod_used', [6, 4]), ('U_sol_type', [7, 8])])
    
    print(pd.DataFrame(d1))
       V_pod_error  V_pod_used  U_sol_type
    0            0           6           7
    1            2           4           8
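
One caveat with the OrderedDict approach: OrderedDict(d) simply copies whatever iteration order d already has, and on older Python versions that order is not guaranteed for a plain dict. A more defensive sketch, assuming you can write the data as (key, value) pairs, builds the OrderedDict from a list so the order never passes through a plain dict (the names pairs and ordered_columns are illustrative):

```python
from collections import OrderedDict

import pandas as pd

# A list of (column name, values) pairs has a well-defined order.
pairs = [('V_pod_error', [0, 2]),
         ('V_pod_used', [6, 4]),
         ('U_sol_type', [7, 8])]

d1 = OrderedDict(pairs)
df = pd.DataFrame(d1)

# The columns follow the order of the pairs.
ordered_columns = list(df.columns)
print(ordered_columns)
```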
    
