Merge the first row with the column headers in a dataframe

野性不改 2021-01-13 01:56

I am trying to clean up an Excel file for some further research. The problem is that I want to merge the first and second rows. The code I have now:

xl         


        
3 Answers
  • 2021-01-13 02:38

    I think you need numpy.concatenate, on the same principle as cᴏʟᴅsᴘᴇᴇᴅ's answer:

    import numpy as np

    # first two names come from row 0, the rest from the existing header
    df.columns = np.concatenate([df.iloc[0, :2], df.columns[2:]])
    df = df.iloc[1:].reset_index(drop=True)
    print(df)
      Sample type Concentration     A     B    C          D          E          F  \
    0       Water          9200  95.5  21.0  6.0  11.942308  64.134615  21.498560   
    1       Water          9200  94.5  17.0  5.0   5.484615  63.205769  19.658560   
    2       Water          9200  92.0  16.0  3.0  11.057692  62.586538  19.813120   
    3       Water          4600  53.0   7.5  2.5   3.538462  35.163462   6.876207   
    
              G         H  
    0  5.567840  1.174135  
    1  4.968000  1.883444  
    2  5.192480  0.564835  
    3  1.641724  0.144654  
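
    For anyone without the spreadsheet at hand, here is a minimal self-contained sketch of the same idea. The small frame is a made-up stand-in for the Excel data (the names "Nanonose" and "Unnamed: 1" and the values are assumptions for illustration only), with the real names of the first two columns sitting in the first data row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet: row 0 carries the real names
# of the first two columns, the other columns are already named.
df = pd.DataFrame(
    [["Sample type", "Concentration", np.nan, np.nan],
     ["Water", 9200, 95.5, 21.0],
     ["Water", 4600, 53.0, 7.5]],
    columns=["Nanonose", "Unnamed: 1", "A", "B"],
)

# Take the first two names from row 0 and keep the remaining header names.
df.columns = np.concatenate([df.iloc[0, :2], df.columns[2:]])

# Drop the row that was promoted to the header and renumber from 0.
df = df.iloc[1:].reset_index(drop=True)

print(df.columns.tolist())
```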
    
  • 2021-01-13 02:40

    Fetch all the column names present in the second header row, then those in the first header row, and combine them into a single list of column names. Then read the Excel file into a df with header=[0, 1] and replace its headers with the combined list you built.

    import pandas as pd

    # Read with the second row as header and keep its real names
    df1 = pd.read_excel('nanonose.xls', header=[1])
    cols1 = df1.columns.tolist()
    SecondRowColumns = []
    for c in cols1:
        # skip the placeholder names pandas gives to empty header cells
        if "Unnamed" not in str(c) and "NaN" not in str(c):
            SecondRowColumns.append(c)

    # Read with the first row as header and keep its real names
    df2 = pd.read_excel('nanonose.xls', header=[0])
    cols2 = df2.columns.tolist()
    FirstRowColumns = []
    for c in cols2:
        if "Unnamed" not in str(c) and "Nanonose" not in str(c):
            FirstRowColumns.append(c)

    # Combined header: second-row names first, then first-row names
    AllColumn = SecondRowColumns + FirstRowColumns

    # Read both header rows, then replace the MultiIndex with the flat list
    df = pd.read_excel('nanonose.xls', header=[0, 1])
    df.columns = AllColumn
    print(df)
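
    When filtering header names this way, note that a condition of the form `"Unnamed" or "Nanonose" not in c` is parsed as `"Unnamed" or ("Nanonose" not in c)`. A non-empty string is always truthy, so such a test keeps every name. The sketch below contrasts the two forms on a made-up list of header names (the list is an assumption, not taken from the actual file).

```python
cols = ["Nanonose", "Unnamed: 1", "A", "B"]  # made-up header names

# Parsed as `"Unnamed" or ("Nanonose" not in c)`: the left operand is a
# non-empty string, so the condition is always truthy and nothing is dropped.
kept_wrong = [c for c in cols if ("Unnamed" or "Nanonose" not in c)]

# Each substring is tested separately, so placeholder names are dropped.
kept_right = [c for c in cols if "Unnamed" not in c and "Nanonose" not in c]

print(kept_wrong)
print(kept_right)
```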
    
  • 2021-01-13 02:43

    Just reassign df.columns.

    df.columns = np.append(df.iloc[0, :2], df.columns[2:])
    

    Or,

    df.columns = df.iloc[0, :2].tolist() + (df.columns[2:]).tolist()
    

    Next, skip the first row.

    df = df.iloc[1:].reset_index(drop=True) 
    df
      Sample type Concentration     A     B    C          D          E          F  \
    0       Water          9200  95.5  21.0  6.0  11.942308  64.134615  21.498560   
    1       Water          9200  94.5  17.0  5.0   5.484615  63.205769  19.658560   
    2       Water          9200  92.0  16.0  3.0  11.057692  62.586538  19.813120   
    3       Water          4600  53.0   7.5  2.5   3.538462  35.163462   6.876207   
    
              G         H  
    0  5.567840  1.174135  
    1  4.968000  1.883444  
    2  5.192480  0.564835  
    3  1.641724  0.144654 
    

    reset_index is optional; use it if you want a 0-based index in your final output.
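
    To see what reset_index changes, here is a small sketch on made-up data (the frame and its values are assumptions for illustration). Slicing with iloc keeps the original row labels, and reset_index(drop=True) renumbers them from 0 without adding the old index as a column.

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30]})  # made-up data

# Dropping the first row keeps the original labels of the remaining rows.
sliced = df.iloc[1:]
print(sliced.index.tolist())

# drop=True discards the old labels instead of storing them in a column.
renumbered = sliced.reset_index(drop=True)
print(renumbered.index.tolist())
```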
