Fastest way to calculate in Pandas?

耶瑟儿~ 2021-01-27 03:44

Given these two dataframes:

df1 =
     Name  Start  End
  0  A     10     20
  1  B     20     30
  2  C     30     40

df2 =
     0   1
  0  5   10
  1  15  20

3 Answers
  •  孤独总比滥情好
    2021-01-27 04:17

    This is one way to go about it:

    import numpy as np
    import pandas as pd

    #create numpy arrays from df1 and df2
    
    df1_start = df1.loc[:,'Start'].to_numpy()
    df1_end = df1.loc[:,'End'].to_numpy()
    
    df2_start = df2[0].to_numpy()
    df2_end = df2[1].to_numpy()
    
    #use np tile to create shapes
    #that allow element wise subtraction
    tiled_start = np.tile(df1_start,(len(df2),1)).T
    tiled_end = np.tile(df1_end,(len(df2),1)).T
    
    #subtract df2 from df1
    start = np.subtract(tiled_start,df2_start)
    end = np.subtract(tiled_end, df2_end)
    
    #create columns for start and end
    start_columns = [f'Start_Diff_{num}' for num in range(len(df2))]
    end_columns = [f'End_Diff_{num}' for num in range(len(df2))]
    
    #create dataframes of start and end
    start_df = pd.DataFrame(start,columns=start_columns)
    end_df = pd.DataFrame(end, columns = end_columns)
    
    #lump start and end into one dataframe
    lump = pd.concat([start_df,end_df],axis=1)
    
    #sort the columns by their numeric suffix so that each
    #Start/End pair stays together (sorted() is stable, so
    #Start_Diff_n comes before End_Diff_n)
    cols = sorted(lump.columns, key = lambda x: int(x.rsplit('_', 1)[-1]))
    
    lump = lump.reindex(cols,axis='columns')
    
    #hook lump back to df1
    final = pd.concat([df1,lump],axis=1)
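
The same result can probably be reached without np.tile by relying on NumPy broadcasting. The sketch below is a minimal, self-contained variant, not the only way to do it. It assumes the sample frames from the question, that df2's columns are the integers 0 and 1, and that the wanted output is df1 plus one Start/End difference pair per row of df2. The `pieces` dict is just an illustrative helper name.

```python
import numpy as np
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({"Name": ["A", "B", "C"],
                    "Start": [10, 20, 30],
                    "End": [20, 30, 40]})
df2 = pd.DataFrame({0: [5, 15], 1: [10, 20]})

# A (len(df1), 1) column minus a (len(df2),) row broadcasts
# to a (len(df1), len(df2)) matrix of differences.
start = df1["Start"].to_numpy()[:, None] - df2[0].to_numpy()
end = df1["End"].to_numpy()[:, None] - df2[1].to_numpy()

# Build the columns already interleaved per df2 row:
# Start_Diff_0, End_Diff_0, Start_Diff_1, End_Diff_1, ...
pieces = {}
for i in range(len(df2)):
    pieces[f"Start_Diff_{i}"] = start[:, i]
    pieces[f"End_Diff_{i}"] = end[:, i]

# Attach the difference columns to df1.
final = pd.concat([df1, pd.DataFrame(pieces, index=df1.index)], axis=1)
print(final)
```

Because the columns are created in their final order, no separate sorting step is needed, and the intermediate tiled arrays are never materialized.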
    
