Fastest way to calculate in Pandas?

耶瑟儿~ 2021-01-27 03:44

Given these two dataframes:

df1 =
     Name  Start  End
  0  A     10     20
  1  B     20     30
  2  C     30     40

df2 =
     0   1
  0  5   10
  1  15  20
         


        
3 answers
  •  野趣味 (OP)
    2021-01-27 04:38

    Don't use iterrows(). If you're simply subtracting values, use vectorized operations instead. pandas arithmetic is already vectorized, but working on the underlying NumPy array is often faster still.

    For instance:

    import pandas as pd

    df2 = pd.DataFrame([[5, 10], [15, 20], [25, 30]])

    # Subtract 10 from every value in one vectorized step on the NumPy array
    col_names = "Start_Diff_1 End_Diff_1".split()
    df3 = pd.DataFrame(df2.to_numpy() - 10, columns=col_names)
    

    Here df3 equals:

       Start_Diff_1  End_Diff_1
    0            -5           0
    1             5          10
    2            15          20
    

    You can also rename the columns of an existing dataframe by assigning to its columns attribute:

    df2.columns = "Start_Diff_0 End_Diff_0".split()
    

    You can use f-strings to generate column names in a loop, e.g. f"Start_Diff_{i}", where i is the loop index.
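
A minimal sketch of that loop, assuming a made-up list of values to subtract (`offsets` is hypothetical and not part of the question); each pass builds one dataframe whose column names carry the loop index:

```python
import pandas as pd

df2 = pd.DataFrame([[5, 10], [15, 20], [25, 30]])

# Hypothetical values to subtract, one per generated pair of columns.
offsets = [10, 20]

diffs = []
for i, offset in enumerate(offsets):
    # Vectorized subtraction on the NumPy array; names are built with f-strings.
    diff = pd.DataFrame(
        df2.to_numpy() - offset,
        columns=[f"Start_Diff_{i}", f"End_Diff_{i}"],
    )
    diffs.append(diff)

print(diffs[1].columns.tolist())  # ['Start_Diff_1', 'End_Diff_1']
```

Only the outer loop over `offsets` is a Python loop here; the subtraction inside it still runs over the whole array at once.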

    You can also combine multiple dataframes with:

    df = pd.concat([df1, df2], axis=1)
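
As a rough illustration, the sketch below joins the question's df1 with a three-row difference frame like the one computed earlier (its values are copied in by hand so the snippet is self-contained). With axis=1, pd.concat lines rows up by index label, so both frames should share the same index:

```python
import pandas as pd

# df1 as given in the question.
df1 = pd.DataFrame(
    {"Name": ["A", "B", "C"], "Start": [10, 20, 30], "End": [20, 30, 40]}
)

# Difference frame with the same default index (0, 1, 2) as df1.
df3 = pd.DataFrame(
    [[-5, 0], [5, 10], [15, 20]],
    columns=["Start_Diff_1", "End_Diff_1"],
)

# Place the columns of df3 to the right of df1, matching rows by index label.
combined = pd.concat([df1, df3], axis=1)
print(combined.columns.tolist())
# ['Name', 'Start', 'End', 'Start_Diff_1', 'End_Diff_1']
```

If the indexes differ, rows that appear in only one frame are kept and the missing cells are filled with NaN.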
    
