join or merge with overwrite in pandas

一整个雨季 2020-12-02 11:22

I want to perform a join/merge/append operation on a dataframe with datetime index.

Let's say I have df1 and I want to add df2 to it.

2 Answers
  • 2020-12-02 11:58

    How about: df2.combine_first(df1)?

    In [33]: df2
    Out[33]: 
                       A         B         C         D
    2000-01-03  0.638998  1.277361  0.193649  0.345063
    2000-01-04 -0.816756 -1.711666 -1.155077 -0.678726
    2000-01-05  0.435507 -0.025162 -1.112890  0.324111
    2000-01-06 -0.210756 -1.027164  0.036664  0.884715
    2000-01-07 -0.821631 -0.700394 -0.706505  1.193341
    2000-01-10  1.015447 -0.909930  0.027548  0.258471
    2000-01-11 -0.497239 -0.979071 -0.461560  0.447598
    
    In [34]: df1
    Out[34]: 
                       A         B         C
    2000-01-03  2.288863  0.188175 -0.040928
    2000-01-04  0.159107 -0.666861 -0.551628
    2000-01-05 -0.356838 -0.231036 -1.211446
    2000-01-06 -0.866475  1.113018 -0.001483
    2000-01-07  0.303269  0.021034  0.471715
    2000-01-10  1.149815  0.686696 -1.230991
    2000-01-11 -1.296118 -0.172950 -0.603887
    2000-01-12 -1.034574 -0.523238  0.626968
    2000-01-13 -0.193280  1.857499 -0.046383
    2000-01-14 -1.043492 -0.820525  0.868685
    
    In [35]: df2.comb
    df2.combine        df2.combineAdd     df2.combine_first  df2.combineMult    
    
    In [35]: df2.combine_first(df1)
    Out[35]: 
                       A         B         C         D
    2000-01-03  0.638998  1.277361  0.193649  0.345063
    2000-01-04 -0.816756 -1.711666 -1.155077 -0.678726
    2000-01-05  0.435507 -0.025162 -1.112890  0.324111
    2000-01-06 -0.210756 -1.027164  0.036664  0.884715
    2000-01-07 -0.821631 -0.700394 -0.706505  1.193341
    2000-01-10  1.015447 -0.909930  0.027548  0.258471
    2000-01-11 -0.497239 -0.979071 -0.461560  0.447598
    2000-01-12 -1.034574 -0.523238  0.626968       NaN
    2000-01-13 -0.193280  1.857499 -0.046383       NaN
    2000-01-14 -1.043492 -0.820525  0.868685       NaN
    

    Note that it takes the values from df1 for indices that do not overlap with df2. If this doesn't do exactly what you want I would be willing to improve this function / add options to it.
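As a smaller, reproducible sketch of the same idea: the frames below are made-up illustration data on a datetime index, not the random values shown above. It assumes the goal is "values from df2 win where present, df1 fills the rest".

```python
import numpy as np
import pandas as pd

# Hypothetical data: df1 is the existing frame, df2 holds the newer values.
idx1 = pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-05"])
idx2 = pd.to_datetime(["2000-01-03", "2000-01-04"])

df1 = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]}, index=idx1)
df2 = pd.DataFrame({"A": [100.0, np.nan], "D": [7.0, 8.0]}, index=idx2)

# Values from df2 take priority; gaps (missing rows, missing columns,
# or NaN cells in df2) are filled from df1.
combined = df2.combine_first(df1)
print(combined)
```

The result covers the union of both indexes and both column sets. Note that a NaN cell in df2 does not overwrite a value in df1: the cell for 2000-01-04 in column A keeps df1's value.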

  • 2020-12-02 11:58

    For a merge like this, the update method of a DataFrame is useful.

    Taking the examples from the documentation:

    import pandas as pd
    import numpy as np
    
    df1 = pd.DataFrame([[np.nan, 3., 5.], [-4.6, 2.1, np.nan],
                       [np.nan, 7., np.nan]])
    df2 = pd.DataFrame([[-42.6, np.nan, -8.2], [-5., 1.6, 4]],
                       index=[1, 2])
    

    Data before the update:

    >>> df1
         0    1    2
    0  NaN  3.0  5.0
    1 -4.6  2.1  NaN
    2  NaN  7.0  NaN
    >>>
    >>> df2
          0    1    2
    1 -42.6  NaN -8.2
    2  -5.0  1.6  4.0
    

    Let's update df1 with data from df2:

    df1.update(df2)
    

    Data after the update:

    >>> df1
          0    1    2
    0   NaN  3.0  5.0
    1 -42.6  2.1 -8.2
    2  -5.0  1.6  4.0
    

    Remarks:

    • Note that update works in place: it modifies the DataFrame it is called on.
    • Also note that non-NaN values in df1 are not overwritten by NaN values in df2.
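If the original frame has to stay untouched, one option is to update a copy. The sketch below reuses the documentation frames from above; the names `updated`, `result` and `extra` are only for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[np.nan, 3.0, 5.0], [-4.6, 2.1, np.nan],
                    [np.nan, 7.0, np.nan]])
df2 = pd.DataFrame([[-42.6, np.nan, -8.2], [-5.0, 1.6, 4.0]],
                   index=[1, 2])

# update() modifies its caller and returns None, so work on a copy
# when df1 itself must be preserved.
updated = df1.copy()
result = updated.update(df2)

# update() aligns on the caller's index only: a row that exists
# solely in the other frame is not appended.
extra = pd.DataFrame([[9.0, 9.0, 9.0]], index=[3])
updated.update(extra)

print(updated)
```

Because rows found only in the other frame are ignored, update suits overwriting existing rows, while combine_first is the better fit when new index entries should be added as well.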