How to get the common index of two pandas dataframes?

野趣味 2021-02-19 03:13

I have two pandas DataFrames, df1 and df2, and I want to filter them so that each keeps only the rows whose index labels are common to both DataFrames.

df1

8 answers
  • 2021-02-19 03:44

    I found that combining pd.Index with Python sets is much faster than both numpy.intersect1d and df1.index.intersection(df2.index). Here is what I used:

    df2 = df2.loc[pd.Index(set(df1.index)&set(df2.index))]
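One thing to keep in mind with the set approach: a Python set has no defined order, so the rows can come back in arbitrary order. A minimal sketch of one way around that, assuming made-up sample data (the rows for 27/11/2000 and 01/12/2000 and their values are hypothetical, as are the names common_set, df1_common and df2_common): use the set only for membership tests and take the label order from df1.

```python
import pandas as pd

# Hypothetical sample data: the first row of df1 and the last row of df2
# are invented so that the two indexes only partly overlap.
df1 = pd.DataFrame(
    {"values1": [0.1, -0.055276, 0.027427, 0.066009]},
    index=["27/11/2000", "28/11/2000", "29/11/2000", "30/11/2000"],
)
df2 = pd.DataFrame(
    {"values2": [-0.026316, 0.015222, -0.024480, 0.2]},
    index=["28/11/2000", "29/11/2000", "30/11/2000", "01/12/2000"],
)

# Set intersection gives fast membership tests but no ordering.
common_set = set(df1.index) & set(df2.index)

# Rebuild an ordered Index by walking df1's labels in their original order.
common = pd.Index([label for label in df1.index if label in common_set])

df1_common = df1.loc[common]
df2_common = df2.loc[common]
```

Both results then share the same labels in the same order, so they line up row by row.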

  • 2021-02-19 03:47

    Have you tried something like this?

    df1 = df1.loc[[x for x in df1.index if x in df2.index]]
    df2 = df2.loc[[x for x in df2.index if x in df1.index]]
    
  • 2021-02-19 03:54
    In [352]: common = df1.index.intersection(df2.index)
    
    In [353]: df1.loc[common]
    Out[353]:
                 values1
    0
    28/11/2000 -0.055276
    29/11/2000  0.027427
    30/11/2000  0.066009
    
    In [354]: df2.loc[common]
    Out[354]:
                 values2
    0
    28/11/2000 -0.026316
    29/11/2000  0.015222
    30/11/2000 -0.024480
    
  • 2021-02-19 03:56

    You can also use isin, although intersection might be faster.

    In [286]: df1.loc[df1.index.isin(df2.index)]
    Out[286]:
                 values1
    0
    28/11/2000 -0.055276
    29/11/2000  0.027427
    30/11/2000  0.066009
    
    In [287]: df2.loc[df2.index.isin(df1.index)]
    Out[287]:
                 values2
    0
    28/11/2000 -0.026316
    29/11/2000  0.015222
    30/11/2000 -0.024480
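A note on the isin approach: each DataFrame is filtered with its own boolean mask, so each keeps its own row order. If the two indexes are ordered differently, the results contain the same labels but are not aligned row by row. A small sketch with hypothetical single-letter labels (the data and the names mask1, mask2, df1_common and df2_common are made up for illustration):

```python
import pandas as pd

# Hypothetical frames whose shared labels ("a", "b") appear in different orders.
df1 = pd.DataFrame({"values1": [1.0, 2.0, 3.0]}, index=["b", "c", "a"])
df2 = pd.DataFrame({"values2": [10.0, 20.0, 30.0]}, index=["a", "b", "d"])

# Index.isin returns a boolean array with one entry per label.
mask1 = df1.index.isin(df2.index)
mask2 = df2.index.isin(df1.index)

# Boolean indexing keeps the rows in each frame's original order.
df1_common = df1.loc[mask1]
df2_common = df2.loc[mask2]

# Optional: sort both results so they line up row by row.
df1_sorted = df1_common.sort_index()
df2_sorted = df2_common.sort_index()
```

If row-by-row alignment matters, sorting both results (or selecting with one shared index object) takes care of it.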
    
  • 2021-02-19 03:58

    You can use Index.intersection + DataFrame.loc:

    idx = df1.index.intersection(df2.index)
    print(idx)
    Index(['28/11/2000', '29/11/2000', '30/11/2000'], dtype='object')
    

    Alternative solution with numpy.intersect1d:

    idx = np.intersect1d(df1.index, df2.index)
    print(idx)
    ['28/11/2000' '29/11/2000' '30/11/2000']
    

    Either way, select the rows with the common index from both DataFrames:

    df1 = df1.loc[idx]
    print(df1)
                 values1
    28/11/2000 -0.055276
    29/11/2000  0.027427
    30/11/2000  0.066009
    
    df2 = df2.loc[idx]
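Worth knowing about the numpy variant: numpy.intersect1d returns the sorted, unique common values as a NumPy array rather than a pandas Index, so the selected rows come back in sorted label order regardless of how the DataFrames were ordered. A short sketch with hypothetical data (the labels, values, and the names df1_common and df2_common are made up):

```python
import numpy as np
import pandas as pd

# Hypothetical frames; df1's index is deliberately unsorted.
df1 = pd.DataFrame({"values1": [1.0, 2.0, 3.0]}, index=["b", "c", "a"])
df2 = pd.DataFrame({"values2": [10.0, 20.0, 30.0]}, index=["a", "b", "d"])

# intersect1d returns the sorted, de-duplicated labels present in both indexes.
idx = np.intersect1d(df1.index, df2.index)

# Selecting with the same array puts both results in the same (sorted) order.
df1_common = df1.loc[idx]
df2_common = df2.loc[idx]
```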
    
  • 2021-02-19 03:58

    You can also combine reindex with dropna:

    df1.reindex(df2.index).dropna()
    Out[21]: 
                 values1
    28/11/2000 -0.055276
    29/11/2000  0.027427
    30/11/2000  0.066009
    
    
    df2.reindex(df1.index).dropna()
    Out[22]: 
                 values2
    28/11/2000 -0.026316
    29/11/2000  0.015222
    30/11/2000 -0.024480
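A caveat with reindex followed by dropna: dropna cannot tell a label that was missing from a value that was already NaN in the original data, so it drops both. A minimal sketch, assuming hypothetical data in which df1 contains a genuine NaN (the names via_reindex and via_intersection are made up):

```python
import numpy as np
import pandas as pd

# Hypothetical frames; label "b" is common to both, but its value in df1 is NaN.
df1 = pd.DataFrame({"values1": [1.0, np.nan, 3.0]}, index=["a", "b", "c"])
df2 = pd.DataFrame({"values2": [10.0, 20.0, 30.0]}, index=["a", "b", "d"])

# reindex introduces NaN for "d" (absent from df1); dropna then removes
# every row containing NaN, including the pre-existing one at "b".
via_reindex = df1.reindex(df2.index).dropna()

# Selecting by the index intersection keeps "b" even though its value is NaN.
via_intersection = df1.loc[df1.index.intersection(df2.index)]
```

So this approach is only equivalent to the intersection-based ones when the DataFrames contain no missing values to begin with.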
    