Select from pandas dataframe using boolean series/array

离开以前 2021-02-03 22:59

I have a dataframe:

             High    Low  Close
Date                           
2009-02-11  30.20  29.41  29.87
2009-02-12  30.28  29.32  30.24
2009-02-13  3         


        
1 answer
  •  小鲜肉 (OP)
    2021-02-03 23:34

    For indexing with a boolean Series to work, its index has to be comparable to the DataFrame's index. In this case it won't work because the Series has an integer index, while the DataFrame is indexed by dates.
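
    To make the index requirement concrete, here is a minimal sketch. It assumes the five rows and the mask used in the example further down; the names mismatch_failed, aligned and filtered are illustrative only. The exact exception class raised on a mismatch depends on the pandas version, so the sketch catches Exception broadly.

```python
import pandas as pd

# Sample data taken from the example in this answer.
df = pd.DataFrame(
    [
        [30.20, 29.41, 29.87],
        [30.28, 29.32, 30.24],
        [30.45, 29.96, 30.10],
        [29.35, 28.74, 28.90],
        [29.35, 28.56, 28.92],
    ],
    columns=['High', 'Low', 'Close'],
    index=['2009-02-11', '2009-02-12', '2009-02-13', '2009-02-17', '2009-02-18'],
)

# Boolean Series with the default integer index 0..4.
s = pd.Series([True, False, False, True, False], name='bools')

# The Series labels (0..4) do not match the date labels of df,
# so pandas cannot align the mask with the rows.
try:
    df[s]
    mismatch_failed = False
except Exception:  # exact exception class varies between pandas versions
    mismatch_failed = True

# Re-label the mask with the DataFrame's own index; now the Series
# itself can be used as the indexer.
aligned = pd.Series(s.to_numpy(), index=df.index, name=s.name)
filtered = df[aligned]
```

    Once the labels agree, df[aligned] selects the same rows as indexing with the raw bool array.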

    However, as you say, you can filter using a bool array. You can access the underlying array of a Series via .values, and it can then be applied as a filter as follows:

    df # pandas.DataFrame
    s  # pandas.Series 
    
    df[s.values] # df, filtered by the bool array in s
    

    For example, with your data:

    import pandas as pd
    
    df = pd.DataFrame(
        [
            [30.20, 29.41, 29.87],
            [30.28, 29.32, 30.24],
            [30.45, 29.96, 30.10],
            [29.35, 28.74, 28.90],
            [29.35, 28.56, 28.92],
        ],
        columns=['High', 'Low', 'Close'],
        index=['2009-02-11', '2009-02-12', '2009-02-13', '2009-02-17', '2009-02-18'],
    )
    
    s = pd.Series([True, False, False, True, False], name='bools')
    
    df[s.values]
    

    Returns the following:

                High    Low     Close
    2009-02-11  30.20   29.41   29.87
    2009-02-17  29.35   28.74   28.90
    

    If you just want the High column, you can filter this as normal (before, or after the bool filter):

    df['High'][s.values]
    # Or: df[s.values]['High']
    

    To get your target output (as a Series):

     2009-02-11    30.20
     2009-02-17    29.35
     Name: High, dtype: float64
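
    As a possible alternative to chaining two [] lookups, the row filter and the column selection can be combined in a single .loc call. This is only a sketch using the same sample data; mask and high are illustrative names, and .to_numpy() is used here in place of .values to obtain the bool array.

```python
import pandas as pd

# Same sample data as in the example above.
df = pd.DataFrame(
    [
        [30.20, 29.41, 29.87],
        [30.28, 29.32, 30.24],
        [30.45, 29.96, 30.10],
        [29.35, 28.74, 28.90],
        [29.35, 28.56, 28.92],
    ],
    columns=['High', 'Low', 'Close'],
    index=['2009-02-11', '2009-02-12', '2009-02-13', '2009-02-17', '2009-02-18'],
)
s = pd.Series([True, False, False, True, False], name='bools')

# Plain NumPy bool array: it carries no index, so it is applied by position.
mask = s.to_numpy()

# Rows where mask is True, 'High' column only, selected in one step.
high = df.loc[mask, 'High']
```

    Because the mask is a plain array, it is applied purely by position, so it must have the same length as df.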
    
