How are iloc and loc different?

孤城傲影 2020-11-21 06:36

Can someone explain how these two methods of slicing are different?
I've seen the docs, and I've seen these answers, but I still find myself unable to explain how th

3 Answers
  •  迷失自我
    2020-11-21 06:50

    iloc works based on integer positioning. So no matter what your row labels are, you can always, e.g., get the first row by doing

    df.iloc[0]
    

    or the last five rows by doing

    df.iloc[-5:]
    

    You can also use it on the columns. This retrieves the 3rd column:

    df.iloc[:, 2]    # the : in the first position indicates all rows
    

    You can combine them to get intersections of rows and columns:

    df.iloc[:3, :3] # The upper-left 3 X 3 entries (assuming df has 3+ rows and columns)
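
The positional selections above can be tried on a small frame. This is only a sketch: the 4 x 3 frame and its values are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical frame with string row labels; iloc ignores the labels entirely.
df = pd.DataFrame(
    {"time": [10, 20, 30, 40], "date": [1, 2, 3, 4], "name": ["w", "x", "y", "z"]},
    index=["a", "b", "c", "d"],
)

first_row = df.iloc[0]      # the row at position 0 (labelled 'a'), as a Series
last_two = df.iloc[-2:]     # the rows at the last two positions ('c' and 'd')
third_col = df.iloc[:, 2]   # every row of the column at position 2 ('name')
corner = df.iloc[:3, :2]    # first three rows, first two columns
```

Note that iloc slices follow ordinary Python slicing, so the stop position is excluded.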
    

    On the other hand, .loc uses named indices (labels). Let's set up a data frame with strings as row and column labels:

    df = pd.DataFrame(index=['a', 'b', 'c'], columns=['time', 'date', 'name'])
    

    Then we can get the first row by

    df.loc['a']     # equivalent to df.iloc[0]
    

    and the last two rows of the 'date' column by

    df.loc['b':, 'date']   # equivalent to df.iloc[1:, 1]
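
To check the equivalences claimed in the comments, here is a minimal sketch. The cell values are invented, since the frame above is created empty.

```python
import pandas as pd

# Same labels as above, with made-up values so the results are visible.
df = pd.DataFrame(
    {"time": [10, 20, 30], "date": [1, 2, 3], "name": ["x", "y", "z"]},
    index=["a", "b", "c"],
)

by_label = df.loc["b":, "date"]     # rows from label 'b' onward, column 'date'
by_position = df.iloc[1:, 1]        # rows from position 1 onward, column 1
same = by_label.equals(by_position)

# Unlike positional slices, a label slice includes its stop label.
a_to_b = df.loc["a":"b"]            # contains both row 'a' and row 'b'
```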
    

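    and so on. Now, it's probably worth pointing out that the default row and column indices for a DataFrame are integers starting from 0, so in that case labels and positions coincide and iloc and loc select from the same rows. This is why your three examples appear equivalent. One difference remains even then: a loc slice includes its end label, so df.loc[:5] returns the rows labelled 0 through 5, while df.iloc[:5] returns positions 0 through 4. If you had a non-numeric index such as strings or datetimes, df.loc[:5] would raise an error.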
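
A small sketch of how the two accessors behave on a default integer index and on a string index. The frames are hypothetical, and the exception type is caught generically because it may vary between pandas versions (it is a TypeError in the versions I am aware of).

```python
import pandas as pd

df = pd.DataFrame({"x": range(10)})   # default integer index 0..9

n_loc = len(df.loc[:5])     # label slice: labels 0 through 5, stop included
n_iloc = len(df.iloc[:5])   # position slice: positions 0 through 4

# With string labels, an integer label slice has nothing to match against.
labelled = pd.DataFrame({"x": range(3)}, index=["a", "b", "c"])
error_type = None
try:
    labelled.loc[:5]
except Exception as exc:    # assumed to be TypeError; caught broadly to be safe
    error_type = type(exc).__name__

n_labelled_iloc = len(labelled.iloc[:5])  # iloc still works; slices past the end are truncated
```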

    Also, you can do column retrieval just by using the data frame's __getitem__:

    df['time']    # equivalent to df.loc[:, 'time']
    

    Now suppose you want to mix positional and label-based indexing, that is, select rows by position and columns by name (to clarify, I mean selecting from our data frame, rather than creating a data frame with strings in the row index and integers in the column index). This is what .ix used to be for, but .ix was deprecated in pandas 0.20 and removed in 1.0. Instead, translate the positions into labels and use .loc:

    df.loc[df.index[:2], 'time']    # the first two rows of the 'time' column
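
For pandas 1.0 and later, where .ix no longer exists, the mixed selection can be sketched with loc or iloc alone. The frame values below are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"time": [10, 20, 30], "date": [1, 2, 3], "name": ["x", "y", "z"]},
    index=["a", "b", "c"],
)

# Option 1: turn row positions into labels, then index by label.
via_loc = df.loc[df.index[:2], "time"]

# Option 2: turn the column label into a position, then index by position.
# get_loc returns a plain integer here because the column names are unique.
via_iloc = df.iloc[:2, df.columns.get_loc("time")]
```

Both forms select the first two rows of the 'time' column; which one reads better depends on whether the rest of the code thinks in labels or in positions.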
    

    I think it's also worth mentioning that you can pass boolean vectors to the loc method as well. For example:

    b = [True, False, True]
    df.loc[b]
    

    This returns the 1st and 3rd rows of df. It is equivalent to df[b] for selection, but loc can also assign through a boolean vector:

    df.loc[b, 'name'] = 'Mary', 'John'
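
A runnable sketch of boolean selection and assignment, using invented values and a list on the right-hand side (one value per selected row):

```python
import pandas as pd

df = pd.DataFrame(
    {"time": [10, 20, 30], "name": ["x", "y", "z"]},
    index=["a", "b", "c"],
)

b = [True, False, True]       # one flag per row, in row order
selected = df.loc[b]          # keeps the rows where the flag is True

# Assign new names only to the flagged rows; the middle row is untouched.
df.loc[b, "name"] = ["Mary", "John"]
```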
    
