What is the point of views in pandas if it is undefined whether an indexing operation returns a view or a copy?

误落风尘 2020-12-31 15:05

I have switched from R to pandas. I routinely get a SettingWithCopyWarning when I do something like

df_a = pd.DataFram         


        
2 Answers
  •  礼貌的吻别
    2020-12-31 15:47

    I agree this is a bit funny. My current practice is to look for a "functional" method for whatever I want to do (in my experience these almost always exist with the exception of renaming columns and series). Sometimes it makes the code more elegant, sometimes it makes it worse (I don't like assign with lambda), but at least I don't have to worry about mutability.

    So for indexing, instead of using the slice notation, you can use query, which will return a copy by default:

    In [5]: df_a.query('col1 > 1')
    Out[5]:
       col1
    1     2
    2     3
    3     4
    

    I expand on it a little in this blog post.

    Edit: As raised in the comments, it looks like I'm wrong about query returning a copy by default. However, if you use the assign style, assign makes a copy before returning your result, so you're all good:

    # assign returns a new DataFrame; the Series is aligned on the index of the filtered rows
    df_b = (df_a.query('col1 > 1')
                .assign(newcol=2 * df_a['col1']))
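
To check that this pattern really leaves the original frame alone, here is a minimal self-contained sketch. The definition of df_a is an assumption: the question's own definition is cut off, so a single-column frame consistent with the output shown earlier is used as a stand-in.

```python
import pandas as pd

# Hypothetical stand-in for df_a; the question's definition is truncated.
df_a = pd.DataFrame({"col1": [1, 2, 3, 4]})

# Filter with query, then add a column with assign.
# assign copies the frame before adding the column, so df_b is independent of df_a.
df_b = (df_a.query("col1 > 1")
            .assign(newcol=2 * df_a["col1"]))

# Writing into df_b afterwards should not touch df_a.
df_b.loc[df_b["col1"] == 2, "newcol"] = -1

print(df_a)
print(df_b)
```

Because df_b comes out of assign rather than directly out of an indexing operation, later writes to it go to its own data, and df_a keeps its single original column.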
    
