Understanding inplace=True

遥遥无期 2020-11-22 03:25

Many pandas methods offer an option to modify the object in place, as in the following statement...

df.dropna(axis=\'index\         


        
11 Answers
  • 2020-11-22 03:26

    If you don't pass inplace=True (or you pass inplace=False), you get back a copy.

    So for instance:

    testdf.sort_values(inplace=True, by='volume', ascending=False)
    

    will modify testdf itself, leaving its rows sorted by volume in descending order.

    Then:

    testdf2 = testdf.sort_values( by='volume', ascending=True)
    

    will make testdf2 a copy. The values will all be the same, but the sort order will be reversed and you will have an independent object.

    Then, given another column, say LongMA, if you do:

    testdf2.LongMA = testdf2.LongMA -1
    

    the LongMA column in testdf will keep its original values and testdf2 will have the decremented values.

    It is important to keep track of the difference as the chain of calculations grows and the copies of dataframes have their own lifecycle.
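    The behaviour described above can be sketched as follows. The data values are made up for illustration; only the names testdf, testdf2, volume and LongMA come from the answer.

```python
import pandas as pd

# Hypothetical data; only the column names 'volume' and 'LongMA' come from the answer.
testdf = pd.DataFrame({"volume": [10, 30, 20], "LongMA": [1.0, 2.0, 3.0]})

# In-place sort: testdf itself is reordered and the call returns None.
result = testdf.sort_values(by="volume", ascending=False, inplace=True)

# Default (inplace=False): a new, independent DataFrame is returned.
testdf2 = testdf.sort_values(by="volume", ascending=True)

# Changing the copy leaves the original untouched.
testdf2["LongMA"] = testdf2["LongMA"] - 1

# Collect the values (aligned by index label) so the two frames can be compared.
volumes_original = testdf["volume"].tolist()
volumes_copy = testdf2["volume"].tolist()
longma_original = testdf["LongMA"].sort_index().tolist()
longma_copy = testdf2["LongMA"].sort_index().tolist()
```

    Here result is None, testdf stays sorted in descending order, and only testdf2 carries the decremented LongMA values.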

  • 2020-11-22 03:27

    inplace=True makes the function impure: it changes the original DataFrame and returns None, which breaks method chaining (the "DSL chain"). Because most DataFrame methods return a new DataFrame, you can chain them conveniently, like

    df.sort_values().rename().to_csv()
    

    A call with inplace=True returns None, so the chain is broken. For example

    df.sort_values(inplace=True).rename().to_csv()
    

    will raise AttributeError: 'NoneType' object has no attribute 'rename'.

    This is similar to Python's built-in sort and sorted: lst.sort() returns None, and sorted(lst) returns a new list.

    Generally, do not use inplace=True unless you have a specific reason to. When you find yourself writing reassignment code like df = df.sort_values(), try attaching the call to the chain that creates df, e.g.

    df = pd.read_csv().sort_values()...
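    A minimal sketch of the chaining point, using a made-up one-column DataFrame (the column name b and the new name value are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"b": [3, 1, 2]})  # hypothetical data

# Without inplace, each call returns a DataFrame, so calls can be chained.
chained = df.sort_values(by="b").rename(columns={"b": "value"})

# With inplace=True the first call returns None, so the next call fails.
try:
    df.sort_values(by="b", inplace=True).rename(columns={"b": "value"})
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# The same pattern with built-in lists.
lst = [3, 1, 2]
new_list = sorted(lst)  # returns a new sorted list, lst is untouched here
returned = lst.sort()   # sorts lst itself and returns None
```

    Note that the failing line still sorts df in place before the AttributeError is raised, which is one more reason to keep inplace=True out of chains.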
    
  • 2020-11-22 03:30

    Yes, many pandas functions have an inplace parameter, and by default it is set to False.

    So, when you call df.dropna(axis='index', how='all', inplace=False), pandas assumes that you do not want to change the original DataFrame, so it returns a new copy with the requested changes.

    But when you set the inplace parameter to True, you are explicitly saying that you do not want a new copy of the DataFrame and that the changes should be made to the given DataFrame instead.

    pandas then updates the existing DataFrame and returns None instead of a new DataFrame.

    You can also avoid the inplace parameter altogether by reassigning the result to the original variable:

    df = df.dropna(axis='index', how='all')
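    As an illustration, the two forms can be compared on a small made-up DataFrame (the columns a and b and their values are hypothetical):

```python
import pandas as pd

nan = float("nan")
# Hypothetical data: the middle row is entirely missing.
df = pd.DataFrame({"a": [1.0, nan, 3.0], "b": [4.0, nan, nan]})

# Default: a new DataFrame is returned and df keeps all of its rows.
cleaned = df.dropna(axis="index", how="all")
rows_before = len(df)

# inplace=True: df itself loses the all-NaN row and the call returns None.
returned = df.dropna(axis="index", how="all", inplace=True)
rows_after = len(df)
```

    After this runs, cleaned and df hold the same two rows; the only difference is which object was changed and what the call returned.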

  • 2020-11-22 03:34

    When inplace=True is passed, the data is modified in place (the call returns nothing), so you'd use:

    df.an_operation(inplace=True)
    

    When inplace=False is passed (this is the default, so it isn't necessary to pass it), the operation is performed and a copy of the object is returned, so you'd use:

    df = df.an_operation(inplace=False) 
    
  • 2020-11-22 03:34

    Save it to the same variable

    data["column01"].where(data["column01"]< 5, inplace=True)

    Save it to a separate variable

    data["column02"] = data["column01"].where(data["column01"] < 5)

    But you can always overwrite the original column by reassigning it

    data["column01"] = data["column01"].where(data["column01"] < 5)

    FYI: the default is inplace=False.
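    A small sketch of where with and without inplace, on made-up values. It uses the reassignment form for the DataFrame column and shows inplace=True only on a standalone Series, since under pandas' Copy-on-Write mode an inplace call on a selected column such as data["column01"] is not expected to update data.

```python
import pandas as pd

data = pd.DataFrame({"column01": [1.0, 7.0, 3.0, 9.0]})  # hypothetical values

# Keep values below 5; the other entries become NaN. Stored in a new column,
# so column01 is left as it was.
data["column02"] = data["column01"].where(data["column01"] < 5)

# On a standalone Series, inplace=True modifies it and returns None.
s = pd.Series([1.0, 7.0, 3.0, 9.0])
returned = s.where(s < 5, inplace=True)
```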

  • 2020-11-22 03:34

    Whether to use inplace=True depends on whether you want to change the original df or not.

    df.drop_duplicates()
    

    will return a new DataFrame with the duplicate rows removed, without making any changes to df

    df.drop_duplicates(inplace  = True)
    

    will drop the duplicate rows from df itself and return None.
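    For example, with a made-up single-column DataFrame (the column name x is hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 1, 2]})  # hypothetical data with one duplicate row

# Default: the result is a new DataFrame and df still has all three rows.
deduped = df.drop_duplicates()
len_after_default = len(df)

# inplace=True: df itself is changed and the call returns None.
returned = df.drop_duplicates(inplace=True)
len_after_inplace = len(df)
```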

    Hope this helps.:)
