How to delete multiple pandas (python) dataframes from memory to save RAM?

隐瞒了意图╮ 2020-11-27 12:22

I have a lot of dataframes created as part of preprocessing. Since I have only 6 GB of RAM, I want to delete all the unnecessary dataframes from RAM to avoid running out of memory.

3 answers
  • 2020-11-27 12:47

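    This deletes the dataframes and lets Python release their memory, provided nothing else still references them:

    import gc
    import pandas as pd

    del df_1, df_2          # drop both names
    gc.collect()            # collect any reference cycles that remain
    df_1 = pd.DataFrame()   # optional: rebind the names to empty dataframes
    df_2 = pd.DataFrame()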
    
  • 2020-11-27 12:50

    The del statement does not delete an instance; it only deletes a name.

    When you do del i, you delete only the name i. If the instance is still bound to some other name, it won't be garbage-collected.

    If you want to release memory, your dataframes have to be garbage-collected, i.e. you must delete all references to them.

    If you created your dataframes dynamically in a list, then deleting that list removes the only references to them, so they can be garbage-collected.

    >>> lst = [pd.DataFrame(), pd.DataFrame(), pd.DataFrame()]
    >>> del lst     # memory is released
    

    If you created some variables, you have to delete them all.

    >>> a, b, c = pd.DataFrame(), pd.DataFrame(), pd.DataFrame()
    >>> lst = [a, b, c]
    >>> del a, b, c # dfs still in list
    >>> del lst     # memory is released now
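
The point about names versus instances can be checked with a weak reference, which observes an object without keeping it alive. This is a minimal sketch, assuming CPython (where an object is freed as soon as its last strong reference disappears); the Frame class is a hypothetical stand-in for a large object such as a DataFrame, not part of pandas.

```python
import weakref


class Frame:
    """Hypothetical stand-in for a large object such as a pandas DataFrame."""

    def __init__(self, n):
        self.data = list(range(n))


def demo():
    a = Frame(1000)
    probe = weakref.ref(a)   # weak reference: does not keep the object alive
    lst = [a]                # second strong reference, held by the list

    del a                    # removes only the name; the list still holds the object
    alive_after_del_name = probe() is not None

    del lst                  # last strong reference gone; the object can be freed
    alive_after_del_list = probe() is not None
    return alive_after_del_name, alive_after_del_list


states = demo()
print(states)  # on CPython: (True, False)
```

The object survives del a because the list still refers to it, and it disappears only once the list is deleted as well.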
    
  • 2020-11-27 13:13

    In Python, automatic garbage collection deallocates objects that are no longer referenced (a pandas DataFrame is just another object as far as Python is concerned). The garbage collector's behaviour can be tuned, but doing so requires significant learning.

    You can trigger garbage collection manually with:

    import gc
    gc.collect()
    

    But frequent calls to garbage collection are discouraged, as it is a costly operation and may affect performance.
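
As a rough illustration of when a manual gc.collect() actually makes a difference: on CPython, objects with no remaining references are freed immediately by reference counting, so the collector mainly matters for objects that reference each other in a cycle. The sketch below assumes CPython; the Node class is hypothetical and exists only to build such a cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical object used only to build a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a      # a and b now reference each other
    return weakref.ref(a)        # weak probe; does not keep a alive


gc.disable()                         # pause automatic collection for a deterministic demo
probe = make_cycle()
alive_before = probe() is not None   # the cycle keeps both objects alive
gc.collect()                         # the cycle detector finds and frees the pair
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)     # on CPython: True False
```

If your dataframes are not part of a cycle, deleting the last reference is already enough, and an extra gc.collect() only adds cost.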

