Jupyter Lab freezes the computer when out of RAM - how to prevent it?

佛祖请我去吃肉 2021-02-04 08:11

I have recently started using Jupyter Lab and my problem is that I work with quite large datasets (usually the dataset itself is approximately 1/4 of my computer's RAM). After a few transformations…

7 Answers
  •  情深已故
    2021-02-04 08:37

    I think you should read the file in chunks, like this:

    import pandas as pd

    # read the CSV lazily, 1,000,000 rows at a time
    df_chunk = pd.read_csv(r'../input/data.csv', chunksize=1000000)
    chunk_list = []  # append each filtered chunk here

    # each chunk is a regular DataFrame
    for chunk in df_chunk:
        # perform data filtering (chunk_preprocessing is your own function)
        chunk_filter = chunk_preprocessing(chunk)

        # once the filtering is done, append the reduced chunk to the list
        chunk_list.append(chunk_filter)

    # concatenate the filtered chunks into a single DataFrame
    df_concat = pd.concat(chunk_list)
    

    For more information, see: https://towardsdatascience.com/why-and-how-to-use-pandas-with-large-data-9594dda2ea4c

    I suggest not appending the chunks to a list (otherwise RAM will probably fill up again). Instead, finish your work inside that for loop, as in the sketch below.
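
    Here is a minimal sketch of that idea, assuming a hypothetical numeric column 'value' and grouping column 'category' (adapt the path, columns and aggregation to your own data). Each chunk is reduced to a small aggregate inside the loop, so only the running totals stay in memory, never the chunks themselves:

    import pandas as pd

    totals = {}

    # read the file lazily and process one chunk at a time
    for chunk in pd.read_csv(r'../input/data.csv', chunksize=1000000):
        # keep only the rows you need, then shrink the chunk to a small result
        filtered = chunk[chunk['value'] > 0]
        counts = filtered.groupby('category')['value'].sum()

        # accumulate the small per-chunk result instead of the chunk itself,
        # so memory usage stays roughly constant regardless of file size
        for key, val in counts.items():
            totals[key] = totals.get(key, 0) + val

    # the final result is small enough to hold comfortably in RAM
    result = pd.Series(totals).sort_values(ascending=False)
    print(result.head())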
