I have recently started using Jupyter Lab, and my problem is that I work with quite large datasets (usually the dataset itself is approx. 1/4 of my computer's RAM). After a few trans
I think you should read the data in chunks, like this:
import pandas as pd

# read the CSV lazily in chunks of 1,000,000 rows instead of loading it all at once
df_chunk = pd.read_csv(r'../input/data.csv', chunksize=1000000)

chunk_list = []  # append each filtered chunk here

# each chunk is a regular DataFrame
for chunk in df_chunk:
    # perform data filtering (chunk_preprocessing is your own filter function)
    chunk_filter = chunk_preprocessing(chunk)

    # once the filtering is done, append the filtered chunk to the list
    chunk_list.append(chunk_filter)

# concatenate the filtered chunks into a single DataFrame
df_concat = pd.concat(chunk_list)
For more information, check out this article: https://towardsdatascience.com/why-and-how-to-use-pandas-with-large-data-9594dda2ea4c
I also suggest not collecting the filtered chunks in a list if memory is still tight (the RAM will probably fill up again). Instead, finish your work inside that for loop and keep only small results, as in the sketch below.
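For example, here is a minimal sketch of that idea, assuming your per-chunk work can be reduced to something small like counts per category. The file path, the 'category' column, and the process_chunk helper are just placeholders for your own data and logic:

import pandas as pd

def process_chunk(chunk):
    # placeholder: return only a small summary per chunk
    # (e.g. row counts per category) instead of the full rows
    return chunk['category'].value_counts()

partial_results = []  # small summaries only, not full chunks

for chunk in pd.read_csv(r'../input/data.csv', chunksize=1000000):
    # the full chunk is discarded after each iteration,
    # so memory usage stays at roughly one chunk at a time
    partial_results.append(process_chunk(chunk))

# combine the small per-chunk summaries into the final result
total_counts = pd.concat(partial_results).groupby(level=0).sum()
print(total_counts)

This way only the tiny summaries accumulate in memory, not the chunks themselves, so the peak memory use is bounded by a single chunk.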