I have recently started using Jupyter Lab and my problem is that I work with quite large datasets (usually the dataset itself is approx. 1/4 of my computer RAM). After few trans
I also work with very large datasets (3 GB) in Jupyter Lab and have been experiencing the same issue.
It's unclear whether you need to keep access to the pre-transformed data. If not, I've started using del on large dataframe variables as soon as I no longer need them. del removes the variable name, and the memory can be reclaimed once nothing else references the object. Edit: there are multiple possible causes for the issue I'm encountering. I run into it more often when I'm using a remote Jupyter instance, and in Spyder as well when I'm performing large transformations.
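e.g.
import pandas as pd
df = pd.read_csv('some_giant_dataframe.csv')  # or whatever your import is
new_df = my_transform(df)  # my_transform stands in for your own transformation
del df  # if unneeded; drops the reference to the original data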
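One caveat with del is that it only removes a name; the data is freed only when no other reference to it remains. The sketch below illustrates this with the standard library alone. BigTable, the stand-in my_transform, and released_after_del are hypothetical names made up for the demonstration, not part of pandas or Jupyter.

```python
import gc
import weakref


class BigTable:
    """Hypothetical stand-in for a large dataframe."""

    def __init__(self, n_rows):
        self.rows = list(range(n_rows))


def my_transform(table):
    """Placeholder transformation that builds a new, independent object."""
    return BigTable(len(table.rows))


def released_after_del(keep_alias):
    """Return True if the original object is freed once `df` is deleted."""
    df = BigTable(100_000)
    probe = weakref.ref(df)             # observes the object without keeping it alive
    alias = df if keep_alias else None  # a second reference, e.g. a cached output
    new_df = my_transform(df)
    del df                              # removes the name, not necessarily the object
    gc.collect()                        # only matters for reference cycles; harmless here
    return probe() is None


print(released_after_del(keep_alias=False))  # True: no references left, memory released
print(released_after_del(keep_alias=True))   # False: the alias still holds the object
```

In a notebook, a second reference can come from IPython's output cache (Out and the underscore variables) if the dataframe was displayed as the last expression of a cell, so del alone may not be enough in that case.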
Jakes, you may also find this thread on large data workflows helpful. I've been looking into Dask to help with memory usage.
I've noticed in Spyder and Jupyter that the freeze-up usually happens when I work in another console while a memory-heavy console is running. As to why it freezes instead of crashing, I think this has something to do with the kernel. There are a couple of memory issues open on the IPython GitHub; #10082 and #10117 seem most relevant. One user there suggests disabling jedi tab completion or updating jedi.
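For the jedi suggestion, a minimal sketch of turning it off from Python follows. It assumes the running shell exposes a Completer object with a use_jedi flag, as recent IPython versions do. The helper names disable_jedi_completion and current_shell are made up for this example.

```python
def current_shell():
    """Return the running IPython shell, or None outside IPython."""
    try:
        from IPython import get_ipython
    except ImportError:
        return None
    return get_ipython()


def disable_jedi_completion(shell):
    """Turn off jedi-based tab completion on an IPython shell object.

    Returns True if the setting was applied, False if there is no shell.
    """
    if shell is None:
        return False
    shell.Completer.use_jedi = False
    return True


applied = disable_jedi_completion(current_shell())
print("jedi completion disabled:", applied)
```

Inside a notebook cell the same effect should be achievable with the magic `%config Completer.use_jedi = False`. Either way the change applies only to the current session.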
In #10117 they propose checking the output of get_ipython().history_manager.db_log_output. I have the same issues and my setting is correct, but it's worth checking.
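The check from #10117 can be wrapped so that it also runs safely outside a notebook. This is only a sketch; history_logs_output is a hypothetical helper name. As far as I know, the IPython default for db_log_output is False, meaning cell outputs are not written to the history database.

```python
def history_logs_output(shell):
    """Return the shell's db_log_output setting, or None if it is unavailable."""
    if shell is None:
        return None
    manager = getattr(shell, "history_manager", None)
    return getattr(manager, "db_log_output", None)


def current_shell():
    """Return the running IPython shell, or None outside IPython."""
    try:
        from IPython import get_ipython
    except ImportError:
        return None
    return get_ipython()


setting = history_logs_output(current_shell())
if setting is None:
    print("No IPython history manager found in this session.")
else:
    print("db_log_output =", setting)
```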