How to cache in IPython Notebook?

借酒劲吻你 2021-02-07 05:30

Environment:

  • Python 3
  • IPython 3.2

Every time I shut down an IPython notebook and re-open it, I have to re-run all the cells. But some cells

4 Answers
  • 2021-02-07 06:03

    Unfortunately, it doesn't seem like there is something as convenient as an automatic cache. The %store magic option is close, but requires you to do the caching and reloading manually and explicitly.

    In your Jupyter notebook:

    a = 1
    %store a
    

    Now, let's say you close the notebook and the kernel gets restarted. You no longer have access to the local variables. However, you can reload the variables you've stored using the -r option.

    %store -r a
    print(a)  # Should print 1
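
    Outside of IPython, a similar explicit store-and-reload workflow can be sketched with the standard library's shelve module. This is only a plain-Python analogue, not how %store is implemented; the helper names store and restore and the STORE_PATH location are hypothetical.

```python
import os
import shelve
import tempfile

# Hypothetical store location; any writable path works.
STORE_PATH = os.path.join(tempfile.gettempdir(), "notebook_store_demo")


def store(name, value, path=STORE_PATH):
    """Persist one variable under its name (the manual 'save' step)."""
    with shelve.open(path) as db:
        db[name] = value


def restore(name, path=STORE_PATH):
    """Read a previously stored variable back (the manual 'reload' step)."""
    with shelve.open(path) as db:
        return db[name]


store("a", 1)
a = restore("a")  # works in a fresh process too, as long as the file exists
print(a)
```

    As with %store, nothing is cached automatically: each variable has to be saved and reloaded by name.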
    
  • 2021-02-07 06:12

    Use the cache magic.

    %cache myVar = someSlowCalculation(some, "parameters")
    

    This calculates someSlowCalculation(some, "parameters") once; on subsequent calls it restores myVar from storage.

    https://pypi.org/project/ipython-cache/

    Under the hood it does pretty much the same as the accepted answer.
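
    The compute-once-then-restore idea can be sketched in plain Python with pickle. This is an illustration under stated assumptions, not the ipython-cache implementation: the cached helper, the stand-in some_slow_calculation function, and the cache file name are all hypothetical.

```python
import os
import pickle
import tempfile


def cached(path, compute, *args, **kwargs):
    """Return the pickled result at `path` if present; otherwise compute and save it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute(*args, **kwargs)
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # records how often the slow function really runs


def some_slow_calculation(x, label):
    # Stand-in for an expensive computation.
    calls.append(x)
    return {"label": label, "value": x * 2}


cache_file = os.path.join(tempfile.mkdtemp(), "my_var.pkl")
my_var = cached(cache_file, some_slow_calculation, 21, "parameters")
my_var_again = cached(cache_file, some_slow_calculation, 21, "parameters")
```

    Note that this sketch keys the cache on the file path only, so changing the arguments does not invalidate a stale result; delete the file to force a recomputation.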

  • 2021-02-07 06:15

    In fact, the functionality you are asking for already exists; there is no need to re-implement it manually with your own dumps.

    You can use %store or, perhaps better, the %%cache magic (an extension) to store the results of these intermediate cells so they don't have to be recomputed (see https://github.com/rossant/ipycache).

    It is as simple as:

    %load_ext ipycache
    

    Then, in a cell e.g.:

    %%cache mycache.pkl var1 var2
    var1 = 1
    var2 = 2
    

    When you execute this cell the first time, the code is executed, and the variables var1 and var2 are saved in mycache.pkl in the current directory along with the outputs. Rich display outputs are only saved if you use the development version of IPython. When you execute this cell again, the code is skipped, the variables are loaded from the file and injected into the namespace, and the outputs are restored in the notebook.

    It saves all graphics, output produced, and all the variables specified automatically for you :)
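
    The skip-and-reload behaviour described above can be approximated in plain Python as follows. This is a rough sketch of the idea only, not ipycache's code: run_cached is a hypothetical helper, the cell body is passed as a string, and unlike the extension it does not capture or restore any output.

```python
import os
import pickle
import tempfile


def run_cached(path, var_names, source, namespace):
    """Run `source` once and pickle the named variables.

    On later calls the code is skipped and the variables are loaded
    from `path` into `namespace`. Returns True if the code was executed.
    """
    if os.path.exists(path):
        with open(path, "rb") as f:
            namespace.update(pickle.load(f))
        return False
    exec(source, namespace)
    saved = {name: namespace[name] for name in var_names}
    with open(path, "wb") as f:
        pickle.dump(saved, f)
    return True


cell_source = "var1 = 1\nvar2 = 2\n"
cache_path = os.path.join(tempfile.mkdtemp(), "mycache.pkl")

first_ns, second_ns = {}, {}
ran_first = run_cached(cache_path, ["var1", "var2"], cell_source, first_ns)
ran_second = run_cached(cache_path, ["var1", "var2"], cell_source, second_ns)
```

    The second call never executes the cell body; var1 and var2 arrive in second_ns from the pickle file alone.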

  • 2021-02-07 06:27

    Can you give an example of what you are trying to do? When I run something expensive in an IPython Notebook, I almost always write the result to disk afterward. For example, if my data is a list of JSON objects, I write it to disk as line-separated JSON-formatted strings:

    import json

    # 'a' appends, so re-running this cell adds the records again; use 'w' to overwrite
    with open('path_to_file.json', 'a') as file:
        for item in data:
            line = json.dumps(item)
            file.write(line + '\n')
    

    You can then read the data back in the same way:

    data = []
    with open('path_to_file.json', 'r') as file:
        for line in file: 
            data_item = json.loads(line)
            data.append(data_item)
    

    I think this is good practice in general because it gives you a backup. You can also use pickle for the same thing. If your data is really big, you can use gzip.open to write directly to a gzip-compressed file.
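
    As a minimal sketch of the gzip variant, the same line-separated JSON can be written and read through gzip.open in text mode. The sample records and the data.json.gz file name are placeholders.

```python
import gzip
import json
import os
import tempfile

data = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]  # stand-in records
path = os.path.join(tempfile.mkdtemp(), "data.json.gz")

# 'wt' opens the compressed file for text writing
with gzip.open(path, "wt", encoding="utf-8") as f:
    for item in data:
        f.write(json.dumps(item) + "\n")

# 'rt' reads it back as text, one JSON document per line
with gzip.open(path, "rt", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```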

    EDIT

    To save a scikit-learn model to disk, use joblib.

    from sklearn.cluster import KMeans
    
    km = KMeans(n_clusters=num_clusters)
    km.fit(some_data)
    
    
    import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
    # dump to pickle
    joblib.dump(km, 'model.pkl')
    
    # and reload from pickle
    km = joblib.load('model.pkl')
    