Prevent Python from caching the imported modules

旧巷老猫 提交于 2019-12-17 06:38:37

问题


While developing a largeish project (split in several files and folders) in Python with IPython, I run into the trouble of cached imported modules.

The problem is that instructions import module only reads the module once, even if that module has changed! So each time I change something in my package, I have to quit and restart IPython. Painful.

Is there any way to properly force reloading some modules? Or, better, to somehow prevent Python from caching them?

I tried several approaches, but none works. In particular I run into really, really weird bugs, like some modules or variables mysteriously becoming equal to None...

The only sensible resource I found is Reloading Python modules, from pyunit, but I have not checked it. I would like something like that.

A good alternative would be for IPython to restart, or restart the Python interpreter somehow.

So, if you develop in Python, what solution have you found to this problem?

Edit

To make things clear: obviously, I understand that some old variables depending on the previous state of the module may stick around. That's fine by me. By why is that so difficult in Python to force reload a module without having all sort of strange errors happening?

More specifically, if I have my whole module in one file module.py then the following works fine:

import sys
try:
    del sys.modules['module']
except AttributeError:
    pass
import module

obj = module.my_class()

This piece of code works beautifully and I can develop without quitting IPython for months.

However, whenever my module is made of several submodules, hell breaks loose:

import os
for mod in ['module.submod1', 'module.submod2']:
    try:
        del sys.module[mod]
    except AttributeError:
        pass
# sometimes this works, sometimes not. WHY?

Why is that so different for Python whether I have my module in one big file or in several submodules? Why would that approach not work??


回答1:


Quitting and restarting the interpreter is the best solution. Any sort of live reloading or no-caching strategy will not work seamlessly because objects from no-longer-existing modules can exist and because modules sometimes store state and because even if your use case really does allow hot reloading it's too complicated to think about to be worth it.




回答2:


import checks to see if the module is in sys.modules, and if it is, it returns it. If you want import to load the module fresh from disk, you can delete the appropriate key in sys.modules first.

There is the reload builtin function which will, given a module object, reload it from disk and that will get placed in sys.modules. Edit -- actually, it will recompile the code from the file on the disk, and then re-evalute it in the existing module's __dict__. Something potentially very different than making a new module object.

Mike Graham is right though; getting reloading right if you have even a few live objects that reference the contents of the module you don't want anymore is hard. Existing objects will still reference the classes they were instantiated from is an obvious issue, but also all references created by means of from module import symbol will still point to whatever object from the old version of the module. Many subtly wrong things are possible.

Edit: I agree with the consensus that restarting the interpreter is by far the most reliable thing. But for debugging purposes, I guess you could try something like the following. I'm certain that there are corner cases for which this wouldn't work, but if you aren't doing anything too crazy (otherwise) with module loading in your package, it might be useful.

def reload_package(root_module):
    package_name = root_module.__name__

    # get a reference to each loaded module
    loaded_package_modules = dict([
        (key, value) for key, value in sys.modules.items() 
        if key.startswith(package_name) and isinstance(value, types.ModuleType)])

    # delete references to these loaded modules from sys.modules
    for key in loaded_package_modules:
        del sys.modules[key]

    # load each of the modules again; 
    # make old modules share state with new modules
    for key in loaded_package_modules:
        print 'loading %s' % key
        newmodule = __import__(key)
        oldmodule = loaded_package_modules[key]
        oldmodule.__dict__.clear()
        oldmodule.__dict__.update(newmodule.__dict__)

Which I very briefly tested like so:

import email, email.mime, email.mime.application
reload_package(email)

printing:

reloading email.iterators
reloading email.mime
reloading email.quoprimime
reloading email.encoders
reloading email.errors
reloading email
reloading email.charset
reloading email.mime.application
reloading email._parseaddr
reloading email.utils
reloading email.mime.base
reloading email.message
reloading email.mime.nonmultipart
reloading email.base64mime



回答3:


With IPython comes the autoreload extension that automatically repeats an import before each function call. It works at least in simple cases, but don't rely too much on it: in my experience, an interpreter restart is still required from time to time, especially when code changes occur only on indirectly imported code.

Usage example from the linked page:

In [1]: %load_ext autoreload

In [2]: %autoreload 2

In [3]: from foo import some_function

In [4]: some_function()
Out[4]: 42

In [5]: # open foo.py in an editor and change some_function to return 43

In [6]: some_function()
Out[6]: 43



回答4:


There are some really good answers here already, but it is worth knowing about dreload, which is a function available in IPython which does as "deep reload". From the documentation:

The IPython.lib.deepreload module allows you to recursively reload a module: changes made to any of its dependencies will be reloaded without having to exit. To start using it, do:

http://ipython.org/ipython-doc/dev/interactive/reference.html#dreload

It is available as a "global" in IPython notebook (at least my version, which is running v2.0).

HTH




回答5:


You can use import hook machinery described in PEP 302 to load not modules themself but some kind of proxy object that will allow you to do anything you want with underlying module object — reload it, drop reference to it etc.

Additional benefit is that your currently existing code will not require change and this additional module functionality can be torn off from a single point in code — where you actually add finder into sys.meta_path.

Some thoughts on implementing: create finder that will agree to find any module, except of builtin (you have nothing to do with builtin modules), then create loader that will return proxy object subclassed from types.ModuleType instead of real module object. Note that loader object are not forced to create explicit references to loaded modules into sys.modules, but it's strongly encouraged, because, as you have already seen, it may fail unexpectably. Proxy object should catch and forward all __getattr__, __setattr__ and __delattr__ to underlying real module it's keeping reference to. You will probably don't need to define __getattribute__ because of you would not hide real module contents with your proxy methods. So, now you should communicate with proxy in some way — you can create some special method to drop underlying reference, then import module, extract reference from returned proxy, drop proxy and hold reference to reloaded module. Phew, looks scary, but should fix your problem without reloading Python each time.




回答6:


I am using PythonNet in my project. Fortunately, I found there is a command which can perfectly solve this problem.

using (Py.GIL())
        {
            dynamic mod = Py.Import(this.moduleName);
            if (mod == null)
                throw new Exception( string.Format("Cannot find module {0}. Python script may not be complied successfully or module name is illegal.", this.moduleName));

            // This command works perfect for me!
            PythonEngine.ReloadModule(mod);

            dynamic instance = mod.ClassName();



回答7:


Think twice for quitting and restarting in production

The easy solution without quitting & restarting is by using the reload from imp

import moduleA, moduleB
from imp import reload
reload (moduleB)



回答8:


For Python version 3.4 and above

import importlib 
importlib.reload(<package_name>) 
from <package_name> import <method_name>

Refer below documentation for details.



来源:https://stackoverflow.com/questions/2918898/prevent-python-from-caching-the-imported-modules

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!