Question
I have a fairly large codebase written with numba. I have noticed that when caching is enabled for a function that calls another numba-compiled function in a different file, changes to the called function are not picked up. The situation occurs with two files:
testfile2:
import numba

@numba.njit(cache=True)
def function1(x):
    return x * 10
testfile:
import numba
from tests import testfile2

@numba.njit(cache=True)
def function2(x, y):
    return y + testfile2.function1(x)
In a Jupyter notebook, I run the following:
# INSIDE JUPYTER NOTEBOOK
import sys
sys.path.insert(1, "path/to/files/")
from tests import testfile
testfile.function2(3, 4)
>>> 34 # good value
However, if I then change testfile2 to the following:
import numba

@numba.njit(cache=True)
def function1(x):
    return x * 1
and then restart the Jupyter notebook kernel and rerun the notebook, I get the following:
import sys
sys.path.insert(1, "path/to/files/")
from tests import testfile
testfile.function2(3, 4)
>>> 34 # bad value, should be 7
Importing both files into the notebook has no effect on the bad result. Setting cache=False only on function1 also has no effect. What does work is setting cache=False on all njit'ted functions, then restarting the kernel and rerunning.
I believe that LLVM is probably inlining some of the called functions and then never checking them again.
I looked in the source and discovered a cache object, numba.caching.NullCache(), so I instantiated one and ran the following:
cache = numba.caching.NullCache()
cache.flush()
Unfortunately that appears to have no effect.
Is there a numba environment setting, or another way I can manually clear all cached functions within a conda env? Or am I simply doing something wrong?
I am running numba 0.33 with Anaconda Python 3.6 on Mac OS X 10.12.3.
Answer 1:
I "solved" this with a hack after seeing Josh's answer, by adding a utility function to the project that deletes the cache files.
There is probably a better way, but this works. I'm leaving the question open in case someone has a less hacky way of doing this.
import os


def kill_files(folder):
    # Delete every regular file directly inside `folder`.
    for the_file in os.listdir(folder):
        file_path = os.path.join(folder, the_file)
        try:
            if os.path.isfile(file_path):
                os.unlink(file_path)
        except Exception:
            print("failed on filepath: %s" % file_path)


def kill_numba_cache():
    # Start from the parent of the directory containing this file and empty
    # every __pycache__ directory found below it.
    root_folder = os.path.realpath(__file__ + "/../../")
    for root, dirnames, filenames in os.walk(root_folder):
        for dirname in dirnames:
            if dirname == "__pycache__":
                try:
                    kill_files(os.path.join(root, dirname))
                except Exception:
                    print("failed on %s" % root)
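A narrower variant of the same idea is sketched below. It assumes that numba's on-disk cache files carry the .nbi (index) and .nbc (data) extensions and live in __pycache__ directories, and it removes only those files, leaving the regular .pyc bytecode alone. The function name clear_numba_cache_files and its root argument are made up for this illustration; they are not part of numba.

```python
from pathlib import Path


def clear_numba_cache_files(root):
    """Delete *.nbi / *.nbc files in every __pycache__ directory under `root`.

    Hypothetical helper, not a numba API. Returns the list of removed paths.
    """
    removed = []
    for cache_dir in Path(root).rglob("__pycache__"):
        if not cache_dir.is_dir():
            continue
        for path in cache_dir.iterdir():
            # Assumption: numba cache entries use these two suffixes.
            if path.is_file() and path.suffix in {".nbi", ".nbc"}:
                path.unlink()
                removed.append(path)
    return removed
```

Calling clear_numba_cache_files("path/to/files/") before restarting the kernel should then force the cached functions under that directory to be compiled again on their next call.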
Answer 2:
This is a bit of a hack, but it's something I've used before. If you put this function in the top-level of where your numba functions are (for this example, in testfile), it should recompile everything:
import inspect
import sys


def recompile_nb_code():
    this_module = sys.modules[__name__]
    module_members = inspect.getmembers(this_module)
    for member_name, member in module_members:
        # Numba dispatchers expose both `recompile` and `inspect_llvm`.
        if hasattr(member, 'recompile') and hasattr(member, 'inspect_llvm'):
            member.recompile()
and then call it from your jupyter notebook when you want to force a recompile. The caveat is that it only works on files in the module where this function is located and their dependencies. There might be another way to generalize it.
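One possible way to generalize this, sketched here under assumptions, is to scan sys.modules for every already-imported module under a package prefix and apply the same duck-typing check to each. The name recompile_package and its prefix argument are hypothetical. The sketch makes no attempt to recompile callees before their callers, which may matter when functions call each other across files.

```python
import sys


def recompile_package(prefix):
    """Call recompile() on dispatcher-like objects in loaded modules under `prefix`.

    Hypothetical helper, not a numba API. Returns the qualified names touched.
    """
    recompiled = []
    seen = set()  # ids of objects already handled (one object may be imported in several modules)
    # Copy the items first, since recompiling may import further modules.
    for mod_name, module in list(sys.modules.items()):
        if module is None:
            continue
        if mod_name != prefix and not mod_name.startswith(prefix + "."):
            continue
        for attr_name, member in list(vars(module).items()):
            # Same check as in the answer above.
            if hasattr(member, "recompile") and hasattr(member, "inspect_llvm"):
                if id(member) in seen:
                    continue
                seen.add(id(member))
                member.recompile()
                recompiled.append(mod_name + "." + attr_name)
    return recompiled
```

For the layout in the question, this would be called from the notebook as recompile_package("tests") after importing testfile.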
Source: https://stackoverflow.com/questions/44131691/how-to-clear-cache-or-force-recompilation-in-numba