How to clear cache (or force recompilation) in numba

Question

I have a fairly large codebase written in numba, and I have noticed that when caching is enabled for a function that calls another numba-compiled function in a different file, changes to the called function are not picked up. The situation occurs when I have two files:

testfile2:

import numba

@numba.njit(cache=True)
def function1(x):
    return x * 10

testfile:

import numba
from tests import testfile2

@numba.njit(cache=True)
def function2(x, y):
    return y + testfile2.function1(x)

If in a jupyter notebook, I run the following:

# INSIDE JUPYTER NOTEBOOK
import sys
sys.path.insert(1, "path/to/files/")
from tests import testfile

testfile.function2(3, 4)
>>> 34   # good value

However, if I then change testfile2 to the following:

import numba

@numba.njit(cache=True)
def function1(x):
    return x * 1

Then I restart the jupyter notebook kernel and rerun the notebook, and I get the following:

import sys
sys.path.insert(1, "path/to/files/")
from tests import testfile

testfile.function2(3, 4)
>>> 34   # bad value, should be 7

Importing both files into the notebook does not fix the bad result. Setting cache=False only on function1 has no effect either. What does work is setting cache=False on all njit'ted functions, then restarting the kernel and rerunning.

I believe that LLVM is probably inlining some of the called functions and then never checking them again.

I looked in the source and discovered there is a method that returns the cache object numba.caching.NullCache(), so I instantiated a cache object and ran the following:

cache = numba.caching.NullCache()
cache.flush()

Unfortunately that appears to have no effect.

Is there a numba environment setting, or another way I can manually clear all cached functions within a conda env? Or am I simply doing something wrong?

I am running numba 0.33 with Anaconda Python 3.6 on Mac OS X 10.12.3.

Answer 1:

I "solved" this with a hack after seeing Josh's answer, by adding a utility to the project that kills off the cache.

There is probably a better way, but this works. I'm leaving the question open in case someone has a less hacky way of doing this.

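import os


def kill_files(folder):
    # Delete every regular file directly inside `folder`.
    for the_file in os.listdir(folder):
        file_path = os.path.join(folder, the_file)
        try:
            if os.path.isfile(file_path):
                os.unlink(file_path)
        except Exception as e:
            print("failed on filepath: %s (%s)" % (file_path, e))


def kill_numba_cache():
    # Root to scan: the parent of the directory containing this file.
    root_folder = os.path.realpath(__file__ + "/../../")

    # Empty every __pycache__ directory under the root. This also removes the
    # regular .pyc files, which Python regenerates on the next import.
    for root, dirnames, filenames in os.walk(root_folder):
        for dirname in dirnames:
            if dirname == "__pycache__":
                try:
                    kill_files(os.path.join(root, dirname))
                except Exception as e:
                    print("failed on %s (%s)" % (root, e))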
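
A narrower variant of the same idea, as a minimal sketch: numba writes its on-disk cache as index (.nbi) and data (.nbc) files, by default in the __pycache__ directory next to the source file, so deleting only those leaves the ordinary .pyc bytecode alone. The function name clear_numba_cache and its root argument are hypothetical, and the sketch assumes the default cache location is in use.

```python
from pathlib import Path


def clear_numba_cache(root):
    """Delete numba cache files (*.nbi, *.nbc) in __pycache__ dirs under root.

    Hypothetical helper; assumes numba's default cache location.
    Returns the list of deleted paths.
    """
    removed = []
    # rglob finds __pycache__ directories at any depth below root.
    for cache_dir in Path(root).rglob("__pycache__"):
        if not cache_dir.is_dir():
            continue
        for path in cache_dir.iterdir():
            # Only touch numba's index/data files, not regular .pyc files.
            if path.is_file() and path.suffix in {".nbi", ".nbc"}:
                path.unlink()
                removed.append(path)
    return removed
```

Calling clear_numba_cache("path/to/files/") before restarting the kernel should then force every cached function under that folder to be compiled again on first use.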

Answer 2:

This is a bit of a hack, but it's something I've used before. If you put this function in the top-level of where your numba functions are (for this example, in testfile), it should recompile everything:

import inspect
import sys

def recompile_nb_code():
    # Look at everything defined in (or imported into) this module.
    this_module = sys.modules[__name__]
    module_members = inspect.getmembers(this_module)

    for member_name, member in module_members:
        # Duck-type check for numba dispatcher objects (jitted functions).
        if hasattr(member, 'recompile') and hasattr(member, 'inspect_llvm'):
            member.recompile()

and then call it from your jupyter notebook when you want to force a recompile. The caveat is that it only works on files in the module where this function is located and their dependencies. There might be another way to generalize it.
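
One possible way to generalize it, sketched under the assumption that you can import every module holding jitted functions: pass the modules in explicitly instead of relying on sys.modules[__name__]. The name recompile_modules is hypothetical, and the check is the same duck-typing test used in recompile_nb_code.

```python
import inspect


def recompile_modules(modules):
    """Call .recompile() on every dispatcher-like member of the given modules.

    Hypothetical helper; returns the qualified names that were recompiled.
    """
    recompiled = []
    for module in modules:
        for name, member in inspect.getmembers(module):
            # Same duck-typing check as recompile_nb_code: anything exposing
            # both attributes is treated as a numba-jitted function.
            if hasattr(member, "recompile") and hasattr(member, "inspect_llvm"):
                member.recompile()
                recompiled.append(f"{module.__name__}.{name}")
    return recompiled
```

For the example in the question this would be called as recompile_modules([testfile2, testfile]), listing the called module before the caller.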

Source: https://stackoverflow.com/questions/44131691/how-to-clear-cache-or-force-recompilation-in-numba

Tags

Anaconda

numba