I have a python module that makes use of a huge dictionary global variable, currently I put the computation code in the top section, every first time import or reload of the mod
You could try using the marshal module instead of the c?Pickle one; it could be faster. This module is used by python to store values in a binary format. Note especially the following paragraph, to see if marshal fits your needs:
Not all Python object types are supported; in general, only objects whose value is independent from a particular invocation of Python can be written and read by this module. The following types are supported: None, integers, long integers, floating point numbers, strings, Unicode objects, tuples, lists, sets, dictionaries, and code objects, where it should be understood that tuples, lists and dictionaries are only supported as long as the values contained therein are themselves supported; and recursive lists and dictionaries should not be written (they will cause infinite loops).
Just to be on the safe side, before unmarshalling the dict, make sure that the Python version that unmarshals the dict is the same as the one that did the marshal, since there are no guarantees for backwards compatibility.
Expanding on the delayed-calculation idea, why not turn the dict into a class that supplies (and caches) elements as necessary?
You might also use psyco to speed up overall execution...
I assume you've pasted the dict literal into the source, and that's what's taking a minute? I don't know how to get around that, but you could probably avoid instantiating this dict upon import... You could lazily-instantiate it the first time it's actually used.
There's another pretty obvious solution for this problem. When code is reloaded the original scope is still available.
So... doing something like this will make sure this code is executed only once.
try:
FD
except NameError:
FD = FreqDist(word for word in brown.words())
A couple of things that will help speed up imports:
With that said, I agree that you shouldn't be experiencing any delay in importing modules after the first time you import it. Here are a couple of other general thoughts:
With that said, it's a tad bit difficult to give you any specific advice without a little bit more context. More specifically, where are you importing it? And what are the computations?
Calculate your global var on the first use.
class Proxy:
@property
def global_name(self):
# calculate your global var here, enable cache if needed
...
_proxy_object = Proxy()
GLOBAL_NAME = _proxy_object.global_name
Or better yet, access necessery data via special data object.
class Data:
GLOBAL_NAME = property(...)
data = Data()
Example:
from some_module import data
print(data.GLOBAL_NAME)
See Django settings.