How to avoid recomputation every time a Python module is reloaded

温柔的废话 2021-02-06 10:55

I have a Python module that uses a huge dictionary as a global variable. Currently the computation code sits in the top section of the module, so every first-time import or reload of the mod

13 answers
  •  无人及你
    2021-02-06 11:20

    Just to clarify: the code in the body of a module is not executed every time the module is imported. It runs only once; later imports find the already-created module instead of recreating it. Take a look at sys.modules to see the cached modules. (An explicit reload, by contrast, does re-execute the module body.)
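
    To see this caching in action, here is a minimal sketch. The module name `big_dict_demo` and its one-line body are made up for the demonstration: the script writes the module to a temporary directory, imports it twice, then reloads it, counting how often the body runs.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "big_dict_demo"  # hypothetical module name, for illustration only

# Module body: counts how many times it has been executed. reload() re-runs
# the body in the same namespace, so the previous RUNS value is still visible.
SOURCE = 'RUNS = globals().get("RUNS", 0) + 1\n'

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, MODULE_NAME + ".py").write_text(SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = importlib.import_module(MODULE_NAME)    # body runs here
        runs_after_first_import = first.RUNS

        second = importlib.import_module(MODULE_NAME)   # served from sys.modules
        same_object = first is second
        runs_after_second_import = second.RUNS

        importlib.reload(first)                         # body runs again
        runs_after_reload = first.RUNS
    finally:
        # Undo the changes made for the demonstration.
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

print(runs_after_first_import, runs_after_second_import, runs_after_reload)
```

    The second import returns the very same module object without running the body, while the reload runs it a second time.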

    However, if your problem is the time taken by the first import after the program starts, you'll probably need something other than an in-memory Python dict. The best option is probably an on-disk form, for instance a sqlite database or one of the dbm modules.
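
    As an illustration of the sqlite route, here is a rough sketch using the standard-library `sqlite3` module. The table name `kv`, the helper `lookup`, and the database location are all assumptions made for this example, and the values are assumed to be integers; other value types would need their own serialization.

```python
import os
import sqlite3
import tempfile

# Hypothetical location; a real program would use a fixed, known path.
db_path = os.path.join(tempfile.mkdtemp(), "big_dict.sqlite")

# One-off build step: store the precomputed mapping in a key/value table.
conn = sqlite3.connect(db_path)
with conn:  # commits the transaction on success
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO kv VALUES (?, ?)",
        (("key%d" % x, x) for x in range(100000)),
    )
conn.close()


def lookup(path, key):
    """Fetch a single value via the primary-key index, without loading the table."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    finally:
        conn.close()
    if row is None:
        raise KeyError(key)
    return row[0]


value = lookup(db_path, "key99999")
print(value)
```

    Later processes only need to call `lookup`; the build step runs once, not on every import.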

    For a minimal change in your interface, the shelve module may be your best option. It puts a fairly transparent interface on top of the dbm modules that makes them act like an ordinary Python dict with string keys, allowing any picklable value to be stored. Here's an example:

    # Create a dict with a million items (the target directory must already exist):
    import shelve
    with shelve.open('path/to/my_persistant_dict') as d:
        # Keys must be strings; values can be anything picklable.
        d.update(('key%d' % x, x) for x in range(1000000))
    

    Then use it from the next process. There should be no large delay, because a lookup only reads the requested key from the on-disk form, so the whole dict never has to be loaded into memory:

    >>> import shelve
    >>> d = shelve.open('path/to/my_persistant_dict')
    >>> print(d['key99999'])
    99999
    >>> d.close()
    

    It's a bit slower than a real dict, and it will still take a long time if you do something that touches every key (e.g. printing the whole thing), but it may solve your problem.
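
    To get a feel for that trade-off, the sketch below times a single lookup against reading every item back. The shelf size, the file name `demo_shelf`, and the temporary directory are arbitrary choices for the demonstration, and the actual timings depend on the machine and on which dbm backend is available.

```python
import os
import shelve
import tempfile
import time

N_ITEMS = 10000  # small, arbitrary size so the sketch runs quickly

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_shelf")  # hypothetical file name

    # Build the shelf once.
    with shelve.open(path) as d:
        d.update(("key%d" % x, x) for x in range(N_ITEMS))

    # Reopen it, as a later process would.
    with shelve.open(path) as d:
        start = time.perf_counter()
        single = d["key9999"]             # reads one entry from disk
        single_seconds = time.perf_counter() - start

        start = time.perf_counter()
        everything = dict(d.items())      # reads and unpickles every entry
        full_seconds = time.perf_counter() - start

print("one lookup: %.6f s" % single_seconds)
print("all %d items: %.6f s" % (len(everything), full_seconds))
```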
