Python thread for pre-importing modules

抹茶落季 2021-01-13 21:17

I am writing a Python application in the field of scientific computing. Currently, when the user works with the GUI and starts a new physics simulation, the interpreter imme

4 answers
  • 2021-01-13 22:00

    The problem with this is that the imports must still complete before they are usable. Depending on when they're first used, the application could still have to block for 10 seconds before it could start up anyway. Much more productive would be to profile the modules and figure out why they take so long to import.
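As a starting point for that profiling, here is a minimal sketch that times individual imports. The helper name `time_import` and the standard-library modules in the loop are stand-ins chosen for illustration; substitute the packages that are actually slow in your application. Note that a module already present in `sys.modules` will report close to zero, so measure in a fresh interpreter.

```python
import importlib
import time


def time_import(module_name):
    """Return the wall-clock seconds spent importing module_name.

    A module that is already cached in sys.modules returns almost
    immediately, so run this in a fresh interpreter for real numbers.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in module names; replace with the slow packages.
    for name in ("json", "decimal", "sqlite3"):
        print(f"{name}: {time_import(name) * 1000:.1f} ms")
```

On Python 3.7 and later, running the interpreter with `python -X importtime` prints a per-module breakdown, which is usually the quicker way to find which nested import dominates.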

  • 2021-01-13 22:02

    The general idea is good, but the Python/GUI session might not be all that responsive while the background thread is importing away; unfortunately, import inherently and inevitably "locks up" Python substantially (it's not just the GIL, there's specific extra locking for imports).

    Still worth trying, as it might make things a bit better -- it's also very easy, since Queues are intrinsically thread-safe and, besides a Queue's put and get, all you need is basically an __import__. Still, don't be surprised if this doesn't help enough and you still need extra oomph.
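A minimal sketch of the Queue plus `__import__` idea described above. The function name `start_preimport` and the module names in the usage part are illustrative assumptions, not part of the original answer. The worker puts either the imported module or the `ImportError` on the queue, so the consumer can tell which imports failed.

```python
import queue
import threading


def start_preimport(module_names):
    """Import the given modules in a daemon thread.

    Returns (results, thread). Each queue item is a (name, value) pair
    where value is the imported module or the ImportError that occurred.
    """
    results = queue.Queue()

    def worker():
        for name in module_names:
            try:
                # For a dotted name, __import__ returns the top-level
                # package; importlib.import_module returns the leaf.
                results.put((name, __import__(name)))
            except ImportError as exc:
                results.put((name, exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return results, thread


if __name__ == "__main__":
    # Stand-in module names; replace with the slow packages.
    results, thread = start_preimport(["json", "decimal"])
    thread.join()
    while not results.empty():
        name, value = results.get()
        print(name, "failed" if isinstance(value, ImportError) else "ready")
```

Once a module has been imported this way, a later plain `import` of the same name in the main thread is just a lookup in `sys.modules`. If the main thread reaches that `import` while the worker is still busy, it blocks until the worker finishes, which is the limitation described above.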

    If you have some drive that's intrinsically very fast, but with limited space, such as a "RAM drive" or a particularly snappy solid-state one, it may be worth keeping the needed packages in a .tar.bz2 (or other form of archive) and unpacking it onto the fast drive at program start (that's essentially just I/O and so it won't lock things up badly -- I/O operations rapidly release the GIL -- and also it's especially easy to delegate to a subprocess running tar xjf or the like).

    If some of the import slowness is due to a huge number of .py/.pyc/.pyo files, it's worth trying to keep those (in .pyc form only, not as .py) in a zipfile and import from there. That only helps with the I/O overhead, and how much depends on your OS, filesystem, and drive; it doesn't help with delays due to loading huge DLLs or executing initialization code in packages at load time, which I suspect are likelier culprits for the slowness.
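A minimal sketch of importing from a zipfile, using Python's built-in support for zip archives on `sys.path`. For brevity it stores a `.py` source file in the archive rather than the `.pyc`-only layout suggested above; the helper names and the module `zipdemo_mod` are hypothetical and exist only for this example.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_module_zip(zip_path, modules):
    """Write {module_name: source_text} into a zip, one .py file each."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name, source in modules.items():
            archive.writestr(f"{name}.py", source)


def import_from_zip(zip_path, module_name):
    """Put the archive on sys.path and import module_name from it."""
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    zip_path = os.path.join(tmp_dir, "bundle.zip")
    # Hypothetical module created only for this demonstration.
    build_module_zip(zip_path, {"zipdemo_mod": "ANSWER = 42\n"})
    mod = import_from_zip(zip_path, "zipdemo_mod")
    print(mod.ANSWER, mod.__file__)
```

Compiled extension modules (.so, .pyd, DLLs) cannot be loaded from a zip archive this way, so this only applies to the pure-Python parts of a package.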

    You could also consider splitting the application up with multiprocessing -- again using Queues (but of the multiprocessing kind) to communicate -- so that both imports and some heavy computations are delegated to a few auxiliary processes and thus made asynchronous (this may also help fully exploit multiple cores at once). I suspect this may unfortunately be hard to arrange properly for visualization tasks (such as those you're presumably doing with mayavi) but it might help if you also have some "pure heavy computation" packages and tasks.

  • 2021-01-13 22:15

    Why not just do this when the app starts?

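    import threading

    def background_imports():
        # Placeholder names from the question; substitute the real
        # importable module names of the slow packages.
        import Traits
        import Mayavi

    # Daemon thread, so it never keeps the application alive on exit.
    thread = threading.Thread(target=background_imports, daemon=True)
    thread.start()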
    
  • 2021-01-13 22:21

    "the user works with the GUI and starts a new physics simulation"

    Not really clear. Does "works with the GUI" mean double click? Double click what? Some wxWidgets GUI application? Or IDLE?

    If so, what does "starts a new physics simulation" mean? Click a button somewhere else? A GUI button to bring up a panel where they write code? Or do they import a script they wrote off line?

    Why is the import happening before the simulation starts? How long does a simulation take? What does the GUI show?

    I suspect that there's a way to be much, much lazier in doing the big imports. But from the description, it's hard to determine if there's a point in time where the import doesn't matter as much to the user.
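One way to be lazier, sketched under the assumption that the heavy packages are only needed once a simulation actually starts: a small proxy that defers the real import until the first attribute access. The class name `LazyModule` is hypothetical, and `json` stands in for a heavy package.

```python
import importlib


class LazyModule:
    """Proxy that imports the real module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Called only when normal lookup fails, i.e. for attributes
        # that belong to the wrapped module rather than to the proxy.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


if __name__ == "__main__":
    lazy_json = LazyModule("json")  # nothing is imported yet
    print(lazy_json.dumps({"ready": True}))  # import happens here
```

This does not make the import any faster; it only moves the cost to the first real use, which helps only if that moment is one where the user is already expecting to wait.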

    Threads don't help much. What helps is rethinking the UI experience.
