What happens when you import a package?

前端 未结 2 1627
南旧
南旧 2020-12-31 02:55

For efficiency\'s sake I am trying to figure out how python works with its heap of objects (and system of namespaces, but it is more or less clear). So, basically, I am tryi

相关标签:
2条回答
  • 2020-12-31 03:17

    You have a few different questions here. . .

    About importing packages

    When you import a package, the sequence of steps is the same as when you import a module. The only difference is that the packages's code (i.e., the code that creates the "module code object") is the code of the package's __init__.py.

    So yes, the sub-modules of the package are not loaded unless the __init__.py does so explicitly. If you do from package import module, only module is loaded, unless of course it imports other modules from the package.

    sys.modules names of modules loaded from packages

    When you import a module from a package, the name is that is added to sys.modules is the "qualified name" that specifies the module name together with the dot-separated names of any packages you imported it from. So if you do from package.subpackage import mod, what is added to sys.modules is "package.subpackage.mod".

    Importing only part of a module

    It is usually not a big concern to have to import the whole module instead of just one function. You say it is "painful" but in practice it almost never is.

    If, as you say, the functions have no external dependencies, then they are just pure Python and loading them will not take much time. Usually, if importing a module takes a long time, it's because it loads other modules, which means it does have external dependencies and you have to load the whole thing.

    If your module has expensive operations that happen on module import (i.e., they are global module-level code and not inside a function), but aren't essential for use of all functions in the module, then you could, if you like, redesign your module to defer that loading until later. That is, if your module does something like:

    def simpleFunction():
        pass
    
    # open files, read huge amounts of data, do slow stuff here
    

    you can change it to

    def simpleFunction():
        pass
    
    def loadData():
        # open files, read huge amounts of data, do slow stuff here
    

    and then tell people "call someModule.loadData() when you want to load the data". Or, as you suggested, you could put the expensive parts of the module into their own separate module within a package.

    I've never found it to be the case that importing a module caused a meaningful performance impact unless the module was already large enough that it could reasonably be broken down into smaller modules. Making tons of tiny modules that each contain one function is unlikely to gain you anything except maintenance headaches from having to keep track of all those files. Do you actually have a specific situation where this makes a difference for you?

    Also, regarding your last point, as far as I'm aware, the same all-or-nothing load strategy applies to C extension modules as for pure Python modules. Obviously, just like with Python modules, you could split things up into smaller extension modules, but you can't do from someExtensionModule import someFunction without also running the rest of the code that was packaged as part of that extension module.

    0 讨论(0)
  • 2020-12-31 03:18

    The approximate sequence of steps that occurs when a module is imported is as follows:

    1. Python tries to locate the module in sys.modules and does nothing else if it is found. Packages are keyed by their full name, so while pymodule is missing from sys.modules, pypacket.pymodule will be there (and can be obtained as sys.modules["pypacket.pymodule"].

    2. Python locates the file that implements the module. If the module is part of the package, as determined by the x.y syntax, it will look for directories named x that contain both an __init__.py and y.py (or further subpackages). The bottom-most file located will be either a .py file, a .pyc file, or a .so/.pyd file. If no file that fits the module is found, an ImportError will be raised.

    3. An empty module object is created, and the code in the module is executed with the module's __dict__ as the execution namespace.1

    4. The module object is placed in sys.modules, and injected into the importer's namespace.

    Step 3 is the point at which "objects get loaded into memory": the objects in question are the module object, and the contents of the namespace contained in its __dict__. This dict typically contains top-level functions and classes created as a side effect of executing all the def, class, and other top-level statements normally contained in each module.

    Note that the above only desribes the default implementation of import. There is a number of ways one can customize import behavior, for example by overriding the __import__ built-in or by implementing import hooks.


    1 If the module file is a .py source file, it will be compiled into memory first, and the code objects resulting from the compilation will be executed. If it is a .pyc, the code objects will be obtained by deserializing the file contents. If the module is a .so or a .pyd shared library, it will be loaded using the operating system's shared-library loading facility, and the init<module> C function will be invoked to initialize the module.

    0 讨论(0)
提交回复
热议问题