How to import only a submodule without executing __init__.py in the package

天大地大妈咪最大 posted on 2021-01-21 03:46:31

Question


When importing a submodule from a package, Python executes the __init__.py file in the package folder first. How can I disable this? Sometimes I need only one function from a package, and importing the whole package is rather heavy.

For example, the pandas.io.clipboard module doesn't depend on any other functions in pandas.

from pandas.io.clipboard import clipboard_get imports the function, but it also imports all the common pandas modules. Is there a way to import just the clipboard module, as if it were a module in my own application folder?


Answer 1:


No, there isn't, by design. If you want to keep the overhead of importing submodules low, define your packages with empty __init__.py files. That way the overhead of importing the package is practically zero.

If pandas does not do that, you have no way to import pandas.io.clipboard without importing pandas and pandas.io first. What you can do, although it is a huge hack and it is not equivalent, is import the clipboard module as a normal top-level module instead of as a submodule. You have to find the location where pandas is installed (e.g. /usr/lib/pythonX.Y/dist-packages/) and insert the path of the module's parent package into sys.path (/usr/lib/pythonX.Y/dist-packages/pandas/io in your case). Then you can import the clipboard module with:

import clipboard

Note, however, that:

import clipboard
from pandas.io import clipboard as clipboard2
print(clipboard == clipboard2)

prints False, because the two names refer to two separate module objects. Doing this can break a lot of code, since you are violating invariants that the import mechanism relies on.

In particular, if the submodule references other submodules through relative imports, the import will fail, and there are other situations where it will not behave correctly. Another example where this fails is pickling. Objects pickled while the module was imported as pandas.io.clipboard cannot be unpickled through the module imported as plain clipboard as shown above.
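The identity problem described above can be reproduced without pandas. The following is a minimal sketch, assuming a throwaway package created in a temporary directory; the names demo_hack_pkg, clipboard_demo and Token are hypothetical and exist only for this illustration.

```python
import importlib
import pathlib
import sys
import tempfile

# Build a throwaway package: demo_hack_pkg/io/clipboard_demo.py (hypothetical names).
root = pathlib.Path(tempfile.mkdtemp())
pkg_io = root / "demo_hack_pkg" / "io"
pkg_io.mkdir(parents=True)
(root / "demo_hack_pkg" / "__init__.py").write_text("")
(pkg_io / "__init__.py").write_text("")
(pkg_io / "clipboard_demo.py").write_text("class Token:\n    pass\n")

# Make both the package root and the inner directory importable (the hack).
sys.path.insert(0, str(root))
sys.path.insert(0, str(pkg_io))
importlib.invalidate_caches()

as_submodule = importlib.import_module("demo_hack_pkg.io.clipboard_demo")
as_top_level = importlib.import_module("clipboard_demo")

# The same file was executed twice and produced two unrelated module objects.
same_module = as_submodule is as_top_level
same_class = as_submodule.Token is as_top_level.Token
print(same_module, same_class)         # False False

# pickle stores classes by module name, so these two classes are not interchangeable.
print(as_submodule.Token.__module__)   # demo_hack_pkg.io.clipboard_demo
print(as_top_level.Token.__module__)   # clipboard_demo
```

The differing __module__ values are what make pickles written through one import path unreadable through the other.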

In summary: don't! I suggest that you either:

  • Live with it, if the time taken to import the package is not a real issue.
  • Or look for a replacement. If you need only pandas.io.clipboard and nothing else from pandas, maybe you shouldn't use pandas in the first place and should instead use a smaller package that implements only the clipboard functionality.

If you look at the source code of pandas.util.clipboard, you will find that it is actually just the pyperclip module, version 1.7. You can add that module to your site-packages and use it instead of the one provided by pandas. In fact, the pandas team only added the following piece at the end of the source code:

## pandas aliases
clipboard_get = paste
clipboard_set = copy

Expanding a bit on why Python's import works this way:

As you know, modules in Python are objects. Packages are modules too, although not every module is a package. When you import a submodule of a package, as in:

import pandas.io.clipboard

Python has to:

  1. Create the module instance pandas.
  2. Create the module instance io and add it as an attribute to pandas.
  3. Create the module instance clipboard and add it as an attribute to io.

In order to create a module instance, Python must execute the code in the module. For a package, that code is its __init__.py.
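The three steps above can be observed directly. This is a small sketch, assuming Python 3.7+ (where dicts keep insertion order) and a hypothetical package demo_order_pkg that is generated in a temporary directory purely for the demonstration.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical layout created on the fly: demo_order_pkg/io/clipboard.py
root = pathlib.Path(tempfile.mkdtemp())
inner = root / "demo_order_pkg" / "io"
inner.mkdir(parents=True)
(root / "demo_order_pkg" / "__init__.py").write_text("INIT_RAN = True\n")
(inner / "__init__.py").write_text("INIT_RAN = True\n")
(inner / "clipboard.py").write_text("def clipboard_get():\n    return 'text'\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

leaf = importlib.import_module("demo_order_pkg.io.clipboard")

# sys.modules is a dict, so it remembers the order in which modules were created.
created = [name for name in sys.modules if name.startswith("demo_order_pkg")]
print(created)
# ['demo_order_pkg', 'demo_order_pkg.io', 'demo_order_pkg.io.clipboard']

top = sys.modules["demo_order_pkg"]
print(top.INIT_RAN, top.io.INIT_RAN)  # True True: both __init__.py files were executed
print(top.io.clipboard is leaf)       # True: each child is an attribute of its parent
```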

Imports of the form:

from pandas.io import clipboard

are roughly equivalent to:

import pandas.io.clipboard
clipboard = pandas.io.clipboard
del pandas  # the from form does not leave the name pandas bound

Note that in the from case, clipboard could be either a module/package or simply something defined inside io. To find out which, the interpreter must import io, and to do that it must first import pandas.
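Both cases can be shown side by side. The sketch below assumes a hypothetical package demo_from_pkg, generated in a temporary directory, whose io/__init__.py defines a plain attribute called helper next to a real submodule called clipboard.

```python
import importlib
import pathlib
import sys
import tempfile
import types

# Hypothetical layout: demo_from_pkg/io/__init__.py and demo_from_pkg/io/clipboard.py
root = pathlib.Path(tempfile.mkdtemp())
inner = root / "demo_from_pkg" / "io"
inner.mkdir(parents=True)
(root / "demo_from_pkg" / "__init__.py").write_text("")
(inner / "__init__.py").write_text("helper = 42\n")
(inner / "clipboard.py").write_text("")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_from_pkg.io import helper     # a plain attribute defined in io/__init__.py
from demo_from_pkg.io import clipboard  # a submodule, imported on demand

print(type(helper).__name__)                    # int
print(isinstance(clipboard, types.ModuleType))  # True

# Either way, both parent packages had to be imported first.
parents_loaded = all(name in sys.modules for name in ("demo_from_pkg", "demo_from_pkg.io"))
print(parents_loaded)                           # True
```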




Answer 2:


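I found a method that uses sys.meta_path to hook the import process. It relies on the imp module, which has been deprecated since Python 3.4 and was removed in Python 3.12, so it only works on older versions:

import imp, sys

class DummyLoader(object):
    """Finder and loader for names such as pandas__util__clipboard."""

    def load_module(self, name):
        # Walk down the directories without importing the parent packages.
        path = None
        for part in name.split("__"):
            f, pathname, info = imp.find_module(part, path)
            path = [pathname]
        try:
            # Register the module under the requested name so that the
            # import system can find it in sys.modules afterwards.
            return imp.load_module(name, f, pathname, info)
        finally:
            if f is not None:
                f.close()

    def find_module(self, name, path=None):
        # Claim only names that contain "__" but are not dunder names.
        if "__" in name and not name.startswith("__"):
            return DummyLoader()
        else:
            return None

# Put the hook first without overwriting a finder that is already installed.
sys.meta_path.insert(0, DummyLoader())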

Use "__" instead of "." to load only the file:

import pandas__util__clipboard as clip

Or use a function to load the file:

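import imp

def load_module(name):
    # Walk down the directories without importing the parent packages.
    path = None
    for part in name.split("."):
        f, pathname, info = imp.find_module(part, path)
        path = [pathname]
    try:
        # Only the last file is loaded; it is registered under its bare name.
        return imp.load_module(part, f, pathname, info)
    finally:
        if f is not None:
            f.close()

clip = load_module("pandas.util.clipboard")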
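On Python 3.12 and later the imp module no longer exists. The same idea (locate each package directory on disk, but execute only the final file) might be sketched with importlib as follows. This is an illustration under assumptions, not part of the original answer: load_file_behind, its alias parameter and the demo_heavy_pkg package are hypothetical names, and the target file must not rely on relative imports.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile
from importlib.machinery import PathFinder


def load_file_behind(dotted_name, alias):
    """Load the file for `dotted_name` as a standalone module called `alias`.

    Parent packages are only located on disk; their __init__.py never runs.
    """
    search_path = None  # None makes PathFinder search sys.path
    spec = None
    for index, part in enumerate(dotted_name.split(".")):
        if index > 0 and search_path is None:
            raise ImportError(f"{spec.name!r} is not a package")
        spec = PathFinder.find_spec(part, search_path)
        if spec is None:
            raise ModuleNotFoundError(f"cannot locate {part!r}")
        # For packages this lists the directories that hold their children.
        search_path = spec.submodule_search_locations
    if spec.origin is None:
        raise ImportError(f"{dotted_name!r} has no source file")
    file_spec = importlib.util.spec_from_file_location(alias, spec.origin)
    module = importlib.util.module_from_spec(file_spec)
    sys.modules[alias] = module
    file_spec.loader.exec_module(module)
    return module


# Demo package whose __init__.py files would fail loudly if they were executed.
root = pathlib.Path(tempfile.mkdtemp())
util_dir = root / "demo_heavy_pkg" / "util"
util_dir.mkdir(parents=True)
(root / "demo_heavy_pkg" / "__init__.py").write_text("raise RuntimeError('init ran')\n")
(util_dir / "__init__.py").write_text("raise RuntimeError('init ran')\n")
(util_dir / "clipboard.py").write_text("def paste():\n    return 'pasted'\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

clip = load_file_behind("demo_heavy_pkg.util.clipboard", "demo_clip_standalone")
print(clip.paste())                     # pasted
print("demo_heavy_pkg" in sys.modules)  # False
```

Registering the module under a separate alias keeps it from being mistaken for the real submodule, at the cost of the identity and pickling issues that apply to every standalone load.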



Answer 3:


With Python 3.5+, you can import a source file directly from its path without executing the __init__.py in the directory that contains the file. Adapted from the official importlib example:

import importlib.util
import pathlib
import sys
import types


def import_module_from_path(path) -> types.ModuleType:
  """Import a module from the given path."""
  module_path = pathlib.Path(path).resolve()
  module_name = module_path.stem  # 'path/x.py' -> 'x'
  spec = importlib.util.spec_from_file_location(module_name, module_path)
  module = importlib.util.module_from_spec(spec)
  sys.modules[module_name] = module
  spec.loader.exec_module(module)
  return module


# Import `my_module` without executing `/path/to/__init__.py`
my_module = import_module_from_path('/path/to/my_module.py')

A few caveats:

  • sys.modules[module_name] = module overwrites any module already registered under the same name. Any later import module_name statement returns this module.
  • The file is executed again on every call, even if it has already been imported.
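One possible way to soften both caveats is to register the module under a prefixed name and to reuse an entry that is already loaded. The sketch below is only an illustration: import_path_once and the "_standalone_" prefix are hypothetical choices, and two different files with the same stem would still replace each other.

```python
import importlib.util
import pathlib
import sys
import tempfile


def import_path_once(path, prefix="_standalone_"):
    """Import a file by path once, under a prefixed name that avoids clashes."""
    module_path = pathlib.Path(path).resolve()
    module_name = prefix + module_path.stem
    cached = sys.modules.get(module_name)
    if cached is not None and getattr(cached, "__file__", None) == str(module_path):
        return cached  # same file already loaded: do not execute it again
    spec = importlib.util.spec_from_file_location(module_name, str(module_path))
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialised module behind.
        sys.modules.pop(module_name, None)
        raise
    return module


# Demo with a throwaway file.
demo_file = pathlib.Path(tempfile.mkdtemp()) / "demo_once_mod.py"
demo_file.write_text("VALUE = 1\n")

first = import_path_once(demo_file)
second = import_path_once(demo_file)
print(first is second)                 # True: the file was executed only once
print("demo_once_mod" in sys.modules)  # False: the plain name stays free
```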



Answer 4:


I tried these methods but couldn't get them to work, so apparently it is not supposed to work, by design.

If you have to do this, create a new branch in the repo you are trying to import from (or initialise a repo there first):

git checkout -b without_init

...then delete __init__.py on that branch.

From wherever you are importing, you can check that the repo in the current working directory is on the correct branch like this:

import subprocess

# Ask git for the name of the branch that is currently checked out.
branch = subprocess.check_output(["git", "rev-parse", "--abbrev-ref", "HEAD"]).strip().decode()
print("Current branch is:", branch)

>> Current branch is: without_init


Source: https://stackoverflow.com/questions/21298833/how-to-only-import-sub-module-without-exec-init-py-in-the-package
