Question
When I import a sub-module from a package, the __init__.py file in the package folder is executed first. How can I disable this? Sometimes I only need one function from a package, and importing the whole package is rather heavy.
For example, the pandas.io.clipboard module doesn't depend on any other functions in pandas.
from pandas.io.clipboard import clipboard_get
imports the function, but it also imports all of the common pandas modules. Is there a way to import only the clipboard module, as if it were a module in my own application folder?
Answer 1:
No, there isn't, by design. If you want to avoid overhead when importing sub-modules, use empty __init__.py files to define the packages. That way the overhead of importing the package is practically zero.
If pandas does not do that, you have no way to import pandas.io.clipboard without importing pandas and io first. What you can do, although it is a huge hack and not equivalent, is import the clipboard module as a normal module instead of as a sub-module. Find the location where pandas is installed (e.g. /usr/lib/pythonX.Y/dist-packages/) and insert the path of the parent package into sys.path (/usr/lib/pythonX.Y/dist-packages/pandas/io in your case). Then you can import the clipboard module with:
import clipboard
Note however that:
import clipboard
from pandas.io import clipboard as clipboard2
print(clipboard == clipboard2)
will print False. In fact, doing this can break a lot of code, since you are fundamentally breaking some invariants that the import mechanism assumes.
In particular, if the sub-module references other sub-modules using relative imports, the import will fail, and there are other situations where it will not behave correctly. Another example where this fails is pickled objects: if you have objects pickled using the module imported as pandas.io.clipboard, you will not be able to unpickle them using the module clipboard imported as above.
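The duplication described above can be reproduced without pandas. The following is a minimal sketch that builds a throwaway package in a temporary directory; the names demo_dup_pkg, sub, demo_dup_clip and Payload are hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway layout: demo_dup_pkg/sub/demo_dup_clip.py (hypothetical names).
root = tempfile.mkdtemp()
sub_dir = os.path.join(root, "demo_dup_pkg", "sub")
os.makedirs(sub_dir)
for directory in (os.path.dirname(sub_dir), sub_dir):
    open(os.path.join(directory, "__init__.py"), "w").close()
with open(os.path.join(sub_dir, "demo_dup_clip.py"), "w") as fh:
    fh.write("class Payload:\n    pass\n")

# Make both the package root and the sub-package directory importable.
sys.path[:0] = [root, sub_dir]
importlib.invalidate_caches()

import demo_dup_clip                                     # loaded as a top-level module
from demo_dup_pkg.sub import demo_dup_clip as packaged   # loaded as a sub-module

# The same file has been executed twice, producing two unrelated module objects.
same_module = demo_dup_clip is packaged
same_class = demo_dup_clip.Payload is packaged.Payload
print(same_module, same_class)  # False False
print(demo_dup_clip.Payload.__module__)  # demo_dup_clip
print(packaged.Payload.__module__)       # demo_dup_pkg.sub.demo_dup_clip
```

pickle records a class by its module name and qualified name, so the two Payload classes are different types as far as pickling and isinstance checks are concerned.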
In summary, don't! I suggest that you either:
- Live with it, if the time taken to import the package is not a real issue.
- Or try to find a replacement. If you need only pandas.io.clipboard but not the rest of pandas, maybe you shouldn't use pandas in the first place, and you should use a smaller package that implements only the functionality of clipboard.
If you look at the pandas.util.clipboard source code, you will find that it is actually just the pyperclip module, version 1.7. You can add this module to your site-packages and use it instead of the one provided by pandas. In fact, the pandas team only added the following piece at the end of the source code:
## pandas aliases
clipboard_get = paste
clipboard_set = copy
Expanding a bit on why Python's import works this way.
As you know, in Python modules are objects. Packages are also modules, although not every module is a package. When you import a package, as in:
import pandas.io.clipboard
Python has to:
- Create the module instance pandas
- Create the module instance io and add it as an attribute of pandas
- Create the module instance clipboard and add it as an attribute of io.
In order to create a module instance, Python must execute the code in the module.
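The order of execution described above can be observed with a small experiment. This is only a sketch: it writes a temporary package whose files each record their own name when executed, and the names demo_exec_pkg, sub, leaf and demo_exec_log are hypothetical.

```python
import importlib
import os
import sys
import tempfile

def write(path, text):
    with open(path, "w") as fh:
        fh.write(text)

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_exec_pkg")
sub_dir = os.path.join(pkg_dir, "sub")
os.makedirs(sub_dir)

# A tiny helper module that collects events, plus three files that
# append their own __name__ to it at the moment they are executed.
write(os.path.join(root, "demo_exec_log.py"), "events = []\n")
record = "import demo_exec_log\ndemo_exec_log.events.append(__name__)\n"
write(os.path.join(pkg_dir, "__init__.py"), record)
write(os.path.join(sub_dir, "__init__.py"), record)
write(os.path.join(sub_dir, "leaf.py"), record)

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_exec_pkg.sub.leaf
import demo_exec_log

print(demo_exec_log.events)
# ['demo_exec_pkg', 'demo_exec_pkg.sub', 'demo_exec_pkg.sub.leaf']
```

Both __init__.py files run before leaf.py, even though only the leaf module was requested.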
The imports of the form:
from pandas.io import clipboard
are roughly equivalent to:
import pandas.io.clipboard
clipboard = pandas.io.clipboard
del pandas  # the from form does not leave the name pandas bound
Note that in the from case, clipboard could be either a module/package or simply something defined inside io. To check for this, the interpreter must also import io, and to do this it must also import pandas.
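The effect of the from form can be checked with a standard-library package instead of pandas. In this small sketch, json and json.decoder are used only because they are always available; the statement is run in a fresh namespace so that the names it binds are easy to inspect.

```python
import sys

# Run the from-import in an empty namespace and see what it binds.
namespace = {}
exec("from json import decoder", namespace)

bound = sorted(name for name in namespace if not name.startswith("__"))
print(bound)                  # ['decoder']
print("json" in sys.modules)  # True: the parent package was imported anyway
print(namespace["decoder"] is sys.modules["json.decoder"])  # True
```

Only decoder is bound in the namespace, yet the parent package json has been imported and cached in sys.modules.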
Answer 2:
I found a method that uses sys.meta_path to hook the import process:
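# Note: the imp module is deprecated since Python 3.4 and was removed in Python 3.12.
import imp
import sys

class DummyLoader(object):
    def load_module(self, name):
        # "pandas__util__clipboard" -> ["pandas", "util", "clipboard"]
        names = name.split("__")
        path = None
        for name in names:
            # Locate each component inside the previous one without importing it.
            f, path, info = imp.find_module(name, path)
            path = [path]
        return imp.load_module(name, f, path[0], info)

    def find_module(self, name, path=None):
        if "__" in name and not name.startswith("__"):
            return DummyLoader()
        else:
            return None

# Put the hook first without discarding any finders that are already registered.
sys.meta_path.insert(0, DummyLoader())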
Use "__" instead of "." to load only the file:
import pandas__util__clipboard as clip
or use a function to load the file:
import imp

def load_module(name):
    names = name.split(".")
    path = None
    for name in names:
        f, path, info = imp.find_module(name, path)
        path = [path]
    return imp.load_module(name, f, path[0], info)

clip = load_module("pandas.util.clipboard")
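Since imp was removed in Python 3.12, the same idea may need to be expressed with importlib on current interpreters. The following is a rough sketch under that assumption, not a drop-in replacement: DunderPathFinder is a hypothetical name, it only handles plain .py files found under sys.path entries, and relative imports inside the loaded file will still fail because no parent package is set up.

```python
import importlib.abc
import importlib.util
import os
import sys

class DunderPathFinder(importlib.abc.MetaPathFinder):
    """Resolve names such as 'pkg__sub__mod' to pkg/sub/mod.py on sys.path."""

    def find_spec(self, fullname, path=None, target=None):
        if "__" not in fullname or fullname.startswith("__"):
            return None  # not ours; let the regular finders handle it
        relative = os.path.join(*fullname.split("__")) + ".py"
        for entry in sys.path:
            candidate = os.path.join(entry or os.getcwd(), relative)
            if os.path.isfile(candidate):
                # The module is registered under the "__" name, which contains
                # no dots, so no parent package (and no __init__.py) is imported.
                return importlib.util.spec_from_file_location(fullname, candidate)
        return None

# Put the hook first so it sees "__" names before the default finders.
sys.meta_path.insert(0, DunderPathFinder())
```

With the finder installed, a statement of the form import pkg__sub__mod as mod would load pkg/sub/mod.py directly, without running pkg/__init__.py or pkg/sub/__init__.py.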
Answer 3:
With Python 3.5+, you can import a source file directly from its path without executing the __init__.py in the directory containing the file. Inspired by the official importlib example:
import importlib.util  # importing importlib alone does not guarantee importlib.util is loaded
import pathlib
import sys
import types

def import_module_from_path(path) -> types.ModuleType:
    """Import a module from the given path."""
    module_path = pathlib.Path(path).resolve()
    module_name = module_path.stem  # 'path/x.py' -> 'x'
    spec = importlib.util.spec_from_file_location(module_name, module_path)
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports of module_name during execution resolve to it.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module

# Import `my_module` without executing `/path/to/__init__.py`
my_module = import_module_from_path('/path/to/my_module.py')
A few caveats:
- sys.modules[module_name] = module will overwrite any module already registered under the same name. A future import module_name call will return this module.
- The file is executed even if it has already been imported.
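Both caveats can be softened with a small amount of bookkeeping. The variant below is a sketch, not part of the importlib example: import_file_once and its module_name parameter are hypothetical. It returns the cached module when the same file was already loaded, and refuses to replace an unrelated module that happens to use the same name.

```python
import importlib.util
import pathlib
import sys

def import_file_once(path, module_name=None):
    """Load a source file as a module at most once per name."""
    module_path = pathlib.Path(path).resolve()
    name = module_name or module_path.stem

    cached = sys.modules.get(name)
    if cached is not None:
        cached_file = getattr(cached, "__file__", None)
        if cached_file and pathlib.Path(cached_file).resolve() == module_path:
            return cached  # same file: reuse it instead of executing it again
        raise ImportError(
            "refusing to replace existing module %r with %s" % (name, module_path)
        )

    spec = importlib.util.spec_from_file_location(name, module_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialised module behind.
        sys.modules.pop(name, None)
        raise
    return module
```

Passing an explicit module_name that is unlikely to clash (for example a prefixed name) is one way to keep the loaded file from shadowing a later regular import.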
Answer 4:
I tried these methods but couldn't get them to work, so apparently, by design, it is not supposed to work.
If you have to do this, create a new branch in the repo you are trying to import from (or initialise a repo there):
git checkout -b without_init
...then delete __init__.py!
From wherever you are importing, you can check from Python that the repo is on the correct branch like this:
import subprocess
# Pass the command as an argument list; no shell is needed.
branch = subprocess.check_output(["git", "rev-parse", "--abbrev-ref", "HEAD"]).strip().decode()
print("Current branch is:", branch)
# prints: Current branch is: without_init
Source: https://stackoverflow.com/questions/21298833/how-to-only-import-sub-module-without-exec-init-py-in-the-package