Consider the following Python project skeleton:
proj/
├── foo
│   └── __init__.py
├── README.md
└── scripts
    └── run.py
In this case f
There are multiple ways to achieve this. Both options below require turning the project into an installable Python package by adding a setup.py (building on @matejcik's answer).
Option 1 (recommended): entry_points + console_scripts: register a function in your project as the entry point for script execution, written as package.module:function (e.g. foo.cli:run).
Option 2: scripts: use this keyword argument of the setup() function to reference the path to your script (e.g. bin/script.py).
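As a rough illustration of the two options, a setup.py placed in proj/ might look like the sketch below. The distribution name, version, the proj-run command name, and the foo.cli:run target are all hypothetical placeholders, not something the project above already defines.

```python
# setup.py, placed in proj/ next to foo/ and scripts/
from setuptools import setup, find_packages

setup(
    name="proj",          # assumed distribution name
    version="0.1.0",      # placeholder version
    packages=find_packages(),
    # Option 1: generate a `proj-run` command that calls run() in foo/cli.py.
    # Both the command name and foo.cli:run are hypothetical.
    entry_points={
        "console_scripts": [
            "proj-run = foo.cli:run",
        ],
    },
    # Option 2 (alternative): install an existing script file as-is.
    # scripts=["scripts/run.py"],
)
```

After running pip install -e . inside proj/, the proj-run command should be available in the active environment.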
I recommend using a CLI library/framework like Click so that your codebase only has to maintain application-specific business logic rather than the plumbing of a robust CLI. Also, Click recommends the entry_points + console_scripts method of script integration because of its cross-platform compatibility.
Setup Tools - Automatic script creation: https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-script-creation
Setup Tools - keyword arguments: https://setuptools.readthedocs.io/en/latest/setuptools.html#new-and-changed-setup-keywords
Click GitHub: https://github.com/pallets/click/
Click Setuptools integration: https://click.palletsprojects.com/en/master/setuptools/
You need to add __init__.py files to the scripts and proj folders for them to be considered Python packages and for you to be able to import from them.
Another common way to do this is to place your foo and scripts folders in a proj/src folder, which then has an __init__.py file and thus is a Python package.
Best practice? Put a single entry-point in the root
I know this might sound absurd if you have lots of scripts you want to be able to execute, but it's actually the cleanest option, and it's the one most often used in big Python projects: manage.py in Django, for example. It also doesn't need to be a huge undertaking. Even more importantly, a single entry point is easier to secure than several smaller ones.
proj/
├── run.py
├── foo
│   └── __init__.py
├── README.md
└── scripts
    └── my_script.py
When run.py lives in the root directory, it can be very lightweight: basically just a wrapper that calls the function you need from my_script.py. It ties everything together, so all of your imports just work.
Just keep in mind that your entrypoint is your root. The parent of a root doesn't exist. So put your entrypoint in the root, and then import packages relative to the root, e.g. import foo from within scripts.
But how do I call multiple scripts!?
If you need to be able to call multiple scripts, this is a good argument for... well... arguments! Keep run.py as your single entrypoint/command, and use subcommands to dispatch to the script you care about.
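A minimal sketch of such a run.py, using argparse subcommands from the standard library. The greet and add subcommands and their handler functions are invented for illustration; in a real project each handler would import and call a function from scripts/my_script.py or from foo instead.

```python
# run.py (sketch): one entry point, several subcommands
import argparse


def cmd_greet(args):
    # Hypothetical handler; a real one would delegate to code in scripts/.
    return f"Hello, {args.name}!"


def cmd_add(args):
    # Hypothetical handler that adds two integers.
    return args.a + args.b


def build_parser():
    parser = argparse.ArgumentParser(prog="run.py", description="Single entry point for proj")
    subparsers = parser.add_subparsers(dest="command", required=True)

    greet = subparsers.add_parser("greet", help="print a greeting")
    greet.add_argument("--name", default="world")
    greet.set_defaults(handler=cmd_greet)

    add = subparsers.add_parser("add", help="add two integers")
    add.add_argument("a", type=int)
    add.add_argument("b", type=int)
    add.set_defaults(handler=cmd_add)
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:], as a real CLI would.
    args = build_parser().parse_args(argv)
    return args.handler(args)


# Demonstration with explicit argument lists instead of the real command line.
print(main(["greet", "--name", "proj"]))
print(main(["add", "2", "3"]))
```

On the command line this would correspond to python run.py greet --name proj and python run.py add 2 3, with the file ending in a call such as print(main()) instead of the two demonstration lines.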
Reinventing the wheel?
Generally, frameworks have already done the architecture for you to add your own subcommands, such as Django and, for a smaller footprint, Flask.
You can easily wrap up a small project without that help, though, as I've illustrated.
Security
No one ever wishes their code was less refactorable after a few years of working with it. No one ever wishes their codebase had less security. As we move toward more secure systems in general, it makes sense to create a gatekeeper script that determines what is and isn't a safe operation, and by whom. Moving the code to an LDAP-based system and need to lock things down by group? No problem. You can either change the single file or add LDAP security in your codebase, even creating your own internal API.
With distributed scripts, security options are much less flexible and much harder to maintain, and a single vulnerability could leave you wide open to exploit.
Bonus advantage
You're adding abstraction to your script base. If you ever want to change the structure of your codebase (maybe you want scripts to have subfolders with more organization), you and your users don't need to refactor any dependencies or change paths to longer, more verbose names. Your package is self-contained, and the only thing a user will ever need to touch is your proj/run.py entry-point.
And, obviously, you don't need to play with Python paths as much!
Python looks for packages/modules in the directories listed in sys.path. There are several ways of ensuring that your directory of interest, in this case proj, is one of those directories:
1. Move your scripts to the proj directory. Python adds the directory containing the input script to sys.path.
2. Put the path to proj into the contents of the PYTHONPATH environment variable.
3. Modify your scripts to append the path to proj to sys.path.
Option 1 is the most logical and requires no source changes. If you are afraid that might break something, you can perhaps make scripts a symbolic link pointing back to proj?
If you are unwilling to do that, then ...
You may consider it a hack, but I would recommend that you do modify your scripts to update sys.path at runtime. Append an absolute path, though, so that the scripts can be executed regardless of what the current directory is. In your case, directory proj is the parent directory of directory scripts, where the scripts reside, so:
import os.path
import sys

# proj is the parent of the scripts directory that contains this file
parent_directory = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
if parent_directory not in sys.path:
    # sys.path.insert(0, parent_directory)  # the first entry is the directory of the running script, so maybe insert after that at index 1
    sys.path.append(parent_directory)
If you like simplicity, and there are no additional restrictions on what you asked, add an __init__.py to the scripts folder and to any sibling folders, making them packages. Then always use the absolute import form (you said you do not want proj to be a parent package of those, so there is no __init__.py there), and call your scripts from inside the proj folder with:
python -m scripts.run
or whatever name you gave to scripts other than run.py.
This is similar to option 2 of @matejcik's answer, but even simpler.
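To see why running with -m from inside proj makes the absolute import work, the sketch below builds a throwaway copy of the layout in a temporary directory and runs the script as a module. The greet() function in foo/__init__.py is a hypothetical stand-in for whatever foo really exports.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def demo_run_as_module() -> str:
    """Create proj/foo and proj/scripts, then run `python -m scripts.run` from proj."""
    with tempfile.TemporaryDirectory() as tmp:
        proj = Path(tmp) / "proj"
        (proj / "foo").mkdir(parents=True)
        (proj / "scripts").mkdir()
        # Hypothetical package content, only for this demonstration.
        (proj / "foo" / "__init__.py").write_text(
            "def greet():\n    return 'hello from foo'\n"
        )
        (proj / "scripts" / "__init__.py").write_text("")
        # The script uses an absolute import of the sibling package.
        (proj / "scripts" / "run.py").write_text(
            "from foo import greet\nprint(greet())\n"
        )
        # With -m, the current directory (proj) is put on sys.path,
        # so `import foo` resolves without touching sys.path by hand.
        result = subprocess.run(
            [sys.executable, "-m", "scripts.run"],
            cwd=proj,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


print(demo_run_as_module())
```

Running python scripts/run.py from the same directory would instead put proj/scripts on sys.path, and the import of foo would fail.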
Another solution is to add a .pth file to your Python site-packages directory with the following content:
# your.pth
# ↓ put the directory of proj here
C:\...\proj
Done. Your script can then import foo directly:
# scripts.py
from foo import Foo
Foo().run()
It will work well.
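If you are unsure which directory Python scans for .pth files, the standard site module can list the candidates. This is only a small helper sketch; the function name is made up for illustration.

```python
import site


def candidate_pth_directories():
    """Return directories in which Python looks for .pth files at startup."""
    # Global (or virtual environment) site-packages directories.
    directories = list(site.getsitepackages())
    # Per-user site-packages; only processed when the user site is enabled.
    directories.append(site.getusersitepackages())
    return directories


for directory in candidate_pth_directories():
    print(directory)
```

Placing your.pth in one of the printed directories should make the path it contains appear in sys.path the next time the interpreter starts.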
.. note:: If your IDE is PyCharm, then you can use the Source roots to help you too.