Why would you compile a Python script? You can run it directly from the .py file and it works fine, so is there a performance advantage or something?
I also notic
There's certainly a performance difference when running a compiled script. If you run normal .py scripts, the machine compiles them every time they are run, and this takes time. On modern machines this is hardly noticeable, but as the script grows it may become more of an issue.
The .pyc file is Python that has already been compiled to byte-code. Python automatically uses a .pyc file if it finds an up-to-date one for a module you import.
"An Introduction to Python" says this about compiled Python files:
A program doesn't run any faster when it is read from a ‘.pyc’ or ‘.pyo’ file than when it is read from a ‘.py’ file; the only thing that's faster about ‘.pyc’ or ‘.pyo’ files is the speed with which they are loaded.
The advantage of running a .pyc file is that Python doesn't have to incur the overhead of compiling it before running it. Since Python would compile to byte-code before running a .py file anyway, there shouldn't be any performance improvement aside from that.
How much improvement can you get from using compiled .pyc files? That depends on what the script does. For a very brief script that simply prints "Hello World," compiling could constitute a large percentage of the total startup-and-run time. But the cost of compiling a script relative to the total run time diminishes for longer-running scripts.
The script you name on the command-line is never saved to a .pyc file. Only modules loaded by that "main" script are saved in that way.
There is a performance increase in running compiled Python. However, when you run a .py file as an imported module, Python will compile and store it, and as long as the .py file does not change it will always use the compiled version.
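To see the difference between the "main" script and an imported module for yourself, here is a rough sketch. It assumes Python 3, where cached byte-code lands in a `__pycache__` directory next to the source; the file names `main_script.py` and `helper_mod.py` are made up for the example.

```python
import os
import subprocess
import sys
import tempfile


def cached_names_after_run():
    """Run a throwaway script that imports a throwaway module.

    Returns the file names found in __pycache__ afterwards.
    """
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical file names, used only for this illustration.
        with open(os.path.join(workdir, "helper_mod.py"), "w") as f:
            f.write("VALUE = 42\n")
        main_path = os.path.join(workdir, "main_script.py")
        with open(main_path, "w") as f:
            f.write("import helper_mod\nprint(helper_mod.VALUE)\n")

        # Drop settings that would disable or relocate the byte-code cache.
        env = dict(os.environ)
        env.pop("PYTHONDONTWRITEBYTECODE", None)
        env.pop("PYTHONPYCACHEPREFIX", None)

        subprocess.run(
            [sys.executable, main_path],
            cwd=workdir, env=env, check=True, capture_output=True,
        )

        cache_dir = os.path.join(workdir, "__pycache__")
        return sorted(os.listdir(cache_dir)) if os.path.isdir(cache_dir) else []


if __name__ == "__main__":
    print(cached_names_after_run())
```

You should see an entry for `helper_mod` in the list but none for `main_script`.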
With any interpreted language, when a file is used the process looks something like this:
1. File is processed by the interpreter.
2. File is compiled.
3. Compiled code is executed.
Obviously, by using pre-compiled code you can eliminate step 2. This applies to Python, PHP, and others.
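The steps above can be sketched in Python itself with the built-ins `compile` and `exec`. The `marshal` round trip at the end is only an approximation of what a .pyc does (a real .pyc also carries a small header in front of the marshalled code object), so treat this as an illustration rather than the exact import machinery.

```python
import marshal

source = "result = sum(range(10))\n"

# Step 2: compile the source text into a code object (byte-code).
code_obj = compile(source, "<example>", "exec")

# Step 3: execute the compiled code in a fresh namespace.
namespace = {}
exec(code_obj, namespace)

# Pre-compiled path: serialise the code object once, then later load it
# back and execute it without calling compile() again.
blob = marshal.dumps(code_obj)
reloaded_namespace = {}
exec(marshal.loads(blob), reloaded_namespace)

print(namespace["result"], reloaded_namespace["result"])
```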
Here's an interesting blog post explaining the differences: http://julipedia.blogspot.com/2004/07/compiled-vs-interpreted-languages.html
And here's an entry that explains the Python compile process: http://effbot.org/zone/python-compile.htm
We use compiled code to distribute to users who do not have access to the source code. Basically, to stop inexperienced programmers accidentally changing something or fixing bugs without telling us.
Pluses:
First: mild, defeatable obfuscation.
Second: if compilation results in a significantly smaller file, you will get faster load times. Nice for the web.
Third: Python can skip the compilation step. Faster at initial load. Nice for the CPU and the web.
Fourth: the more you comment, the smaller the .pyc or .pyo file will be in comparison to the source .py file.
Fifth: an end user with only a .pyc or .pyo file in hand is much less likely to present you with a bug they caused by an un-reverted change they forgot to tell you about.
Sixth: if you're aiming at an embedded system, obtaining a smaller size file to embed may represent a significant plus, and the target environment is fixed, so drawback one, detailed below, does not come into play.
Top level compilation
It is useful to know that you can compile a top level Python source file into a .pyc file this way:
python -m py_compile myscript.py
This removes comments. It leaves docstrings intact. If you'd like to get rid of the docstrings as well (you might want to seriously think about why you're doing that), then compile this way instead...
python -OO -m py_compile myscript.py
...and you'll get a .pyo file instead of a .pyc file; equally distributable in terms of the code's essential functionality, but smaller by the size of the stripped-out docstrings (and less easily understood for subsequent employment if it had decent docstrings in the first place). (Python 3.5 and later no longer write .pyo files; the optimized output is a .pyc file with an .opt-2 tag in its name, placed in __pycache__.) But see drawback three, below.
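If you'd rather do this from code than from the command line, the standard `py_compile` module exposes the same thing. This is a sketch assuming Python 3, where `py_compile.compile` accepts an `optimize` argument; the sample source and the output names `plain.pyc` and `stripped.pyc` are invented for the comparison.

```python
import os
import py_compile
import tempfile

# Made-up sample source: heavy on comments and docstrings.
SAMPLE = (
    '"""' + "Module docstring. " * 20 + '"""\n'
    + "# explanatory comment line\n" * 200
    + "def f():\n    '''" + "Function docstring. " * 20 + "'''\n    return 1\n"
)


def compile_sizes(source_text):
    """Return (source size, normal .pyc size, docstring-stripped .pyc size)."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "myscript.py")
        with open(src, "w") as f:
            f.write(source_text)

        # optimize=0 corresponds to a plain run, optimize=2 to -OO.
        plain = py_compile.compile(
            src, cfile=os.path.join(workdir, "plain.pyc"),
            doraise=True, optimize=0,
        )
        stripped = py_compile.compile(
            src, cfile=os.path.join(workdir, "stripped.pyc"),
            doraise=True, optimize=2,
        )
        return (
            os.path.getsize(src),
            os.path.getsize(plain),
            os.path.getsize(stripped),
        )


if __name__ == "__main__":
    print(compile_sizes(SAMPLE))
```

For a source like this one, the three sizes should come out in descending order, which matches the fourth plus and the docstring-stripping behaviour described above.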
Note that Python uses the .py file's date, if it is present, to decide whether it should execute the .py file as opposed to the .pyc or .pyo file. So edit your .py file, and the .pyc or .pyo is obsolete and whatever benefits you gained are lost. You need to recompile it in order to get the .pyc or .pyo benefits back again, such as they may be.
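For the curious, the date check can be approximated by hand. This sketch assumes the default timestamp-based .pyc layout of Python 3.7 and later (4-byte magic, 4-byte flags, 4-byte source modification time, 4-byte source size); older versions use a shorter header, so the offsets here would not apply to them. The helper names are mine, not part of any library.

```python
import os
import py_compile
import struct
import tempfile


def pyc_is_up_to_date(source_path, pyc_path):
    """Compare the mtime/size recorded in a .pyc header with the source file."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    flags, recorded_mtime, recorded_size = struct.unpack("<III", header[4:16])
    if flags != 0:
        # Non-zero flags mean a hash-based .pyc, which this sketch ignores.
        raise ValueError("hash-based .pyc; the timestamp check does not apply")
    st = os.stat(source_path)
    return (
        recorded_mtime == int(st.st_mtime) & 0xFFFFFFFF
        and recorded_size == st.st_size & 0xFFFFFFFF
    )


def demo():
    """Compile a file, then edit it; return (fresh before edit, fresh after)."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "myscript.py")
        with open(src, "w") as f:
            f.write("x = 1\n")
        pyc = py_compile.compile(
            src, cfile=os.path.join(workdir, "myscript.pyc"), doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        before = pyc_is_up_to_date(src, pyc)
        with open(src, "a") as f:
            f.write("y = 2\n")  # changes the size, so the .pyc is now stale
        after = pyc_is_up_to_date(src, pyc)
        return before, after


if __name__ == "__main__":
    print(demo())
```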
Drawbacks:
First: There's a "magic cookie" in .pyc and .pyo files that identifies the byte-code format of the Python version the file was compiled with. If you distribute one of these files into an environment running a different Python version, it will break. If you distribute the .pyc or .pyo without the associated .py to recompile or touch so it supersedes the .pyc or .pyo, the end user can't fix it, either.
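You can inspect the magic cookie yourself. A small sketch, assuming Python 3: it compares the first four bytes of a compiled file against `importlib.util.MAGIC_NUMBER`, the value the running interpreter expects. The file names and the "foreign" file with an overwritten cookie are fabricated for the demonstration.

```python
import importlib.util
import os
import py_compile
import tempfile


def magic_matches_this_interpreter(pyc_path):
    """True if the file's first four bytes equal this interpreter's magic number."""
    with open(pyc_path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER


def demo():
    """Return (result for a freshly compiled file, result for a tampered copy)."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "myscript.py")
        with open(src, "w") as f:
            f.write("x = 1\n")
        good = py_compile.compile(
            src, cfile=os.path.join(workdir, "good.pyc"), doraise=True
        )

        # Simulate a file from an incompatible interpreter by overwriting
        # the cookie with a value no real interpreter uses.
        with open(good, "rb") as f:
            data = f.read()
        foreign = os.path.join(workdir, "foreign.pyc")
        with open(foreign, "wb") as f:
            f.write(b"\x00\x00\r\n" + data[4:])

        return (
            magic_matches_this_interpreter(good),
            magic_matches_this_interpreter(foreign),
        )


if __name__ == "__main__":
    print(demo())
```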
Second: If docstrings are skipped with the use of the -OO command line option as described above, no one will be able to get at that information, which can make use of the code more difficult (or impossible).
Third: Python's -OO option also implements some optimizations as per the -O command line option; this may result in changes in operation. Known optimizations are: sys.flags.optimize is nonzero (1 with -O, 2 with -OO); assert statements are skipped; __debug__ = False.
Fourth: if you had intentionally made your Python script executable with something on the order of #!/usr/bin/python on the first line, this is stripped out in .pyc and .pyo files and that functionality is lost.
Fifth: somewhat obvious, but if you compile your code, not only can its use be impacted, but the potential for others to learn from your work is reduced, often severely.
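The behaviour changes from the second and third drawbacks are easy to check. Here is a sketch, assuming Python 3, that runs the same tiny probe program under no flag, -O, and -OO, and reports `sys.flags.optimize`, `__debug__`, the function's docstring, and whether its `assert` fired. The probe program itself is invented for the test.

```python
import os
import subprocess
import sys

# Invented probe program: a function with a docstring and a failing assert.
PROBE = (
    "import sys\n"
    "def f():\n"
    "    'docstring'\n"
    "    assert False\n"
    "    return 'assert skipped'\n"
    "try:\n"
    "    outcome = f()\n"
    "except AssertionError:\n"
    "    outcome = 'assert ran'\n"
    "print(sys.flags.optimize, __debug__, f.__doc__, outcome, sep='|')\n"
)


def probe(*flags):
    """Run PROBE in a child interpreter started with the given flags."""
    # Remove PYTHONOPTIMIZE so only the explicit flags decide the level.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONOPTIMIZE"}
    done = subprocess.run(
        [sys.executable, *flags, "-c", PROBE],
        capture_output=True, text=True, check=True, env=env,
    )
    return done.stdout.strip()


if __name__ == "__main__":
    for flags in ((), ("-O",), ("-OO",)):
        print(flags, probe(*flags))
```

With no flag the assert runs and the docstring is present; with -O the assert is skipped; with -OO the docstring is gone as well.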
Yep, performance is the main reason and, as far as I know, the only reason.
If some of your files aren't getting compiled, maybe Python isn't able to write to the .pyc file, perhaps because of the directory permissions or something. Or perhaps the uncompiled files just aren't ever getting loaded (scripts/modules only get compiled when they first get loaded).
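A quick way to narrow this down is sketched here, assuming Python 3: it asks `importlib.util.cache_from_source` where the .pyc would go, then checks whether it already exists, whether byte-code writing is switched off, and whether the target directory looks writable. The function names and the report keys are my own.

```python
import importlib.util
import os
import sys
import tempfile


def bytecode_cache_report(source_path):
    """Collect the usual reasons a .pyc might be missing for a source file."""
    cache_path = importlib.util.cache_from_source(os.path.abspath(source_path))

    # The cache directory may not exist yet, so check the nearest
    # existing ancestor directory for write permission instead.
    probe_dir = os.path.dirname(cache_path)
    while not os.path.isdir(probe_dir):
        parent = os.path.dirname(probe_dir)
        if parent == probe_dir:
            break
        probe_dir = parent

    return {
        "cache_path": cache_path,
        "cache_exists": os.path.exists(cache_path),
        "writing_disabled": sys.dont_write_bytecode,
        "directory_writable": os.access(probe_dir, os.W_OK),
    }


def demo_report():
    """Report on a freshly created source file that was never imported."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "never_imported.py")  # hypothetical name
        with open(src, "w") as f:
            f.write("x = 1\n")
        return bytecode_cache_report(src)


if __name__ == "__main__":
    print(demo_report())
```

A file that exists, is writable, and still has no cache entry most likely just hasn't been imported yet.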