I'm just starting a new Python project, and ideally I'd like to offer Python 2 and 3 support from the start, with minimal developmental overhead. My question is, what is the best way of doing this for brand new projects?
I have come across projects that run 2to3, or even 3to2, as part of their installation script. This seems to be a very common way. However, there seems to be several different ways of doing this. I also came across Distribute.
There is also the option of trying to write polyglot Python 2/Python 3 code. Even though this seems like a horrible idea, I have noticed that I tend to write code lately that is more idiomatic as Python 3 code, even though I still run it as Python 2. I have a feeling this only helps my own transition when the day finally arrives, and doesn't do much for offering or at least helping dual support though.
Most of the projects offering dual support that I have seen added Python 3 support late, so I'm especially curious if there is a better way that is more suited for new projects, where you have the benefit of a clean slate.
Thanks!
You should check out six, a library that provides a unified interface to various things that differ between Python 2 and 3.
In my experience, it depends on the kind of project.
If it is a library or very self contained application, a common choice is develop in Python 2.7 avoiding constructs deprecated in Python 3.x as much as possible and resort to automated tests to identify holes left by py2to3 that you will have to fix manually.
On the other side, for real life applications, be prepared to constantly stumble upon libraries that are not ported to py3k yet (sometimes important ones). Most of the time you will have no choice but port the library to Python 3, so if you can afford that, go for it. Usually I can't, that is why I'm not supporting Python 3 for this kind of project (but I struggle to write code that will be easier to port when opportune).
For unicode handling, I found this PyCon 2012 video very informative. The advice is good for both Python 2.x and 3.x: treat every string coming from outside as bytes and convert to unicode as soon as possible and output strings converting to bytes as late as possible. There is another very informative video about date/time handling.
[update]
This is an old answer. As of today (2019) there is no good rationale to start a project using Python 2.x and there are several compelling reasons to port older projects to Python 3.7+ and abandon support for Python 2.x.
In my experience it is better to not to use a library like six
; I instead have a single compat.py
for each package with just the needed code, not unlike Scott Griffiths's approach. six
has also the burden of trying to support long-gone Python versions: the truth is that life is much easier when you accept that Pythons <=2.6 and <=3.2 are gone. In 2.7 there are backported compatibility features such as .view*
methods on dict
s that work exactly like their non-prefixed versions on Python 3; and Python 3.3 on the other hand supports u
prefix on unicode strings again.
Even for very substantial packages, a compat.py
module, which allows other code to work unchanged, can be quite short: here's an example from the pika
package that my colleagues and I helped to make 2/3 polyglot. Pika is one of those projects that had really messed internals mixing unicode and 8 bit strings with each other, but now we've used it in production on Python 3 for well over 6 months without problems.
Other important thing is to always use the following __future__
s when developing:
from __future__ import absolute_import, division, print_function
I recommend against using unicode_literals
, because there are some strings that need to be of the type called str
on either platform. If you don't use unicode_literals
, you can do the following:
b'123'
is the 8-bit string literal'123'
is of the typestr
on both platformsu'123'
is proper unicode text on both platforms
In any case, please do not do 2to3 on installation/package build time; some packages used to do that in the past - pip install
ing those packages took some seconds on Python 2, but closer to minute on Python 3.
Pick 2 or 3, whichever is your favorite flavor, and make it work really well in that, with unit tests. Then make sure those tests work after running it through py2to3 or py3to2. Better to maintain one version of code.
My personal experience has been that it's easier to write code that works unchanged in both Python 2 and 3, rather than rely on 2to3/3to2 scripts which often can't quite get the translation right.
Maybe my situation is unusual as I'm doing lots with byte types and 2to3 has a hard task converting these, but the convenience of having one code base outweighs the nastiness of having a few hacks in the code.
As a concrete example, my bitstring module was an early convert to Python 3 and the same code is used for Python 2.6/2.7/3.x. The source is over 4000 lines of code and this is this bit I needed to get it to work for the different major versions:
# For Python 2.x/ 3.x coexistence
# Yes this is very very hacky.
try:
xrange
for i in range(256):
BYTE_REVERSAL_DICT[i] = chr(int("{0:08b}".format(i)[::-1], 2))
except NameError:
for i in range(256):
BYTE_REVERSAL_DICT[i] = bytes([int("{0:08b}".format(i)[::-1], 2)])
from io import IOBase as file
xrange = range
basestring = str
OK, that's not pretty, but it means that I can write 99% of the code in good Python 2 style and all the unit tests still pass for the same code in Python 3. This route isn't for everyone, but it is an option to consider.
If you need support for Python 2.5 or earlier, using Distribute and it's 2to3 integration is typically the best way to go. But if you only need to support Python 2.6 or later, I would make the code run under Python 2 and Python 3 without conversion. I would also use the six library to make this easier.
来源:https://stackoverflow.com/questions/11372190/python-2-and-python-3-dual-development