Python 3.0, a.k.a. Python 3000, a.k.a. Python 3K, is out. It aims to fix the language design problems that accumulated over the lifetime of Python 2.x, and to clean up the language and its standard library. But this upgrade is a bitter pill for developers to swallow, because Python 3.0 is not backward compatible with Python 2.x. Python 3.0 even breaks the venerable "hello world!" program.
But there's some sugar to go with that medicine, too. Python 2 developers were not abandoned completely. Python 2.6 was released a few months prior to Python 3.0, and it aims to help in the migration process. For example, it includes a tool called 2to3 that assists in migrating Python 2.x code to Python 3.0. This article covers the major Python 3.0 language features, and future articles will cover:
- Additional changes to the language
- Changes to the standard library
- Python 2.6
- Migration from Python 2.x to Python 3.0
The Python Development Process
Python is developed by an organized community effort led by the BDFL (Benevolent Dictator for Life), Guido van Rossum, who hands down final judgments when the community can't reach a consensus. Proposed Python changes are specified in PEPs (Python Enhancement Proposals), and you can often find out more about a particular language feature or change by reading the appropriate PEP.
Python 3.0: A Bird's Eye View
Python 3.0 touches almost every aspect of the language. There are deep changes to the type system, classes, metaclasses, and abstract base classes. Exception handling has been cleaned up. Numeric data types have been souped up. Strings are now Unicode, and string formatting has been improved. But perhaps the first change you'll notice is that the print statement is now a function.
I've started here because print is probably the first thing beginners encounter when learning Python. A "hello world" program in Python used to look like this:
>>> print 'hello world!'
hello world!
But if you run that using Python 3.0, it looks like this:
>>> print 'hello world!'
  File "<stdin>", line 1
    print 'hello world!'
                       ^
SyntaxError: invalid syntax
The right way to do it in Python 3.0 is to surround the string in parentheses:
>>> print('hello world!')
hello world!
Why did print become a function? According to PEP-3105 (Make print a function), there are five reasons:
PEP-3105 Reason 1: In 2.x, print is the only application-level functionality that's a statement rather than a function. In the Python world, syntax is generally used as a last resort, only when it's impossible to accomplish some task without compiler assistance. And print doesn't qualify for such an exception.
- Author's Response: That's true, but print is especially useful in interactive sessions where the extra two parentheses really get in the way.
PEP-3105 Reason 2: During application development it's quite common to replace print output with something more sophisticated, such as logging calls or calls into some other I/O library. With a print() function, that substitution involves only a straightforward string replacement; with a statement, you have to add the parentheses and possibly convert the >>stream-style syntax.
- Author's Response: I don't agree with this argument. I don't think it's a problem to replace all these print statements. I have had the pleasure (or misfortune) of performing similar (and worse) changes across huge code bases several times. You must consider each case and decide whether you want the original simple behavior or the new fancy behavior. Replacing the print function with something else is a cool trick, but it also violates the "explicit is better than implicit" principle. When I see print or print>> today in a piece of code, I know exactly what to expect (assuming no one messed around with sys.stdout). But if print can be replaced easily, it could be pretty confusing to see no output because everything goes to some obscure log file.
PEP-3105 Reason 3: Having special syntax for print puts up a much larger barrier for evolution. For example, it's not too far-fetched to consider a hypothetical new printf() function, which would coexist with a print() function.
- Author's Response: Changing such a basic feature of the language because of a speculative future change seems very inappropriate and runs against Python's design philosophy.
PEP-3105 Reason 4: There's no easy way to convert print statements into another call if one needs a separator other than spaces, or no separator. Also, there's no easy way to conveniently print objects with some other separator than a space.
- Author's Response: I don't agree with this argument either. If you need special formatting for objects, just format them into a string and print the string. Later, you'll see an example written for both Python 2.x and Python 3.0, and you'll see that both versions look about the same and use clean code.
PEP-3105 Reason 5: If print() is a function, it would be much easier to replace it within just one module (just def print(*args):...) or throughout a program (by putting a different function in __builtin__.print). As it is, one can do this by writing a class with a write() method and assigning that to sys.stdout. That's not a bad solution, but it's definitely a much larger conceptual leap, and it works at a different level than print.
- Author's Response: This argument is quite similar to reason #2, and I actually see it as a counter-argument. The fact that it is cumbersome to override print using sys.stdout makes doing so less prevalent, which is a good thing. In the rare cases when you actually need a print override (and simply using a different function won't do, such as for testing purposes) it is possible to override the original print.
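To make the two override mechanisms from Reason 5 concrete, here is a minimal Python 3 sketch that puts them side by side. The names UpperCaseWriter, print_via_swapped_stdout, and make_prefixed_print are hypothetical and exist only for this illustration; they come from neither the PEP nor the standard library.

```python
import io
import sys


class UpperCaseWriter:
    """Hypothetical file-like wrapper that upper-cases whatever is written."""

    def __init__(self, target):
        self.target = target

    def write(self, text):
        return self.target.write(text.upper())

    def flush(self):
        self.target.flush()


def print_via_swapped_stdout(message, target):
    """The sys.stdout approach: every print call is affected until restore."""
    saved = sys.stdout
    sys.stdout = UpperCaseWriter(target)
    try:
        print(message)
    finally:
        sys.stdout = saved


def make_prefixed_print(prefix, target):
    """The function approach: build a drop-in replacement for print."""
    def prefixed_print(*args, **kwargs):
        kwargs.setdefault('file', target)
        print(prefix, *args, **kwargs)
    return prefixed_print


buffer = io.StringIO()
print_via_swapped_stdout('via sys.stdout', buffer)

# In a real module you could bind the result to the name print instead;
# a distinct name is used here so the built-in stays untouched.
custom_print = make_prefixed_print('[custom]', buffer)
custom_print('via a replacement function')

captured = buffer.getvalue()
```

The first approach intercepts output at the stream level, so it also affects any other code that writes to sys.stdout. The second only affects callers that use the replacement function.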
That's quite a list of arguments, and I've included them because even this one issue should give you a sense of the deep controversies that Python 3.0 has engendered and will continue to engender. To me, overall, the benefits of the print function seem marginal, and useful only to people who want to modify the way print works. It also seems as if there are good enough ways to do that in Python 2.x, by using a separate function or manipulating sys.stdout. The downside is more cumbersome syntax in interactive sessions (the 2to3 tool can take care of the problem during migration).
So although I don't like the print function, it's here to stay. The greatest minds in the Python world have decreed that it should be so. Here's the signature:
def print(*args, sep=' ', end='\n', file=None)
The *args are the arguments to be printed, sep is the string printed between them, and end is the string printed after the last argument. sep and end default to a space and \n, respectively, which is exactly the default behavior of the Python 2.x print statement. The file argument, if specified, sends the output to the provided file-like object (which should have a write() method); if it is omitted, output goes to sys.stdout. This argument provides the functionality of print>>.
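The effect of each keyword argument is easiest to see by capturing the output. The following is a small sketch that sends everything to an in-memory io.StringIO object instead of the console; the variable names buffer and result are chosen only for this illustration.

```python
import io

buffer = io.StringIO()  # any object with a write() method would do

# Defaults: sep=' ' and end='\n'
print('a', 'b', 'c', file=buffer)

# Empty separator: the arguments are written back to back
print('a', 'b', 'c', sep='', file=buffer)

# Custom separator and custom terminator
print('a', 'b', 'c', sep=', ', end='.\n', file=buffer)

# Suppress the trailing newline entirely
print('no newline', end='', file=buffer)

result = buffer.getvalue()
# result == 'a b c\nabc\na, b, c.\nno newline'
```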
Suppose you want to display addition exercises, for example:
2 + 4 + 7 = 13
In Python 2.x printing the exercise might look like this:
>>> numbers = [2, 4, 7]
>>> print ' + '.join([repr(x) for x in numbers]) + ' = %d' % sum(numbers)
2 + 4 + 7 = 13
But in Python 3.0 it would look like this:
>>> numbers = [2, 4, 7]
>>> print(*numbers, sep=' + ', end=' = %d\n' % sum(numbers))
2 + 4 + 7 = 13
Both versions look a little on the cumbersome side. The following Python 3.0 code snippet wraps this logic in a function called print_numbers(). It stores the original print function in a variable called original_print, and the new print_numbers() function relies on original_print to actually print. The snippet then makes print_numbers the current print function, prints a couple of exercises, and finally restores the original print function.
original_print = print

def print_numbers(*numbers):
    sep = ' + '
    end = ' = %d\n' % sum(numbers)
    original_print(*numbers, sep=sep, end=end)

# Make print_numbers the current print function
print = print_numbers
print(1, 2, 3)
print(3, 7)

# Restore the original print function
print = original_print
print(1, 2, 3)
Output:
1 + 2 + 3 = 6
3 + 7 = 10
1 2 3
After all my complaining about the print function, here's something nice: because it's a function, you can assign it to a variable with a shorter name (e.g. p) in your Python startup file. Then, in interactive sessions, the shorter name more than makes up for the parentheses:
>>> p = print
>>> p('Yeah, it works!')
Yeah, it works!
>>> p(5 + 9)
14
If you take this route, then you may want to use the pprint function from the pprint module rather than print. The pprint function can display nested Python data structures in a nice layout. The following code snippet prints a dictionary that contains several lists using both print and pprint so you can see the difference:
>>> r = list(range(10))
>>> d=dict(a=r, b=r, c=r)
>>> print(d)
{'a': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 'c': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 'b': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}
>>> from pprint import pprint as pp
>>> pp(d)
{'a': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 'b': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 'c': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}
>>>
Note that pprint also sorted the dictionary items by key (as of Python 2.5). To have the pp shorthand available in every interactive session, add the following statement to your Python startup file:
from pprint import pprint as pp
I find such shorthand constructs useful in my day-to-day work when exploring complex data structures.