Advanced Python

Carl Meyer

This talk

How

Doc links are on the final slide. No stress; the links will be live in the online slides.

All code is Py3, but I'll note Py2 differences.

Me

Decorators

Python functions are first class:

 1 >>> def say_hi():
 2 ...     print("Hi!")
 3 ...
 4 
 5 >>> def call_twice(func):
 6 ...     func()
 7 ...     func()
 8 ...
 9 
10 >>> call_twice(say_hi)
11 Hi!
12 Hi!

We can pass a function around like any other object.

Pass it as an argument to another function (no parens is a reference; parens means a call).

Then call it by another name, twice (now parens!)

Decorator

A function that takes a function as an argument, and returns a function.

 1 >>> def noisy(func):
 2 ...    def decorated():
 3 ...        print("Before")
 4 ...        func()
 5 ...        print("After")
 6 ...    return decorated
 7 
 8 >>> say_hi_noisy = noisy(say_hi)
 9 
10 >>> say_hi_noisy()
11 Before
12 Hi!
13 After

We pass say_hi in to noisy and get back the function "decorated"; when we call it, we see "Before", then the function we passed in (say_hi) is called, then we see "After".

The function "decorated" is a closure; it "closes over" the value of the variable "func" in its containing scope.
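The closure can be inspected directly. A minimal sketch (reproducing the slide's noisy decorator) showing the closure cell that holds func:

```python
# The slide's decorator, reproduced so we can inspect the closure:
def noisy(func):
    def decorated():
        print("Before")
        func()
        print("After")
    return decorated

def say_hi():
    print("Hi!")

say_hi_noisy = noisy(say_hi)

# "decorated" closed over func; the closure cell still holds say_hi
# even though noisy() has already returned:
print(say_hi_noisy.__closure__[0].cell_contents is say_hi)  # True
```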

Decorator syntax

In place of:

1 def say_hi():
2     print("Hi!")
3 
4 say_hi = noisy(say_hi)

we can write:

1 @noisy
2 def say_hi():
3     print("Hi!")

If we don't need the original (undecorated) function.

Either way:

1 >>> say_hi()
2 Before
3 Hi!
4 After

Let's try another:

>>> @noisy
... def square(x):
...     return x * x
...

>>> square(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: decorated() takes 0 positional arguments but
           1 was given

Oops!

The cause

1 def noisy(func):
2     def decorated():
3         print("Before")
4         func()
5         print("After")
6     return decorated

Our wrapper decorated function takes no arguments, and passes none on to the wrapped function.

So it can only wrap functions that require no arguments.

The fix: *args and **kwargs

To write decorators that can wrap any function signature:

1 def noisy(func):
2     def decorated(*args, **kwargs):
3         print("Before")
4         func(*args, **kwargs)
5         print("After")
6     return decorated

Whether this is the right fix depends on the type of decorator. Some decorators need to look at, or even change, the arguments, so total signature flexibility wouldn't work for them.
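As a sketch of that kind of decorator (the name require_positive is hypothetical, not from the slides), here is a wrapper that inspects the arguments before passing them on:

```python
def require_positive(func):
    # A hypothetical argument-inspecting decorator: it examines the
    # arguments before forwarding them, so it can't be fully
    # signature-agnostic.
    def decorated(*args, **kwargs):
        for value in list(args) + list(kwargs.values()):
            if value <= 0:
                raise ValueError("all arguments must be positive")
        return func(*args, **kwargs)
    return decorated

@require_positive
def area(width, height):
    return width * height

print(area(3, 4))  # 12
```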

A real example

 1 def login_required(view_func):
 2     @wraps(view_func)
 3     def decorated(request, *args, **kwargs):
 4         if not request.user.is_authenticated():
 5             return redirect('/login/')
 6         return view_func(request, *args, **kwargs)
 7     return decorated
 8 
 9 @login_required
10 def edit_profile(request):
11     pass # ...

Simplified from the actual Django implementation. (@wraps here is functools.wraps, which copies the wrapped function's metadata onto the wrapper.)

Cautions
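One caution worth illustrating: a plain wrapper hides the wrapped function's name and docstring, and the noisy wrapper above also discards the return value. functools.wraps (seen in the login_required example) fixes the metadata problem; a sketch:

```python
import functools

def noisy(func):
    @functools.wraps(func)  # copies __name__, __doc__, etc. to the wrapper
    def decorated(*args, **kwargs):
        print("Before")
        result = func(*args, **kwargs)  # pass the return value through, too
        print("After")
        return result
    return decorated

@noisy
def square(x):
    """Return x squared."""
    return x * x

print(square.__name__)  # 'square', not 'decorated'
print(square(3))        # 9 (after "Before" / "After")
```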

Further exploration

Context managers

1 with open('somefile.txt', 'w') as fh:
2     fh.write('contents\n')

Opens the file, then executes the block, then closes the file.

Can replace try/finally

In place of:

fh = open('somefile.txt', 'w')
try:
    fh.write('contents\n')
finally:
    fh.close()

we can write:

with open('somefile.txt', 'w') as fh:
    fh.write('contents\n')

More concise syntax for resource management / cleanup.

Writing a context manager

If open weren't already a context manager, we might write one:

 1 class MyOpen:
 2     def __init__(self, filename, mode='r'):
 3         self.filename = filename
 4         self.mode = mode
 5 
 6     def __enter__(self):
 7         self.fh = open(self.filename, self.mode)
 8         return self.fh
 9 
10     def __exit__(self, exc_type, exc_value, traceback):
11         self.fh.close()
12 
13 
14 with MyOpen('somefile.txt', 'w') as fh:
15     fh.write('contents\n')

open already can act like a context manager. But if not, here's a simplified example of how we could implement it.

A context manager is just any object with __enter__ and __exit__ methods.

Return value of __enter__ accessible via as keyword.

Exception handling

 1  class NoisyCM:
 2      def __enter__(self):
 3          print("Entering!")
 4 
 5      def __exit__(self, exc_type, exc_value, traceback):
 6          print("Exiting!")
 7          if exc_type is not None:
 8              print("Caught {}".format(exc_type.__name__))
 9              return True
1 >>> with NoisyCM():
2 ...     print("Inside!")
3 ...     raise ValueError
4 Entering!
5 Inside!
6 Exiting!
7 Caught ValueError

__exit__ gives us info on any exception raised inside the with block

Can return True to suppress it, else it will propagate.
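The standard library ships a ready-made context manager built on exactly this mechanism: contextlib.suppress, whose __exit__ returns True for the listed exception types. A quick sketch:

```python
import os
from contextlib import suppress

# suppress's __exit__ returns True for the listed exception types,
# swallowing them:
with suppress(FileNotFoundError):
    os.remove("no-such-file.txt")  # raises FileNotFoundError...

survived = True  # ...but we still get here: the exception was suppressed
print(survived)
```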

Convenience method

 1 from contextlib import contextmanager
 2 
 3 @contextmanager
 4 def my_open(filename, mode='r'):
 5     fh = open(filename, mode)
 6     try:
 7         yield fh
 8     finally:
 9         fh.close()
10 
11 
12 with my_open('somefile.txt', 'w') as fh:
13     fh.write('contents\n')

When even a class with two methods is too much boilerplate, contextmanager streamlines it.

Uses a decorator! Also a generator (yield statement); we'll see that soon.

Yielded value goes to 'as' clause; after the block, resumes after the yield.

If we want unconditional cleanup we still need to use a try/finally.

Example: transaction API

1 from django.db import transaction
2 
3 with transaction.atomic():
4     write_to_the_database()
5     write_to_the_database_some_more()

Opens a database transaction on enter, commits it on exit (or rolls it back if there was an exception).
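The shape of such an API can be sketched with the contextmanager decorator from the previous slide. This is a rough sketch using sqlite3, not Django's actual implementation (which also handles nesting via savepoints):

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def atomic(conn):
    # Commit on clean exit, roll back if the block raised.
    try:
        yield conn
        conn.commit()
    except BaseException:
        conn.rollback()
        raise

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

with atomic(conn):
    conn.execute("INSERT INTO t VALUES (1)")  # committed

try:
    with atomic(conn):
        conn.execute("INSERT INTO t VALUES (2)")
        raise RuntimeError("boom")  # triggers the rollback
except RuntimeError:
    pass

print(conn.execute("SELECT count(*) FROM t").fetchone()[0])  # 1
```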

Example: test assertion

1 import pytest
2 
3 def test_cannot_divide_by_zero():
4     with pytest.raises(ZeroDivisionError):
5        1 / 0

Cautions

Descriptors

Attributes are simple:

 1 >>> class Person:
 2 ...     def __init__(self, name):
 3 ...         self.name = name
 4 
 5 >>> p = Person(name="Arthur Belling")
 6 
 7 >>> p.name
 8 'Arthur Belling'
 9 
10 >>> p.name = "Arthur Nudge"
11 
12 >>> p.name
13 'Arthur Nudge'
14 
15 >>> del p.name
16 
17 >>> p.name
18 Traceback (most recent call last):
19 ...
20 AttributeError: 'Person' object has no attribute 'name'

We can get them, set them, and delete them.

Python is not Java

 1 class NoisyDescriptor:
 2     def __get__(self, obj, objtype):
 3          print("Getting")
 4          return obj._val
 5 
 6     def __set__(self, obj, val):
 7          print("Setting to {}".format(val))
 8          obj._val = val
 9 
10     def __delete__(self, obj):
11          print("Deleting")
12          del obj._val

Still need to store the underlying data somewhere. Here we use "_val" (private by convention, not enforced).

Only one instance of this descriptor can be used per class without instances sharing data.

Could pass in a name, generate one, use a metaclass...

 1 >>> class Person:
 2 ...     name = NoisyDescriptor()
 3 
 4 >>> luigi = Person()
 5 
 6 >>> luigi.name = "Luigi"
 7 Setting to Luigi
 8 
 9 >>> luigi._val
10 'Luigi'
11 
12 >>> luigi.name
13 Getting
14 'Luigi'
15 
16 >>> del luigi.name
17 Deleting

We set the descriptor as a class attribute.

Then when we get, or set, or delete the name attribute of an instance of that class, it goes through the descriptor's methods.
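Since Python 3.6, the naming problem mentioned above has a cleaner solution: __set_name__, which Python calls on each descriptor when the owning class is created. A minimal sketch (the class names here are illustrative, not from the slides):

```python
class Stored:
    def __set_name__(self, owner, name):
        # Python 3.6+ calls this during class creation, telling the
        # descriptor which attribute name it was assigned to.
        self.storage = "_" + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.storage)

    def __set__(self, obj, val):
        setattr(obj, self.storage, val)

class Person:
    name = Stored()
    email = Stored()  # two descriptors on one class, no shared "_val"

p = Person()
p.name = "Luigi"
p.email = "luigi@example.org"
print(p._name, p._email)  # each descriptor has its own storage slot
```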

Head asplode

calculated property

 1 class Person:
 2     def __init__(self, first_name, last_name):
 3         self.first_name = first_name
 4         self.last_name = last_name
 5 
 6     @property
 7     def full_name(self):
 8         return "{} {}".format(
 9             self.first_name, self.last_name)
1 >>> p = Person("Eric", "Praline")
2 
3 >>> p.full_name
4 'Eric Praline'
5 
6 >>> p.full_name = "John Cleese"
7 Traceback (most recent call last):
8 AttributeError: can't set attribute

Use the built-in 'property' decorator to turn a method into a descriptor with __get__.

Note we access it as an attribute; from the outside there is no clue that it isn't an ordinary attribute.

Until we try to set it, that is - it's read-only.

boolean-only attribute

 1 class User:
 2     @property
 3     def is_admin(self):
 4         return self._is_admin
 5 
 6     @is_admin.setter
 7     def is_admin(self, val):
 8         if val not in {True, False}:
 9             raise ValueError(
10                 'is_admin must be True or False')
11         self._is_admin = val
1 >>> u = User()
2 
3 >>> u.is_admin = True
4 
5 >>> u.is_admin = 'foo'
6 Traceback (most recent call last):
7 ValueError: is_admin must be True or False

Define the getter same as before; internally we are using "_is_admin" to store the value.

Then it gets interesting:

  • property turns is_admin into a descriptor.
  • The descriptor has a setter method, which is a decorator.
  • We use that decorator to define a setter for this property.

In our setter we check to ensure the value is boolean, and if so, set it.

If not, raise a ValueError.

(deleter is also available.)
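For completeness, a small sketch of the deleter (the nickname attribute here is made up for illustration):

```python
class User:
    @property
    def nickname(self):
        return self._nickname

    @nickname.setter
    def nickname(self, val):
        self._nickname = val

    @nickname.deleter
    def nickname(self):
        del self._nickname

u = User()
u.nickname = "Brian"
print(u.nickname)  # Brian
del u.nickname     # calls the deleter; accessing it now raises AttributeError
```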

Descriptors & properties

Iterables, iterators, & generators, oh my!

Iteration is simple.

1 >>> numbers = [1, 2, 3]
2 
3 >>> for num in numbers:
4 ...     print(num)
5 1
6 2
7 3

We can make a list, and then use for ... in ... to iterate over that list.

What is iterable?

The term for objects that we can iterate over is "iterable".

Many built-in types are iterable: list, set, tuple, dict...

Any object can be iterable; it just needs an __iter__ method.

Which must return an iterator.

Which of course raises the question...

Ok, what's an iterator?

An aside: magic methods

an iterator sighting!

 1 >>> numbers = [1, 2, 3]
 2 
 3 >>> iterator = iter(numbers)
 4 
 5 >>> iterator
 6 <list_iterator object at 0x...>
 7 
 8 >>> next(iterator)
 9 1
10 
11 >>> next(iterator)
12 2
13 
14 >>> next(iterator)
15 3
16 
17 >>> next(iterator)
18 Traceback (most recent call last):
19 StopIteration

We can get an iterator for a list, and then keep calling next() on it and getting the next item in the list, until finally it raises StopIteration.

Wondering why you don't see StopIteration all over the place? The for loop (and other kinds of built-in iteration, such as comprehensions) catch it for you; that's how they know when iteration is done.

The true story of a for loop

What really happens when we for x in numbers: print(x):

1 iterator = iter(numbers)
2 while True:
3     try:
4         x = next(iterator)
5     except StopIteration:
6         break
7     print(x)

Get an iterator, keep calling next() on that iterator until it raises StopIteration.

Iterator independence

 1 >>> numbers = [1, 2]
 2 
 3 >>> iter1 = iter(numbers)
 4 
 5 >>> iter2 = iter(numbers)
 6 
 7 >>> next(iter1)
 8 1
 9 
10 >>> next(iter2)
11 1
12 
13 >>> for x in numbers:
14 ...     for y in numbers:
15 ...         print(x, y)
16 1 1
17 1 2
18 2 1
19 2 2

We can get two different iterators for the same underlying list, and they each maintain their own separate iteration state.

This is why you can do nested for loops over the same list, and they don't interfere with each other.

iterators are iterable

Iterators should define an __iter__() method that returns self.

This means an iterator is also iterable (but one-shot).

 1 >>> numbers = [1, 2, 3]
 2 
 3 >>> iterator = iter(numbers)
 4 
 5 >>> for num in iterator:
 6 ...     print(num)
 7 1
 8 2
 9 3
10 
11 >>> for num in iterator:
12 ...     print(num)

Also, because iterators are one-shot, you can't do nested loops over the same iterator like you can with a list (whose __iter__() returns a new iterator each time).

Let's try writing our own

A fibonacci iterator

 1 class Fibonacci:
 2     def __init__(self):
 3         self.last = 0
 4         self.curr = 1
 5 
 6     def __next__(self):
 7         self.last, self.curr = (
 8             self.curr, self.last + self.curr)
 9         return self.last
10 
11     def __iter__(self):
12         return self
1 >>> f = Fibonacci()
2 
3 >>> print(next(f), next(f), next(f), next(f), next(f))
4 1 1 2 3 5

Fibonacci is always used as an example of recursion -- we're going to use it as a demonstration of iteration instead.

We define a __next__() method (which makes it an iterator) and an __iter__() method that returns itself (so it's iterable; we can use it in a for loop).

But I don't use it in a for loop. Why? Note we never raise StopIteration from next(); this is an infinite iterator!
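One safe way to loop over an infinite iterator is itertools.islice, which caps the number of items. A sketch, reproducing the slide's Fibonacci class:

```python
from itertools import islice

class Fibonacci:  # the iterator class from the slide
    def __init__(self):
        self.last, self.curr = 0, 1

    def __next__(self):
        self.last, self.curr = self.curr, self.last + self.curr
        return self.last

    def __iter__(self):
        return self

# islice bounds the infinite iterator, so iteration terminates:
print(list(islice(Fibonacci(), 5)))  # [1, 1, 2, 3, 5]
```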

itertools: iterator plumbing

1 >>> from itertools import takewhile
2 
3 >>> fib = takewhile(lambda x: x < 100000, Fibonacci())
4 
5 >>> multiple_of_7 = filter(lambda x: not x % 7, fib)
6 
7 >>> list(multiple_of_7)
8 [21, 987, 46368]

The itertools module contains a bunch of "pipes" you can connect together to do interesting things with iterators.

Just one quick example - check out the docs for lots more!

We use takewhile to limit the infinite Fibonacci iterator to just elements under 100,000.

Then we use filter to filter it down to just those that are divisible by 7.

This processes only one element at a time, so we won't exhaust memory no matter how high we go.

Generators

1 def toygen():
2     print("Starting function body.")
3     yield 1
4     print("Between yields.")
5     yield 2
 1 >>> gen = toygen()
 2 
 3 >>> gen
 4 <generator object toygen at 0x...>
 5 
 6 >>> next(gen)
 7 Starting function body.
 8 1
 9 
10 >>> next(gen)
11 Between yields.
12 2
13 
14 >>> next(gen)
15 Traceback (most recent call last):
16 StopIteration

Fibonacci generator

1 def fibonacci():
2     last, curr = 0, 1
3     while True:
4         last, curr = curr, curr + last
5         yield last
1 >>> fib = fibonacci()
2 
3 >>> fib
4 <generator object fibonacci at 0x...>
5 
6 >>> list(itertools.takewhile(lambda x: x < 20, fib))
7 [1, 1, 2, 3, 5, 8, 13]

The generator implementation is clearly shorter than the iterator class we wrote before; a simple function instead of a class with multiple methods.

Re-implementing itertools.takewhile

1 def my_takewhile(predicate, iterator):
2     for elem in iterator:
3         if not predicate(elem):
4             break
5         yield elem

takewhile can be easily implemented as a generator.

Just loop over the items in the incoming iterator, yielding them one at a time, and breaking out of the loop the first time we hit an element that fails the predicate test.
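We can check the sketch against the real thing (sample data is made up here):

```python
from itertools import takewhile

def my_takewhile(predicate, iterable):
    # yield items until the first one that fails the predicate
    for elem in iterable:
        if not predicate(elem):
            break
        yield elem

data = [1, 4, 6, 3, 8]
under_five = lambda x: x < 5

print(list(my_takewhile(under_five, data)))  # [1, 4]
print(list(takewhile(under_five, data)))     # [1, 4] -- same result
```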

generator expressions

A list comprehension is a concise expression to build/transform/filter a list:

1 >>> numbers = [1, 2, 3]
2 
3 >>> [n*2 for n in numbers]
4 [2, 4, 6]
5 
6 >>> [n for n in numbers if n % 2]
7 [1, 3]

Replace the brackets with parens, and you have a generator expression:

1 >>> odd_fib = (n for n in fibonacci() if n % 2)
2 
3 >>> doubled_fib = (n*2 for n in fibonacci())

A generator expression is a very concise way to transform each element in an iterator, and/or filter an iterator. (Can replace the filter built-in, as we see here).
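Because the elements are produced lazily, a generator expression can feed an aggregate like sum() directly, with no intermediate list. A sketch combining it with the fibonacci generator from the earlier slide:

```python
from itertools import takewhile

def fibonacci():  # the generator from the earlier slide
    last, curr = 0, 1
    while True:
        last, curr = curr, curr + last
        yield last

# sum() pulls one element at a time from the generator expression;
# no list is ever materialized
even_total = sum(n for n in takewhile(lambda x: x < 100, fibonacci())
                 if n % 2 == 0)
print(even_total)  # 2 + 8 + 34 = 44
```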

__iter__() as a generator

1 class ErrorList:
2     def __init__(self):
3         self.errors = []
4 
5     def __iter__(self):
6         for error in self.errors:
7             yield error

or, even shorter:

1 class ErrorList:
2     def __init__(self):
3         self.errors = []
4 
5     def __iter__(self):
6         return iter(self.errors)
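Either version makes ErrorList instances usable anywhere an iterable is expected; a quick sketch of using the shorter one:

```python
class ErrorList:
    def __init__(self):
        self.errors = []

    def __iter__(self):
        return iter(self.errors)

el = ErrorList()
el.errors.extend(["bad email", "missing name"])

for err in el:  # works: iter(el) delegates to the list's iterator
    print(err)

# 'in' also falls back to iteration when __contains__ is absent:
print("bad email" in el)  # True
```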

Iterators & generators

Metaclasses

“Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don't.”

Source: Tim Peters, comp.lang.python

This quote is basically obligatory at this point in any discussion of Python metaclasses.

Because of that, and because it's just too much to cover, we'll leave it there - metaclasses will go on the "further exploration" list.

Review

Questions?

Carl Meyer
