Special Methods and Protocols
Special methods – also called magic methods – are the secret sauce to Python’s duck typing.
Defining the appropriate special methods in your classes is how you make your class act like the standard classes.
What’s in a Name?
We’ve seen at least one special method so far:
__init__
It’s all in the double underscores on either side, which are pronounced “dunder” (short for “double underscore”).
To see all of the dunder methods, try running dir(2) or dir(list).
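For example, here is a quick way to look at just the dunders (a small illustration added here, with the output truncated):

>>> [name for name in dir(list) if name.startswith("__")]
['__add__', '__class__', ...]   # output truncated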
Generally Useful Special Methods
Most classes should at least have these special methods:
object.__str__:
Called by the str() built-in function and by the print() function to compute the “informal” string representation of an object.
object.__repr__:
Called by the repr() built-in function to compute the “official” string representation of an object. Ideally:
eval(repr(something)) == something
This means that the “repr” is what you would type to create the object. In practice, this is impractical for complex objects, but it is still the more “formal” form.
Note that if you don’t define a __str__ method, then __repr__ will be used. And the base class (object) has a __repr__ defined, so every class automatically gets one – but it’s ugly and not very useful.
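Here is a minimal sketch of a class that defines both (the Point class is invented here just for illustration):

class Point:
    """A simple 2D point, to show __str__ vs __repr__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # the "official" representation: ideally valid Python to recreate the object
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # the "informal" representation: what print() shows
        return f"({self.x}, {self.y})"

p = Point(2, 3)
print(repr(p))   # Point(2, 3)
print(p)         # (2, 3)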
Protocols
The set of special methods needed to emulate a particular type of Python object is called a protocol.
Your classes can “become” like Python built-in classes by implementing the methods in a given protocol.
Remember, these are more guidelines than laws. Implement only what you need.
The Numerics Protocol
Do you want your class to behave like a number? Implement these methods:
object.__add__(self, other)
object.__sub__(self, other)
object.__mul__(self, other)
object.__matmul__(self, other)
object.__truediv__(self, other)
object.__floordiv__(self, other)
object.__mod__(self, other)
object.__divmod__(self, other)
object.__pow__(self, other[, modulo])
object.__lshift__(self, other)
object.__rshift__(self, other)
object.__and__(self, other)
object.__xor__(self, other)
object.__or__(self, other)
Or just implement the fraction you actually need.
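For example, a class that only needs addition and subtraction can define just those two. Here is a small sketch (the Duration class is made up for illustration):

class Duration:
    """A length of time in seconds, implementing only the numeric methods it needs."""
    def __init__(self, seconds):
        self.seconds = seconds

    def __add__(self, other):
        return Duration(self.seconds + other.seconds)

    def __sub__(self, other):
        return Duration(self.seconds - other.seconds)

    def __repr__(self):
        return f"Duration({self.seconds})"

print(Duration(90) + Duration(30))   # Duration(120)
print(Duration(90) - Duration(30))   # Duration(60)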
Operator Overloading
Most of the previous examples map to “operators”: +, -, *, //, /, %, etc. This is often known as “operator overloading”, as you are redefining what the operators mean for that specific type.
Note that you can define these operators to do ANYTHING you want – but it is a really good idea to only define them to do something that makes sense in the usual way. You could implement __add__ to subtract and __sub__ to add, but everyone would be confused.
One interesting exception to this rule is the pathlib.Path class, which defines __truediv__ to mean path joining:
In [19]: import pathlib
In [20]: p1 = pathlib.Path.cwd()
In [21]: p1
Out[21]: PosixPath('/Users/Chris/PythonStuff/UWPCE/PythonCertDevel')
In [22]: p1 / "a_filename"
Out[22]: PosixPath('/Users/Chris/PythonStuff/UWPCE/PythonCertDevel/a_filename')
While this is not division in any sense, the slash is used as a path separator – so this does make some intuitive sense.
Comparing
If you want your objects to be comparable:
A > B
A < B
A >= B
etc.
There is a full set of magic methods you can use to override the “comparison operators”:
__lt__ : < (less than)
__le__ : <= (less than or equal)
__eq__ : == (equal)
__ge__ : >= (greater than or equal)
__gt__ : > (greater than)
__ne__ : != (not equal)
These are known as the “rich comparison” operators, as they allow fuller featured comparisons. In particular, they are used by numpy to provide “element-wise” comparison – that is, comparing two arrays yields an array of results, rather than a single result:
In [26]: import numpy as np
In [27]: arr1 = np.array([3,4,5,6,7,8,9])
In [28]: arr2 = np.array([9,2,6,2,6,3,9])
In [29]: arr1 > arr2
Out[29]: array([False, True, False, True, True, True, False], dtype=bool)
In [30]: arr1 == arr2
Out[30]: array([False, False, False, False, False, False, True], dtype=bool)
This is just one example – the point is that for your particular class, you can define these comparisons however you want.
Total Ordering
You may notice that those operators are kind of redundant – if A > B is True, then we know that A < B is False and A <= B is False.
In fact, there is a mathematical / computer science concept known as a “total order” (https://en.wikipedia.org/wiki/Total_order), which strictly defines what it means for objects to be “well behaved” in this regard.
There are some special cases where these rules don’t apply to your classes (sets, for example, use < and <= for subset tests, which is only a partial order), but for the most part, if your classes support comparisons at all, you want them to be well behaved – that is, totally ordered.
Because this is the common case, Python comes with a nifty utility that implements total ordering for you: https://docs.python.org/3/library/functools.html#functools.total_ordering
It can be found in the functools module, and it allows you to specify __eq__ and only one of __lt__(), __le__(), __gt__(), or __ge__(). It will then fill in the others for you.
Note: if you define only one of those four, it should be __lt__, because that is the one used for sorting. See below for more about that.
Here is the truncated example from the docs:
from functools import total_ordering

@total_ordering
class Student:
    def __eq__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) ==
                (other.lastname.lower(), other.firstname.lower()))

    def __lt__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) <
                (other.lastname.lower(), other.firstname.lower()))
Note that this makes it a lot easier than implementing all six comparison operators. However, if you read the doc, it lets you know that total_ordering has poor performance – it is doing extra method call redirection when the operators are used. If performance matters to your use case – and it probably doesn’t – then you need to write all six comparison dunders.
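To see the filled-in operators at work, here is a self-contained version of that example. Note that the __init__ is my addition so the demo runs; the truncated docs example omits it:

from functools import total_ordering

@total_ordering
class Student:
    def __init__(self, firstname, lastname):   # added here so the demo runs
        self.firstname = firstname
        self.lastname = lastname

    def __eq__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) ==
                (other.lastname.lower(), other.firstname.lower()))

    def __lt__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) <
                (other.lastname.lower(), other.firstname.lower()))

anna = Student("Anna", "Brown")
zack = Student("Zack", "Adams")
print(anna < zack)    # False -- "brown" sorts after "adams"
print(anna >= zack)   # True -- __ge__ was filled in by total_ordering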
Sorting
Python has a handful of ways to sort built in:
list.sort() – for sorting a list in place
sorted(iterable) – for creating a new sorted list from any iterable
Plus there are a couple of more obscure ones.
In order for your custom objects to be sortable, they need the __lt__ (less than) magic method defined – that’s about it.
So if you are using the total_ordering decorator, it’s best to define __eq__ and __lt__ – that way sorting will be able to use a “native” method, and maybe get better performance.
Sort Key Methods
By default, the sorting methods use __lt__ for comparison, and the sorting algorithm calls __lt__ O(n log(n)) times. But if you pass a “key” function in to the sort call:
a_list.sort(key=key_func)
Then key_func is only called n times. And if the key returns a simple type, like an integer or float, then the sorting will be faster, because comparing built-in numbers is much cheaper than calling a Python-level __lt__ on your objects.
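For example, with a built-in function as the key (a quick illustration):

>>> sorted([-3, 1, -2], key=abs)
[1, -2, -3]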
So it often helps to provide a sort_key() method on your class, so it can be passed in to the sort methods, like this:
class MySimpleObject:
    """
    simple class to demonstrate a simple sorting key method
    """
    def __init__(self, val):
        self.val = val

    def sort_key(self):
        return self.val
And then you can use it like this:
list_of_simple_objects.sort(key=MySimpleObject.sort_key)
See sort_key.py for a complete example with timing. Here is an example of running it:
Timing for 10000 items
regular sort took: 0.04288s
key sort took: 0.004779s
performance improvement factor: 8.9726
So it is almost 9 times faster for a 10,000 item list. Pretty good, eh?
An Example
Each of these methods supports a common Python operation.
For example, to make ‘+’ work with a sequence type in a vector-like fashion, implement __add__:
def __add__(self, v):
    """return the element-wise vector sum of self and v
    """
    assert len(self) == len(v)
    return vector([x1 + x2 for x1, x2 in zip(self, v)])
A slightly more complete example can be seen in vector.py.
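A minimal self-contained version of the idea looks like this (my sketch – vector.py may differ in the details). Subclassing list means the class gets len(), indexing, and iteration for free:

class vector(list):
    """A list subclass that adds element-wise instead of concatenating."""
    def __add__(self, v):
        assert len(self) == len(v)
        return vector([x1 + x2 for x1, x2 in zip(self, v)])

print(vector([1, 2, 3]) + vector([10, 20, 30]))   # [11, 22, 33]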
Emulating Standard Types
You can make your classes behave like the built-ins.
The Container Protocol
Do you want to make a container type? Here’s what you need:
object.__len__(self)
object.__getitem__(self, key)
object.__setitem__(self, key, value)
object.__delitem__(self, key)
object.__iter__(self)
object.__reversed__(self)
object.__contains__(self, item)
object.__index__(self)
__len__ is called when len(object) is called.
__reversed__ is called when reversed(object) is called.
__contains__ is called when in is used, e.g. something in object.
__iter__ is used for iteration, i.e. when the object is used in a for loop.
__index__ is used to convert the object into an integer for indexing. So you don’t define this in a container type, but rather on a type that could be used as an index. If you have a class that could reasonably be interpreted as an index, you should define this, and it should return an integer. This was added to support numpy’s multiple integer types.
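Here is a small sketch of __index__ in action (the Position class is invented for illustration):

class Position:
    """A tiny class whose instances can be used as sequence indices."""
    def __init__(self, value):
        self.value = value

    def __index__(self):
        # must return an int; used by indexing, slicing, hex(), etc.
        return int(self.value)

letters = ["a", "b", "c", "d"]
print(letters[Position(2)])   # c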
Indexing and Slicing
__getitem__ and __setitem__ are used when indexing. For example, x = object[i] calls __getitem__, and object[i] = something calls __setitem__.
But indexing is pretty complex in Python. There is simple indexing: object[i], but there is also slicing: object[i:j:skip].
When you implement __getitem__(self, index), index will simply be the index if it’s a simple index, but if it’s a slice, it will be a slice object. Python also supports multiple slices:
object[a:b, c:d]
These are used in numpy to support multi-dimensional arrays, for instance. In this case, a tuple of slice objects is passed in.
See: index_slicing.py
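Here is a minimal sketch (not necessarily what index_slicing.py does) of a __getitem__ that distinguishes a simple index from a slice:

class LoggedList:
    """Wraps a list and reports how it is being indexed."""
    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # a slice object carries start, stop, and step attributes
            print(f"slice: {index.start}:{index.stop}:{index.step}")
            return LoggedList(self.data[index])
        # otherwise it's a simple index (or an object with __index__)
        print(f"simple index: {index}")
        return self.data[index]

ll = LoggedList([10, 20, 30, 40])
ll[2]      # prints: simple index: 2
ll[1:3]    # prints: slice: 1:3:None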
Callable Classes
We’ve been using functions a lot:
def my_fun(something):
    do_something()
    ...
    return something
And then we can call it:
result = my_fun(some_arguments)
But what if we need to store some data to know how to evaluate that function? For example: a function that evaluates a quadratic polynomial: y = a*x**2 + b*x + c.
You could pass in a, b and c each time:
def quadratic(x, a, b, c):
    return a * x**2 + b * x + c
But what if you are using the same a, b, and c numerous times?
Or what if you need to pass this in to something (like map) that requires a function that takes a single argument?
“Callables”
Various places in Python expect a “callable” – something that you can call like a function:
a_result = something(some_arguments)
“Something” in this case is often a function, but can be anything else that is “callable”.
What have we been introduced to recently that is “callable”, but not a function object?
Custom Callable Objects
The trick is one of Python’s “magic methods”.
__call__(*args, **kwargs)
If you define a __call__ method in your class, it will be used when code “calls” an instance of your class:
class Callable:
    def __init__(self, some_arguments):
        some_initialization()

    def __call__(self, some_parameters):
        ...
Then you can do:
callable_instance = Callable(some_arguments)
result = callable_instance(some_parameters)
Callable Example
Here is an example of writing a callable class. We are going to write a class that evaluates a quadratic function.
The initializer for that class should take the parameters:
a, b, c
It should store those parameters as attributes.
The resulting instance should evaluate the function when called, and return the result
my_quad = Quadratic(a=2, b=3, c=1)
my_quad(0)
Here’s one way to do that: quadratic.py
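For reference, here is one possible sketch (not necessarily identical to quadratic.py):

class Quadratic:
    """Callable that evaluates a*x**2 + b*x + c with stored coefficients."""
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __call__(self, x):
        return self.a * x**2 + self.b * x + self.c

my_quad = Quadratic(a=2, b=3, c=1)
print(my_quad(0))   # 1
print(my_quad(2))   # 15

Because my_quad takes a single argument when called, it can be passed directly to something like map that expects a one-argument function.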
Protocols in Summary
Use special methods when you want your class to act like a “standard” type or class in some way.
Look up the special methods you need and define them. But only define the ones you need.
There’s more to read about the details of implementing these methods: https://docs.python.org/3/reference/datamodel.html#special-method-names
References
Here is a good reference for magic methods: http://minhhh.github.io/posts/a-guide-to-pythons-magic-methods
And with a bit more explanation: https://www.python-course.eu/python3_magic_methods.php