Keep Your Types in Check

Abhijit Gupta
The Startup
Published in
4 min readSep 9, 2020

--

You are not alone if you often get frustrated by “duck typing” in Python. By duck typing I mean if something walks like a duck and quacks like a duck, then, for sure, it is a duck! The apparent laxity in Python type system or its absence thereof is a potential source of huge number of bugs that pop up in the production time. Well, teams routinely employ robust “testing” to catch them but it’s usually never enough. Without strict type checking at compile time, more often than not bugs get caught only at the run time, which is ultimately not desirable. There are several ways to obviate this problem, and I am going to discuss a robust but less known one — descriptors.

Let’s start with a very simple class named Point that takes abstracts a 3D coordinate.

class Point:
"""
A point in 3D space
"""
def __init__(self, x, y, z):
self.x = x
self.y = y
self.z = z

def __repr__(self):
return f"Point({self.x}, {self.y}, {self.z})"
...

This class has no type checking as such. One can instantiate with any valid datatype in Python, which obviously leads to not so obvious problems.

Point("x", 1.0, 3)

Without strict type checking, one can instantiate the class as mentioned above, which is not what the end user might desire. How can we address this problem without making substantial modification? A simple, albeit not so elegant approach is to use assertions.

assert isinstance(self.x, float) and isinstance(self.y, float) and isinstance(self.z, float)

This assertion will coerce the user to instantiate this class with only float arguments. However, as you might have guessed, this is not a very flexible approach and we might have to write several assertions to satisfy what we want. As a matter of fact, one can even envision writing try — except block for type converting the user supplied arguments in our __init__ function. This again has several pitfalls and code becomes unnecessarily complex and long.

A better way to accomplish type checking is creating managed attributes. We customise access to an attribute by defining it as a property.

class Point:
"""
A point in 3D space
"""
def __init__(self, x, y, z):
self.x = x
self.y = y
self.z = z

# Getter
@property
def x(self):
return self._x

# Setter
@x.setter
def x(self, value):
if not isinstance(value, (int,float)):
raise TypeError("Expected a float or int")
else:
self._x = float(value)
# Getter
@property
def y(self):
return self._y

# Setter
@y.setter
def y(self, value):
if not isinstance(value, (int,float)):
raise TypeError("Expected a float or int")
else:
self._y = float(value)
... def __repr__(self):
return f"Point({self.x}, {self.y}, {self.z})"

The above piece of code defines x, y, and z as property and add type checking followed by converting arguments to them into float in the setter function. Note that we need to define x,y and z as property in the getter function before we can define the setter function. Now, if a user tries to access x, y or z attribute, that automatically triggers the getter and setter methods. There is an optional deleter method that I have not defined above, as it is optional and prevents accidental deletion of crucial attributes by the user of our code. When a user accesses/modifies any of the property, our program manipulates self._x, self._y or self._z attribute directly, which is where the actual data lives.

Use of property for type checking is a good approach when you few attributes, but this quickly leads to a bloated code when you try to define property for all attributes.

A better and robust way to type checking is making use of descriptor class. Here, I will define a descriptor class for a float type-checked attribute.

class Float:
def __init__(self, name):
self.name = name
def __get__(self, instance, cls):
if instance is None:
return self
else:
return instance.__dict__[self.name]
def __set__(self, instance, value): if not isinstance(value, float):
raise TypeError("Expected a float arg")
instance.__dict__[self.name] = value
def __delete__(self, instance):
del instance.__dict__[self.name]

Let’s dissect this class, piece by piece, so to speak:

Essentially, a descriptor class implements three crucial attribute access operations(get, set and delete), which are defined as __get__(), __set__(), and __delete__() respectively. These methods are special methods, as noted by double underscores in their names. They all receive and instance of a class as input and then they manipulate the underlying dictionary of the instance appropriately (instance.__dict__ is the dictionary I am referring to).

How can we use it in our code? We need to place the instances of the descriptor class as class variables into the definition of Point class.

class Point:

x = Float("x")
y = Float("y")
z = Float("z")
def __init__(self, x, y, z):
self.x = x
self.y = y
self.z = z

def __repr__(self):
return f"Point({self.x!r}, {self.y!r}, {self.z!r})"

Now, all access to descriptor attributes (x, y or z) is captured by __get__(), __set__() and __delete__() methods. They all “indirectly” manipulate the Point object dictionary.

>>> p1 = Point(2., 3., 0.)
>>> p1.z # it calls Point.z.__get__(p1, Point)
0.0
>>> p1.y = 5
(This raises TypeError("Expected float as arg"))

We can further refactor our code by defining a decorator:

def FloatTypeOnly(*args):
def decorate(cls):
for name in args:
setattr(cls, name, Float(name))
return cls
return decorate

This decorator can then be directly used when defining the Point class:

@FloatTypeOnly("x", "y", "z")
class Point:
def __init__(self, x, y, z):
self.x = x
self.y = y
self.z = z

def __repr__(self):
return f"Point({self.x!r}, {self.y!r}, {self.z!r})"

To conclude, you might ponder about the obfustification introduced by adding descriptors for type checking and ask yourself the question whether it is really necessary to use them. For sure, there are alternate ways to do type checking like using third property libraries — mypy, but they cannot enforce strict type checking and are not compatible with all custom data types that you might intend to use in your code. Furthermore, by using descriptors you can completely customise what they do at a very low level, which is sometimes invaluable for debugging and imlementing custom behaviors.

--

--