What is a descriptor, and why use one?

I have been working with Python for a little over a year. It’s a very nice programming language with short, clear syntax. However, when I tried to write some data objects for the Posify project, the trouble began.

In Python, an object does not have to declare its attributes up front. You can define an object’s attributes when initializing it, or add more attributes at any time afterwards, with any type you want (such an easygoing language).

class Dog(object):
    def __init__(self, kind: str, age: int, email: str):
        self.kind = kind
        self.age = age
        self.email = email

>>> husky = Dog("Husky", 3, "husky@example.com")
>>> husky.kind
'Husky'

>>> husky.name = "Mic"  # New attribute added after init
>>> husky.name
'Mic'

>>> husky.kind = 150  # Wrong usage, but it still works
>>> husky.kind
150

>>> husky.email = "Wow, doge don't use email. Such an invalid email string. Wow."
>>> husky.email
"Wow, doge don't use email. Such an invalid email string. Wow."

This really hurts when I want to create a data object, because I must control which attributes an object has and what their types are. My first thought was to write checks that verify each attribute and its type, then put them in every __init__, like:

class Dog(object):
    def __init__(self, kind: str, age: int, email: str):
        assert isinstance(kind, str), "Dog's kind must be a str"
        self.kind = kind

        assert isinstance(age, int), "Dog's age must be an int"
        self.age = age

        # validator is an email-checking helper that is not shown here
        assert validator.email(email), "Dog's email is invalid"
        self.email = email

It’s so ugly, right? And when I look at the MongoEngine ORM, it’s a wonderful implementation:

class Dog(Base):
    kind = StringField()
    age = IntegerField()
    email = EmailField()

I asked myself how to build my objects like this, and I found descriptors. A descriptor is defined as an object attribute with “binding behavior”, meaning that getting, setting or deleting the attribute is handled by methods of the descriptor itself. Yes, “binding behavior” is what we want here, so let’s create some descriptors.

First, let’s take a look at Python’s built-in object. It has many special methods, but we only need three of them to create descriptors. Together they are called the descriptor protocol: __get__(), __set__() and __delete__(). In other words, if any of those methods is defined for an object, it is said to be a descriptor.
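
To see when each of the three methods fires, here is a minimal sketch. The names Traced and Demo are made up for illustration; the descriptor only records which protocol method was called and stores nothing useful.

```python
class Traced:
    """Hypothetical descriptor that records each protocol call."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner=None):
        self.calls.append("get")
        return None  # a real descriptor would return the stored value

    def __set__(self, instance, value):
        self.calls.append("set")

    def __delete__(self, instance):
        self.calls.append("delete")


class Demo:
    attr = Traced()  # a descriptor only works as a class attribute


d = Demo()
d.attr       # calls Traced.__get__(d, Demo)
d.attr = 1   # calls Traced.__set__(d, 1)
del d.attr   # calls Traced.__delete__(d)

# Read the descriptor through the class __dict__ so __get__ is not triggered again.
calls = Demo.__dict__["attr"].calls
print(calls)  # ['get', 'set', 'delete']
```

Plain attribute syntax on the instance is all it takes; Python routes each operation to the descriptor because it finds one on the class.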

Let’s craft some Descriptors

So, keeping the words “binding behavior” in mind, I looked at the MongoEngine code. It seems to create an object with overridden __set__() and __get__() methods, so I tried:

class StringField(object):
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError("Value must be string")
        self.value = value

    def __get__(self, instance, owner):
        return self.value

class Dog(object):
    kind = StringField()

>>> a = Dog()
>>> a.kind = "Husky"

>>> b = Dog()
>>> b.kind = 10  # ValueError
Yay, we’re done. Descriptors are so easy. OR NOT? Now, change the code a little and see:

>>> a = Dog()
>>> b = Dog()
>>> a.kind = "Husky"

>>> b.kind
'Husky'

Omg, it went wrong. The problem is that all Dog instances share the same StringField instance: kind is a class attribute, and the value is stored on the descriptor itself (self.value). When one Dog sets its kind, every other Dog sees it. It’s so frustrating.
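
A quick way to confirm where the value actually lives is to look inside the class and the instance. This sketch repeats the flawed StringField from above and only adds the inspection lines.

```python
class StringField:
    # Same flawed design as above: the value lives on the descriptor itself.
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError("Value must be string")
        self.value = value

    def __get__(self, instance, owner=None):
        return self.value


class Dog:
    kind = StringField()


a, b = Dog(), Dog()
a.kind = "Husky"

# Going through the class __dict__ returns the descriptor without calling __get__.
field = Dog.__dict__["kind"]
print(field.value)           # Husky
print("kind" in a.__dict__)  # False: nothing was stored on the instance
print(b.kind)                # Husky: b reads from the same descriptor
```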

So we must store kind separately for each Dog, in something like a dictionary whose keys are the objects and whose values are their attribute values, to avoid this mix-up. OK, let’s go deeper into this treasure hunt and find out the magic behind descriptors.

(Part II is coming soon …)
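
One possible shape for that dictionary idea is sketched below. It is only a rough draft under my own assumptions, not necessarily the final design: I use weakref.WeakKeyDictionary from the standard library so that the descriptor does not keep dead Dog objects alive, and an unset field returns None.

```python
from weakref import WeakKeyDictionary


class StringField:
    """Sketch: one value per owning instance instead of one per descriptor."""

    def __init__(self):
        # Keys are the owning instances; weak references let them be collected.
        self._values = WeakKeyDictionary()

    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError("Value must be string")
        self._values[instance] = value

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, e.g. Dog.kind
        return self._values.get(instance)  # None if never set (my assumption)


class Dog:
    kind = StringField()


a = Dog()
b = Dog()
a.kind = "Husky"
print(a.kind)  # Husky
print(b.kind)  # None
```

This approach requires the owning instances to be hashable and weakly referenceable, which holds for ordinary classes like Dog here.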