Recently I was reading a book dedicated to Python language features (part of the content in this article comes from the book Fluent Python). The word data model is mentioned in the book. Is the data model what we often say? type of data? In fact, it is not. The data model is a description of the Python framework. It regulates the interfaces of its own building blocks. These interfaces can be understood as special methods in Python, such as __iter__
, __len__
, __del__
, etc. These modules include but are not limited to sequences, iterators, functions, classes, and context managers. If we are discussing which methods and properties of an object can be called a sequence, in fact we are discussing the data model of a sequence.
**One of the best qualities of Python is consistency. After you use Python for a period of time, you will begin to understand the Python language and be able to correctly guess the new language features for you. **
If you use Python with experience in other programming languages, you may be curious about len(object)
instead of object.len()
. You often use object.len()
in the Java
language. This method gets the length of the object. **When you further understand the power behind this discomfort, you will be convinced by the design philosophy of Python, which is the result of building on the Python data model, the API of the Python data model, for us Using authentic Python features to build objects provides tools, which is what we often call **Pythonic**
. **
No matter which framework you write in, you will spend a lot of time in implementing methods that will be called by the framework itself, and the Python framework itself is no exception. When you are using object[item]
, you actually call the method object.__getitem__
behind the scenes. What is the advantage of this, so that we can use the []
operator on self-built objects, we only need to define the __getitem__
method in the class.
__ getitem__
is one of Python's special methods. Common special methods include __len__
, __iter__
, __enter__
, __call__
, etc.
These special methods allow your own objects to implement and support the following language architectures and interact with them.
The data model provides an API for building objects using Python language features, so we try to implement our own sequence class.
classMyList(object):
def __init__(self,):
self.items =[str(x)for x inrange(10)]
self.len =10
def __getitem__(self, item):return self.items[item]
def __len__(self):return self.len
a =MyList()for x in a:print(x)print('The first element', a[0])print("length:",len(a))
Our self-defined class supports iteration, slicing, and the len
method. This is actually the protocol (informal interface) that implements the Python sequence. We have implemented a custom sequence through the data model instead of subclassing built-in Sequence, which is actually a language of the duck model, when a certain class behaves like a duck (here refers to a sequence), then this class can be said to be a sequence. It doesn't care whether it is achieved through subclassing or serial protocol.
We can already appreciate the benefits of using the Python data model by using special methods. As a user of your class, you don’t have to remember the various names of standard operations ("How to get the length?" Is it .size() or .length( ) Or something else?)
In the above example, the MyList
class can be iterated and sliced. The function of slicing is provided by __getitem__
, and the function of iteration is actually provided by __iter__
, which returns an iterable object. But there is no __iter__
method in the MyList
class, what is going on?
Because if the __iter__
method is not implemented, but the __getitem__
method is implemented, Python will create an iterator and try to get the elements in order (starting from index 0).
Now we try to print the a
object
classMyList(object):
def __init__(self,):
self.items =[str(x)for x inrange(10)]
self.len =10
def __getitem__(self, item):return self.items[item]
def __len__(self):return self.len
a =MyList()print(a)
>><__ main__.MyList object at 0x0000015AEAFC95F8>
The result is a string of memory addresses that we don’t care about, but we often find that some Python built-in modules, or third-party libraries, when we directly print an object, the output is information that we can understand, not like this Memory address.
In fact, this can be achieved by the special method __str__
, which returns a string. We can do this by adding a special method __str__
to the MyList
class.
classMyList(object):
def __init__(self,):
self.items =[str(x)for x inrange(10)]
self.len =10
def __getitem__(self, item):return self.items[item]
def __len__(self):return self.len
def __str__(self):return','.join(self.items)
a =MyList()print(a)
>>0,1,2,3,4,5,6,7,8,9
If x is an instance of a built-in type, then len(x) will be very fast. The reason behind it is that CPython reads the length of the object directly from a C structure, without calling any methods at all. Obtaining the number of elements in a collection is a very common operation. This operation must be efficient for str, list, memoryview and other types. In other words, the reason why len is not an ordinary method is to allow Python's own data structure to go through the back door, and the same is true for abs. But thanks to its special method, we can also use len for custom data types. This approach finds a balance between maintaining the efficiency of built-in types and ensuring the consistency of the language, and also confirms another sentence in "Python": "You can't make special cases so special that they start to break established rules."
The data model describes the object protocol, and the special method is the protocol implemented by the built-in object. In order to make our code style behave the same as the built-in type, or more Python-style code, we can use special methods instead of Subclassing.
A basic requirement of an object is that it must have a reasonable string representation. We can meet this requirement through __str__
and __repr__
. __repr__
is convenient for us to debug, and __str__
is suitable for end users. This is why there are special methods __repr__
and __str__
in the data model.
There are many special methods in Python. The main topic here is the data model. I hope you can understand the design philosophy of the Python language and think about how to write more Pythonic
code.
Recommended Posts