Python data model

1. How to understand the data model?

Recently I was reading a book about Python language features (part of this article's content comes from the book Fluent Python). The book uses the term "data model". Is the data model the same thing as what we usually call data types? In fact, it is not. The data model is a description of Python as a framework: it formalizes the interfaces of the language's own building blocks. These interfaces can be understood as Python's special methods, such as __iter__, __len__, and __del__. The building blocks include, but are not limited to, sequences, iterators, functions, classes, and context managers. When we discuss which methods and attributes an object must have in order to be called a sequence, we are in fact discussing the data model of a sequence.

**One of the best qualities of Python is its consistency. After working with Python for a while, you start to understand the language well enough to make correct guesses about features that are new to you.**

If you come to Python with experience in other programming languages, you may wonder why it is len(object) rather than object.len(). In Java, for example, you would call a method on the object itself, such as object.length() or object.size(), to get its length. **Once you understand the reasoning behind this apparent oddity, you will be convinced by Python's design philosophy. It is built on the Python data model, whose API gives us the tools to build our own objects with idiomatic Python features, which is what we usually call Pythonic.**

Whatever framework you work with, you spend a lot of time implementing methods that the framework itself will call, and Python is no exception. When you write object[item], the interpreter calls object.__getitem__(item) behind the scenes. The advantage is that, to use the [] operator on our own objects, we only need to define the __getitem__ method in the class.
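As a minimal sketch of this dispatch, the hypothetical Deck class below (its name and contents are invented for illustration) defines only __getitem__, and the [] operator works on it. The explicit call through type(deck) is shown only to make the mechanism visible; it is roughly what the interpreter does, not something you would write in normal code.

```python
class Deck:
    """Hypothetical container used only to illustrate the [] operator."""

    def __init__(self):
        self._cards = ['A', 'K', 'Q', 'J']

    def __getitem__(self, position):
        # Called by the interpreter whenever deck[position] is evaluated
        return self._cards[position]


deck = Deck()
first = deck[0]                         # operator syntax
same = type(deck).__getitem__(deck, 0)  # roughly what the interpreter does
print(first, same)
```

Because __getitem__ hands the index straight to the underlying list, slices such as deck[1:3] work as well.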

__getitem__ is one of Python's special methods. Other common special methods include __len__, __iter__, __enter__, and __call__.

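These special methods allow your own objects to implement, support, and interact with core language constructs such as iteration, collections, attribute access, operator overloading, function invocation, string representation, and managed contexts (with blocks).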
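To illustrate how special methods plug an object into language constructs, here is a small sketch. The Recorder class is hypothetical and exists only for this example: __enter__ and __exit__ make it usable in a with block, and __call__ makes its instances callable like functions.

```python
class Recorder:
    """Hypothetical class: supports with-blocks and call syntax."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        # Runs when the with-block is entered
        self.events.append('enter')
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with-block is left, even after an exception
        self.events.append('exit')
        return False  # do not suppress exceptions

    def __call__(self, label):
        # Makes instances callable like functions
        self.events.append(label)
        return len(self.events)


with Recorder() as r:
    r('step')

print(r.events)  # ['enter', 'step', 'exit']
```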

2. Implement your own sequence class

The data model provides an API for building objects with Python language features, so let's try to implement our own sequence class.

class MyList(object):

    def __init__(self):
        # Ten string items: '0' through '9'
        self.items = [str(x) for x in range(10)]
        self.len = len(self.items)

    def __getitem__(self, item):
        # Delegate indexing and slicing to the underlying list
        return self.items[item]

    def __len__(self):
        return self.len


a = MyList()
for x in a:
    print(x)
print('The first element', a[0])
print("length:", len(a))

Our custom class supports iteration, indexing and slicing, and the len function. It implements the protocol (informal interface) of a Python sequence. We built a custom sequence through the data model rather than by subclassing a built-in sequence type. This is duck typing: when a class behaves like a duck (here, a sequence), it can be treated as one. It does not matter whether the behavior comes from subclassing or from implementing the sequence protocol.
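The duck-typing point can be checked with a short sketch. It repeats a compact version of MyList so that it runs on its own, and it uses only standard-library calls (collections.abc.Sequence, reversed, random.choice). The class is not a Sequence by inheritance or registration, yet code that relies only on __len__ and __getitem__ accepts it.

```python
import collections.abc
import random


class MyList:
    def __init__(self):
        self.items = [str(x) for x in range(10)]

    def __getitem__(self, item):
        return self.items[item]

    def __len__(self):
        return len(self.items)


a = MyList()

# Not a Sequence by inheritance or registration...
print(isinstance(a, collections.abc.Sequence))  # False

# ...yet code that only relies on the protocol accepts it
print(list(reversed(a))[:3])        # ['9', '8', '7']
print('3' in a)                     # True
print(random.choice(a) in a.items)  # True
print(a[2:5])                       # ['2', '3', '4']
```

Note that the slice comes back as a plain list, because __getitem__ simply delegates to the inner list.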

We can already see the benefit of using special methods from the Python data model. Users of your class do not have to memorize arbitrary method names for standard operations ("How do I get the length? Is it .size(), .length(), or something else?").

In the example above, the MyList class supports iteration and slicing. Slicing is provided by __getitem__, while iteration is normally provided by __iter__, which returns an iterator. But the MyList class has no __iter__ method, so what is going on?

If __iter__ is not implemented but __getitem__ is, Python creates an iterator that fetches the elements in order, starting from index 0, until __getitem__ raises IndexError.
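The fallback can be observed directly. In the sketch below, the hypothetical Countdown class (invented for this example) has no __iter__ and records every index it is asked for, so we can see the interpreter requesting 0, 1, 2, ... until IndexError is raised.

```python
class Countdown:
    """Hypothetical class with __getitem__ but no __iter__."""

    def __init__(self, start):
        self.start = start
        self.requested = []  # records every index the interpreter asks for

    def __getitem__(self, index):
        self.requested.append(index)
        if index >= self.start:
            # IndexError tells the fallback iterator to stop
            raise IndexError(index)
        return self.start - index


c = Countdown(3)
values = list(c)
print(values)       # [3, 2, 1]
print(c.requested)  # [0, 1, 2, 3]
```

The last requested index (3) is the one that triggered IndexError and ended the iteration.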

Now let's try to print the object a.

class MyList(object):

    def __init__(self):
        self.items = [str(x) for x in range(10)]
        self.len = len(self.items)

    def __getitem__(self, item):
        return self.items[item]

    def __len__(self):
        return self.len


a = MyList()
print(a)
# Output (the address varies between runs):
# <__main__.MyList object at 0x0000015AEAFC95F8>

The result includes a memory address that we don't care about. Yet with many built-in modules and third-party libraries, printing an object directly produces information we can understand rather than a memory address like this.

This is achieved with the special method __str__, which returns a string. We only need to add a __str__ method to the MyList class.

class MyList(object):

    def __init__(self):
        self.items = [str(x) for x in range(10)]
        self.len = len(self.items)

    def __getitem__(self, item):
        return self.items[item]

    def __len__(self):
        return self.len

    def __str__(self):
        # Human-readable form used by print() and str()
        return ','.join(self.items)


a = MyList()
print(a)
# Output:
# 0,1,2,3,4,5,6,7,8,9

3. Why len is not an ordinary method

If x is an instance of a built-in type, len(x) is very fast. The reason is that CPython reads the length directly from a C structure, without calling any method at all. Getting the number of elements in a collection is a very common operation, so it must be efficient for str, list, memoryview, and other types. In other words, len is not an ordinary method so that Python's own data structures can take a shortcut, and the same is true of abs. Thanks to the special method __len__, we can still use len with our own data types. This approach strikes a balance between keeping built-in types efficient and keeping the language consistent, and it echoes a line from The Zen of Python: "Special cases aren't special enough to break the rules."
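For user-defined classes, len and abs go through the special methods. The sketch below uses a hypothetical Vector class (not from the original article) to show both, and also shows that len validates what __len__ returns, here rejecting a negative value.

```python
import math


class Vector:
    """Hypothetical 2D vector used to illustrate len() and abs()."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __len__(self):
        # len(v) ends up here for user-defined classes
        return 2

    def __abs__(self):
        # abs(v) ends up here
        return math.hypot(self.x, self.y)


class BadLength:
    """Hypothetical class whose __len__ returns an invalid value."""

    def __len__(self):
        return -1


v = Vector(3, 4)
print(len(v))  # 2
print(abs(v))  # 5.0

try:
    len(BadLength())
except ValueError as exc:
    # len() checks the value returned by __len__
    print('rejected:', exc)
```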

4. Data model and special methods

The data model describes the object protocols, and special methods are how built-in objects implement those protocols. To make our own classes behave like built-in types, in other words to write more Pythonic code, we can implement special methods instead of subclassing.

A basic requirement for an object is a reasonable string representation. We can meet this requirement with __repr__ and __str__: __repr__ is meant for debugging, while __str__ is meant for end users. This is why the data model has both special methods.
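The difference between the two representations can be sketched with a hypothetical Point class (invented for this example). print and str use __str__, while repr and containers displaying their items use __repr__.

```python
class Point:
    """Hypothetical class showing the two string representations."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous, aimed at developers
        return f'Point({self.x!r}, {self.y!r})'

    def __str__(self):
        # Readable, aimed at end users
        return f'({self.x}, {self.y})'


p = Point(1, 2)
print(str(p))   # (1, 2)
print(repr(p))  # Point(1, 2)
print([p])      # [Point(1, 2)]  containers use __repr__ for their items
```

If a class defines only __repr__, str and print fall back to it, so __repr__ is usually the first of the two worth implementing.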

There are many special methods in Python. The main topic here is the data model. I hope you can understand the design philosophy of the Python language and think about how to write more Pythonic code.
