Using Ajax in Python 3 crawlers

Ajax stands for Asynchronous JavaScript and XML. It is not a programming language but a technique that uses JavaScript to exchange data with the server and update part of a web page, without refreshing the page or changing its URL.

To update the content of a traditional web page, you must refresh the entire page. With Ajax, the content can be updated without a full refresh: the page exchanges data with the server in the background, and once the data arrives, JavaScript modifies the page so that the new content is displayed.

You can try a few examples on W3School: http://www.w3school.com.cn/ajax/ajax_xmlhttprequest_send.asp.

1. Example introduction

When browsing the web, we often find pages that load more content as we scroll down. Take Weibo as an example, using my personal homepage: https://m.weibo.cn/u/2830678474. Switch to the Weibo tab and keep scrolling down. After a few posts, the page stops scrolling and a loading animation appears; a moment later, new posts appear below. This is Ajax loading in action, as shown in Figure 6-1.

Notice that the page has not been fully refreshed: its URL has not changed, yet it now contains new content, namely the posts that were loaded afterwards. This is how new data is fetched and displayed through Ajax.

2. Basic principles

Now that we have a preliminary understanding of Ajax, let's look at how it works. The process from sending an Ajax request to updating the web page can be divided into the following 3 steps:

(1) Send the request; (2) parse the content; (3) render the web page.

We will introduce these processes in detail below.

Send request

We know that JavaScript implements the interactive behavior of a page. Ajax is no exception: it is also implemented in JavaScript, by executing code like the following:

var xmlhttp;
if (window.XMLHttpRequest) { // code for IE7+, Firefox, Chrome, Opera, Safari
    xmlhttp = new XMLHttpRequest();
} else { // code for IE6, IE5
    xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
xmlhttp.onreadystatechange = function () {
    if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
        document.getElementById("myDiv").innerHTML = xmlhttp.responseText;
    }
};
xmlhttp.open("POST", "/ajax/", true);
xmlhttp.send();

This is the lowest-level JavaScript implementation of Ajax. It creates an XMLHttpRequest object, assigns a listener to its onreadystatechange property, and then calls the open() and send() methods to send a request to a URL (that is, to the server). This mirrors sending a request in Python and receiving the response, except that here the request is sent by JavaScript. Because a listener has been set, the function assigned to onreadystatechange is triggered when the server returns a response, and the response content can be parsed inside that function.
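As a rough Python counterpart to the open() and send() calls above, the sketch below only builds the request object and does not send it. The host http://example.com and the helper name build_ajax_request are placeholders invented for illustration. The X-Requested-With header is also an assumption: some JavaScript libraries add it to Ajax requests, but the raw XMLHttpRequest code above does not.

```python
from urllib.parse import urljoin
from urllib.request import Request


def build_ajax_request(base_url, path="/ajax/", body=b""):
    """Build (but do not send) a POST request like xmlhttp.open("POST", "/ajax/", true)."""
    url = urljoin(base_url, path)
    # Assumption: this header is optional; some servers use it to recognise Ajax calls.
    headers = {"X-Requested-With": "XMLHttpRequest"}
    return Request(url, data=body, headers=headers, method="POST")


# http://example.com is a placeholder host, not a real Ajax endpoint.
req = build_ajax_request("http://example.com")
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen() would actually send it; that step is left out here so the sketch runs without network access.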

Parse content

Once the response arrives, the function assigned to the onreadystatechange property is triggered, and the response content is available through the responseText property of xmlhttp. This is similar to using requests in Python to send a request to the server and get a response. The returned content may be HTML or JSON, and it only needs further processing with JavaScript inside that function. If it is JSON, for example, it can be parsed and converted.
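The same parsing step can be sketched in Python. The helper parse_response and the sample payload are made up for illustration and do not reflect any real server's response format; the point is only that a JSON body becomes Python data, while anything else (such as HTML) is kept as text.

```python
import json


def parse_response(text):
    """Return parsed data if the body is JSON, otherwise the raw text (e.g. HTML)."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


# Hypothetical response bodies, invented for this sketch.
json_body = '{"ok": 1, "data": {"cards": [{"id": "1", "text": "hello"}]}}'
html_body = "<p>hello</p>"

parsed = parse_response(json_body)
print(parsed["data"]["cards"][0]["text"])
print(parse_response(html_body))
```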

Render webpage

JavaScript can change the content of a web page. After parsing the response, it can go on to modify the page based on the parsed content. For example, assigning to document.getElementById().innerHTML changes the source code inside an element, which changes what the page displays. Such operations are called DOM operations, that is, operations on the web page document such as modifying or deleting nodes.

In the example above, document.getElementById("myDiv").innerHTML = xmlhttp.responseText replaces the HTML inside the node whose ID is myDiv with the content returned by the server. The new data then appears inside the myDiv element, and that part of the page appears updated.

All 3 steps are carried out by JavaScript, which handles the entire request, parsing, and rendering process.

Looking back at Weibo's scroll-to-load behavior, it is simply JavaScript sending an Ajax request to the server, obtaining the new Weibo data, parsing it, and rendering it on the page.
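To make the three steps concrete, here is a toy simulation in Python. Nothing in it talks to Weibo: fake_server is a stand-in that returns invented posts as a JSON string, and the "page" is just a Python list. It is only meant to show how each load appends new content without rebuilding what is already there.

```python
import json


def fake_server(page, page_size=3):
    """Stand-in for the server: return one page of invented posts as a JSON string."""
    start = (page - 1) * page_size
    posts = [{"id": i, "text": f"post {i}"}
             for i in range(start + 1, start + page_size + 1)]
    return json.dumps({"page": page, "posts": posts})


def load_more(rendered, page):
    raw = fake_server(page)                             # step 1: send the request
    data = json.loads(raw)                              # step 2: parse the content
    rendered.extend(p["text"] for p in data["posts"])   # step 3: render (append to the page)
    return rendered


timeline = []            # the "page" content shown so far
load_more(timeline, 1)   # first screen
load_more(timeline, 2)   # user scrolls down, more posts are appended
print(timeline)
```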

So the real data comes from one Ajax request after another. To scrape this data, we need to know how these requests are sent, where they are sent, and what parameters they carry. Once we know that, we can simulate the requests with Python and get the results ourselves.

In the next section, we will see where to inspect these background Ajax requests, how they are sent, and what parameters they carry.
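As a minimal sketch of what "simulating the request" involves, the code below assembles a request URL from an endpoint and a set of parameters. The endpoint https://example.com/api/list and the parameter names uid and page are hypothetical; the real values have to be read from the Ajax requests the page actually sends.

```python
from urllib.parse import urlencode


def build_url(endpoint, params):
    """Join an endpoint and its query parameters into a full request URL."""
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and parameters, chosen only for illustration.
endpoint = "https://example.com/api/list"
urls = [build_url(endpoint, {"uid": "123", "page": page}) for page in range(1, 4)]
for url in urls:
    print(url)
```

Each generated URL corresponds to one simulated Ajax request; fetching them in turn would reproduce what the browser does as the user keeps scrolling.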
