Ajax stands for Asynchronous JavaScript and XML. It is not a programming language, but a technique that uses JavaScript to exchange data with the server and update part of a web page without refreshing the page or changing its URL.
With a traditional web page, updating the content requires refreshing the entire page. With Ajax, the content can be updated without a full refresh. During this process the page exchanges data with the server in the background, and once the data arrives, JavaScript modifies the page so that its content is updated.
You can try a few examples on W3School: http://www.w3school.com.cn/ajax/ajax_xmlhttprequest_send.asp.
1. Example introduction
When browsing the web, we find that many pages load more content as we scroll down. Take Weibo as an example, using my personal homepage: https://m.weibo.cn/u/2830678474. Switch to the Weibo tab and keep scrolling down. After a few posts, the page stops and a loading animation appears. A moment later, more Weibo posts appear below. This is Ajax loading in action, as shown in Figure 6-1.
Notice that the page has not been fully refreshed, which means its URL has not changed, yet new content has appeared on it: the Weibo posts that were loaded afterward. This is the process of obtaining new data through Ajax and presenting it.
2. Basic principles
After this first look at Ajax, let's examine its basic principles. The process from sending an Ajax request to updating the web page can be divided into the following 3 steps:
(1) Send the request; (2) parse the content; (3) render the web page.
We will introduce these processes in detail below.
Send request
We know that JavaScript can implement various interactive functions of the page, and Ajax is no exception. It is also implemented by JavaScript. In fact, the following code is executed:
var xmlhttp;
if (window.XMLHttpRequest) { // code for IE7+, Firefox, Chrome, Opera, Safari
    xmlhttp = new XMLHttpRequest();
} else { // code for IE6, IE5
    xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
}
xmlhttp.onreadystatechange = function () {
    if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
        document.getElementById("myDiv").innerHTML = xmlhttp.responseText;
    }
};
xmlhttp.open("POST", "/ajax/", true);
xmlhttp.send();
This is JavaScript's lowest-level implementation of Ajax. It creates a new XMLHttpRequest object, assigns a function to the onreadystatechange property to register a listener, and then calls the open() and send() methods to send a request to a URL (that is, to the server). This resembles sending a request in Python and receiving the response, except that here the request is sent by JavaScript. Because a listener has been registered, the function assigned to onreadystatechange is triggered when the server returns a response, and the response content can then be parsed inside that function.
Parse content
After the response arrives, the function assigned to the onreadystatechange property is triggered. At this point, the response content can be read from the responseText property of xmlhttp. This is similar to using requests in Python to send a request to the server and get a response. The returned content may be HTML or JSON, and it only needs to be processed further with JavaScript inside that function. For example, if it is JSON, it can be parsed and converted.
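Since the same parsing step applies when a crawler receives such a response, here is a minimal Python sketch of it. The response body and its field names (ok, data, cards, id, text) are assumptions made up for illustration, not the actual format returned by any real site.

```python
import json

# Hypothetical response body; the structure and field names are assumptions
# for illustration only.
response_text = (
    '{"ok": 1, "data": {"cards": ['
    '{"id": "1", "text": "first post"}, '
    '{"id": "2", "text": "second post"}]}}'
)


def extract_posts(body):
    """Parse a JSON response body and return a list of (id, text) pairs."""
    payload = json.loads(body)
    # Fall back to an empty list if the expected keys are missing.
    cards = payload.get("data", {}).get("cards", [])
    return [(card["id"], card["text"]) for card in cards]


posts = extract_posts(response_text)
for post_id, text in posts:
    print(post_id, text)
```

Using .get() with defaults means a response without the expected keys yields an empty list instead of raising an error, which is often convenient when the server returns an error payload.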
Render webpage
JavaScript can change the content of a web page. After parsing the response, JavaScript can go on to modify the page based on the parsed content. For example, an operation such as document.getElementById().innerHTML changes the source code inside an element, so that the content displayed on the page changes. This kind of operation is called a DOM operation, that is, an operation on the Document, such as changing or deleting nodes.
In the example above, document.getElementById("myDiv").innerHTML=xmlhttp.responseText replaces the HTML inside the node whose ID is myDiv with the content returned by the server, so the new data appears inside the myDiv element and that part of the page is updated.
All 3 of these steps are carried out by JavaScript, which handles the entire process of requesting, parsing, and rendering.
Looking back at the scroll-to-load behavior on Weibo, what actually happens is that JavaScript sends an Ajax request to the server, obtains the new Weibo data, parses it, and renders it on the page.
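In the browser this update is done by JavaScript on the live DOM. Purely to illustrate the idea of locating a node by its ID and replacing its content, here is a rough Python analogue using the standard library's ElementTree. It assumes a small, well-formed snippet and a plain-text response; the helper name set_element_text is hypothetical, and this is not how a browser implements innerHTML.

```python
import xml.etree.ElementTree as ET


def set_element_text(root, element_id, new_text):
    """Replace the content of the descendant whose id attribute is element_id."""
    node = root.find(f".//*[@id='{element_id}']")
    if node is None:
        raise KeyError(f"no element with id {element_id!r}")
    # Drop any existing children, then set the new text
    # (a text-only analogue of assigning to innerHTML).
    for child in list(node):
        node.remove(child)
    node.text = new_text
    return root


# A tiny well-formed page standing in for the real document.
page = ET.fromstring('<html><body><div id="myDiv">old content</div></body></html>')
response_text = "new content from the server"  # stands in for xmlhttp.responseText

set_element_text(page, "myDiv", response_text)
updated_html = ET.tostring(page, encoding="unicode")
print(updated_html)
```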
So the real data is obtained through one Ajax request after another. To scrape this data, you need to know how these requests are sent, where they are sent, and what parameters they carry. Once we know that, can we simulate the same request with Python and get the result?
In the next section, we will look at where to observe these background Ajax requests, and how to find out how they are sent and what parameters they carry.
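As a rough sketch of what such a simulation could look like, the following Python code builds a request resembling one a page's JavaScript might send. The endpoint https://example.com/api/feed and the parameters page and type are placeholders, not a real API. The X-Requested-With header is one that some JavaScript libraries attach to Ajax requests; whether a given server checks it is an assumption to verify case by case.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_ajax_request(base_url, params):
    """Build a GET request that mimics an Ajax call with query parameters."""
    query = urlencode(params)
    headers = {
        # Added by some JavaScript libraries to mark a request as Ajax.
        "X-Requested-With": "XMLHttpRequest",
        "Accept": "application/json",
    }
    return Request(f"{base_url}?{query}", headers=headers)


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body (performs real network I/O)."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder endpoint and parameters, for illustration only.
request = build_ajax_request("https://example.com/api/feed", {"page": 2, "type": "uid"})
print(request.get_method(), request.full_url)
```

Building the request is kept separate from sending it, so the URL and headers can be inspected and compared with what the browser sends before any network call is made. Calling fetch_json(request) would then send it.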