Python—requests module detailed explanation

1、 Module description

requests is an HTTP library released under the Apache 2.0 license.

It is written in Python.

It is more concise than the urllib2 module.

Requests supports HTTP keep-alive and connection pooling, uses cookies to maintain sessions, supports file uploads, automatically decodes response content, and automatically encodes internationalized URLs and POST data.

It wraps the lower-level HTTP machinery (urllib3) in a high-level interface, which makes network requests in Python much more pleasant to write. Requests can perform most of the HTTP operations a browser performs.

Modern, international and friendly.

Within a Session, requests keeps connections alive automatically (persistent connections).

2、 Getting Started

1 ) Import module

import requests

2 ) Sending a request

Sample code: fetch a web page (a personal GitHub page)

import requests

r = requests.get('https://github.com/Ranxf')       #The most basic get request without parameters
r1 = requests.get(url='http://dict.baidu.com/s', params={'wd':'python'})      #Get request with parameters

The other HTTP methods are called in the same way:

requests.get('https://github.com/timeline.json')      # GET request
requests.post('http://httpbin.org/post')              # POST request
requests.put('http://httpbin.org/put')                # PUT request
requests.delete('http://httpbin.org/delete')          # DELETE request
requests.head('http://httpbin.org/get')               # HEAD request
requests.options('http://httpbin.org/get')            # OPTIONS request

3 ) Pass parameters for url

>>> url_params = {'key': 'value'}       # Parameters are passed as a dictionary; a key whose value is None is not added to the URL
>>> r = requests.get('your url', params=url_params)
>>> print(r.url)
your url?key=value
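To see how params ends up in the query string without sending anything, a request can be prepared offline. The following is only a sketch: example.com and the parameter names are placeholders that do not come from the examples above.

```python
import requests

# Hypothetical URL and parameters, used only to show how the query string is built.
params = {"key": "value", "skipped": None, "tag": ["a", "b"]}

# Request(...).prepare() builds the final URL without touching the network.
prepared = requests.Request("GET", "https://example.com/search", params=params).prepare()
built_url = prepared.url
print(built_url)
```

The key whose value is None is left out, and a list value is expanded into one key=value pair per item.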

4 ) Content of response

r.encoding                       # Get the current encoding
r.encoding = 'utf-8'             # Set the encoding
r.text                           # Response body as a string, decoded with r.encoding (taken from the response headers unless you set it yourself)
r.content                        # Response body as bytes; gzip and deflate compression is decoded for you automatically

r.headers                        # Response headers as a case-insensitive dictionary; r.headers.get('x') returns None if the header is missing, while r.headers['x'] raises KeyError

r.status_code                    # Response status code
r.raw                            # Raw urllib3 response object; pass stream=True in the request, then use r.raw.read()
r.ok                             # True when the status code is below 400, so it tells you whether a request (for example a login) succeeded
# Special methods
r.json()                         # Built-in JSON decoder in Requests; the body must be valid JSON, otherwise an exception is raised
r.raise_for_status()             # Raise an exception for a failed request (4xx or 5xx status code)
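The header lookup rules and the meaning of r.ok can be checked without a server. This sketch builds a Response object by hand purely for illustration (a real Response always comes from a request), so the status code and header below are assumptions, not real server output.

```python
import requests
from requests.models import Response

# A hand-built Response stands in for a real server reply (offline demo only).
resp = Response()
resp.status_code = 404
resp.headers["Content-Type"] = "application/json"

# Header lookup ignores case; .get() returns None for a missing header.
content_type = resp.headers.get("content-type")
missing = resp.headers.get("x-not-there")

# raise_for_status() raises HTTPError for 4xx/5xx codes, and r.ok is False for them.
try:
    resp.raise_for_status()
    raised = False
except requests.HTTPError:
    raised = True

print(content_type, missing, resp.ok, raised)
```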

Post json request:

import requests
import json

r = requests.post('https://api.github.com/some/endpoint', data=json.dumps({'some': 'data'}))
print(r.json())
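The difference between passing a dict, a JSON string, or the json argument is easiest to see on a prepared request. This is a sketch that never contacts a server; the URL is a placeholder.

```python
import json

import requests

url = "https://example.com/api"  # placeholder endpoint, never contacted
payload = {"some": "data"}

# data=dict is form-encoded, data=str is sent as-is, json=dict is serialized for you.
as_form = requests.Request("POST", url, data=payload).prepare()
as_string = requests.Request("POST", url, data=json.dumps(payload)).prepare()
as_json = requests.Request("POST", url, json=payload).prepare()

for prepared in (as_form, as_string, as_json):
    print(prepared.headers.get("Content-Type"), prepared.body)
```

Note that data=json.dumps(...) sets no Content-Type header by itself, so a server that requires application/json needs either an explicit header or the json argument.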

5 ) Custom header and cookie information

import json

header = {'user-agent': 'my-app/0.0.1'}
cookie = {'key': 'value'}
r = requests.get('your url', headers=header, cookies=cookie)      # requests.post accepts the same arguments
data = {'some': 'data'}
headers = {'content-type': 'application/json', 'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}

# The header announces JSON, so the dict is serialized with json.dumps instead of being form-encoded
r = requests.post('https://api.github.com/some/endpoint', data=json.dumps(data), headers=headers)
print(r.text)

6 ) Response status code

Every requests call returns a Response object that stores the server's reply, such as the r.text and r.status_code used in the examples above.
Getting the response body as text: when you access r.text, the body is decoded with the response's text encoding. You can change r.encoding so that r.text is decoded with an encoding of your choice.

r = requests.get('http://www.itwhy.org')
print(r.text, '\n{}\n'.format('*' * 79), r.encoding)
r.encoding = 'GBK'
print(r.text, '\n{}\n'.format('*' * 79), r.encoding)

Sample code:

import requests

r = requests.get('https://github.com/Ranxf')       # The most basic GET request, without parameters
print(r.status_code)                               # Get the status code
r1 = requests.get(url='http://dict.baidu.com/s', params={'wd': 'python'})      # GET request with parameters
print(r1.url)
print(r1.text)        # Print the decoded response body

Output:

/usr/bin/python3.5 /home/rxf/python3_1000/1000/python3_server/python3_requests/demo1.py
200
http://dict.baidu.com/s?wd=python
…………

Process finished with exit code 0

r.status_code                      # If it is a 4xx or 5xx code, r.raise_for_status() raises an exception

7 ) Response

r.headers                                  # Response headers, as a dictionary
r.request.headers                          # Headers that were sent to the server
r.cookies                                  # Cookies returned by the server
r.history                                  # Redirect history; pass allow_redirects=False in the request to disable redirects

8 ) Timeout

r = requests.get('url', timeout=1)           # Timeout in seconds; a single value applies to connecting and to each wait for data from the server, not to the total download time
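The timeout can also be given as a (connect timeout, read timeout) tuple. The sketch below shows the read timeout firing; it assumes a local socket that accepts connections but never answers, which stands in for a slow server and is not part of the original example.

```python
import socket

import requests

# A local socket that accepts connections but never answers (stand-in for a slow server).
silent = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent.bind(("127.0.0.1", 0))
silent.listen(1)
port = silent.getsockname()[1]

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables so the demo stays local

try:
    # (connect timeout, read timeout), both in seconds
    session.get(f"http://127.0.0.1:{port}/", timeout=(1, 0.3))
    outcome = "response received"
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"
finally:
    silent.close()
    session.close()

print(outcome)
```

The connection itself succeeds, so only the read timeout applies here; both exception classes derive from requests.exceptions.Timeout.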

9) Session object, able to maintain certain parameters across requests

s = requests.Session()
s.auth = ('auth', 'passwd')
s.headers.update({'key': 'value'})      # update() keeps the default headers; assigning a new dict would replace them
r = s.get('url')
r1 = s.get('url1')
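How session-level settings combine with per-request ones can be inspected with Session.prepare_request, which does the merge without sending anything. This is a sketch; the header name, parameters and URL are made up for illustration.

```python
import requests

session = requests.Session()
session.headers.update({"X-Demo-Token": "abc123"})   # hypothetical session-wide header
session.params = {"lang": "en"}                      # hypothetical default query parameter

# prepare_request() merges the session settings into this one request, offline.
req = requests.Request("GET", "https://example.com/items", params={"page": "2"})
prepared = session.prepare_request(req)

print(prepared.url)
print(prepared.headers["X-Demo-Token"], prepared.headers["User-Agent"])
```

The prepared request carries both the session defaults and the per-request parameter, which is what every s.get or s.post call does internally.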

10 ) Proxy

proxies = {'http': 'ip1', 'https': 'ip2'}
requests.get('url', proxies=proxies)
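Which proxy is used for a given URL depends on the keys of the proxies dictionary. The helper requests.utils.select_proxy exposes that lookup, so it can be tried offline; the target URLs below are placeholders, and the proxy addresses are the sample ones used elsewhere in this article.

```python
from requests.utils import select_proxy

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.100:4444",
    "http://10.20.1.128": "http://10.10.1.10:5323",  # entry for one specific host
}

# A scheme://host key wins over the plain scheme key.
plain_http = select_proxy("http://example.com/page", proxies)
secure = select_proxy("https://example.com/page", proxies)
special_host = select_proxy("http://10.20.1.128/status", proxies)

print(plain_http, secure, special_host)
```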

Summary:

# HTTP request type
# get type
r = requests.get('https://github.com/timeline.json')
# post type
r = requests.post("http://m.ctrip.com/post")
# put type
r = requests.put("http://m.ctrip.com/put")
# delete type
r = requests.delete("http://m.ctrip.com/delete")
# head type
r = requests.head("http://m.ctrip.com/head")
# options type
r = requests.options("http://m.ctrip.com/get")

# Get response content
print(r.content)  # Response body as bytes (in Python 3, non-ASCII text such as Chinese prints as escape sequences)
print(r.text) #Display in text

# URL passing parameters
payload ={'keyword':'Hong Kong','salecityid':'2'}
r = requests.get("http://m.ctrip.com/webapp/tourvisa/visa_list", params=payload) 
print(r.url)  # For example http://m.ctrip.com/webapp/tourvisa/visa_list?keyword=Hong+Kong&salecityid=2

# Obtain/Modify web page encoding
r = requests.get('https://github.com/timeline.json')
print (r.encoding)

# json processing
r = requests.get('https://github.com/timeline.json')
print(r.json())  # r.json() is built into Requests; importing json is not required

# Custom request header
url ='http://m.ctrip.com'
headers ={'User-Agent':'Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 4 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.166 Mobile Safari/535.19'}
r = requests.post(url, headers=headers)
print (r.request.headers)

# Complex post request
import json

url = 'http://m.ctrip.com'
payload = {'some': 'data'}
r = requests.post(url, data=json.dumps(payload))  # To send a JSON string instead of form-encoded data, serialize the dict with json.dumps first

# post multi-part encoded file
url ='http://m.ctrip.com'
files ={'file':open('report.xls','rb')}
r = requests.post(url, files=files)

# Response status code
r = requests.get('http://m.ctrip.com')
print(r.status_code)

# Response header
r = requests.get('http://m.ctrip.com')
print(r.headers)
print(r.headers['Content-Type'])
print(r.headers.get('content-type'))  # Two ways to access a single response header
    
# Cookies
url ='http://example.com/some/cookie/setting/url'
r = requests.get(url)
r.cookies['example_cookie_name']    #Read cookies
    
url ='http://m.ctrip.com/cookies'
cookies =dict(cookies_are='working')
r = requests.get(url, cookies=cookies) #Send cookies

# Set timeout
r = requests.get('http://m.ctrip.com', timeout=0.001)

# Set access proxy
proxies ={"http":"http://10.10.1.10:3128","https":"http://10.10.1.100:4444",}
r = requests.get('http://m.ctrip.com', proxies=proxies)

# If the proxy requires a username and password, write it like this:
proxies ={"http":"http://user:[email protected]:3128/",}

3、 Sample code

GET request

# 1、 Example without parameters

import requests

ret = requests.get('https://github.com/timeline.json')

print(ret.url)
print(ret.text)


# 2、 Example with parameters

import requests

payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.get("http://httpbin.org/get", params=payload)

print(ret.url)
print(ret.text)

POST request

# 1、 Basic POST example

import requests

payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.post("http://httpbin.org/post", data=payload)
print(ret.text)


# 2、 Example that sends request headers and data

import requests
import json

url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}

ret = requests.post(url, data=json.dumps(payload), headers=headers)
print(ret.text)
print(ret.cookies)

Request parameters

def request(method, url, **kwargs):
    """Constructs and sends a :class:`Request <Request>`.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query string for the :class:`Request`.
    :param data: (optional) Dictionary, bytes, or file-like object to send in the body of the :class:`Request`.
    :param json: (optional) json data to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
        to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send data
        before giving up, as a float, or a :ref:`(connect timeout, read
        timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Set to True if POST/PUT/DELETE redirect following is allowed.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) whether the SSL cert will be verified. A CA_BUNDLE path can also be provided. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

    Usage::

      >>> import requests
      >>> req = requests.request('GET', 'http://httpbin.org/get')
      <Response [200]>
    """

Parameter examples

import requests

def param_method_url():
 # requests.request(method='get', url='http://127.0.0.1:8000/test/')
 # requests.request(method='post', url='http://127.0.0.1:8000/test/')
 pass

def param_param():
 # - Can be a dictionary
 # - Can be a string
 # - Can be bytes (within ascii encoding)

 # requests.request(method='get',
 # url='http://127.0.0.1:8000/test/',
 # params={'k1':'v1','k2':'Utility bill'})

 # requests.request(method='get',
 # url='http://127.0.0.1:8000/test/',
 # params="k1=v1&k2=Utility bill&k3=v3&k3=vv3")

 # requests.request(method='get',
 # url='http://127.0.0.1:8000/test/',
 # params=bytes("k1=v1&k2=k2&k3=v3&k3=vv3", encoding='utf8'))

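 # Error case: bytes params must be pure ASCII; bytes containing non-ASCII characters (for example Chinese text in place of 'Utility bill') are rejected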
 # requests.request(method='get',
 # url='http://127.0.0.1:8000/test/',
 # params=bytes("k1=v1&k2=Utility bill&k3=v3&k3=vv3", encoding='utf8'))
 pass

def param_data():
 # Can be a dictionary
 # Can be a string
 # Can be bytes
 # Can be a file object

 # requests.request(method='POST',
 # url='http://127.0.0.1:8000/test/',
 # data={'k1':'v1','k2':'Utility bill'})

 # requests.request(method='POST',
 # url='http://127.0.0.1:8000/test/',
 # data="k1=v1; k2=v2; k3=v3; k3=v4"
    # )

 # requests.request(method='POST',
 # url='http://127.0.0.1:8000/test/',
 # data="k1=v1;k2=v2;k3=v3;k3=v4",
 # headers={'Content-Type':'application/x-www-form-urlencoded'}
    # )

 # requests.request(method='POST',
 # url='http://127.0.0.1:8000/test/',
 # data=open('data_file.py', mode='r', encoding='utf-8'), #The content of the file is: k1=v1;k2=v2;k3=v3;k3=v4
 # headers={'Content-Type':'application/x-www-form-urlencoded'}
    # )
 pass

def param_json():
 # Serialize the corresponding data in json into a string, json.dumps(...)
 # Then sent to the body of the server, and Content-Type is{'Content-Type':'application/json'}
 requests.request(method='POST',
      url='http://127.0.0.1:8000/test/',
      json={'k1':'v1','k2':'Utility bill'})

def param_headers():
 # Send the request header to the server
 requests.request(method='POST',
      url='http://127.0.0.1:8000/test/',
      json={'k1':'v1','k2':'Utility bill'},
      headers={'Content-Type':'application/x-www-form-urlencoded'})

def param_cookies():
 # Send cookies to the server
 requests.request(method='POST',
      url='http://127.0.0.1:8000/test/',
      data={'k1':'v1','k2':'v2'},
      cookies={'cook1':'value1'},)
 # CookieJar can also be used (the dictionary form is encapsulated on this basis)
 from http.cookiejar import CookieJar
 from http.cookiejar import Cookie

 obj =CookieJar()
 obj.set_cookie(Cookie(version=0, name='c1', value='v1', port=None, domain='', path='/', secure=False, expires=None,
       discard=True, comment=None, comment_url=None, rest={'HttpOnly': None}, rfc2109=False,
       port_specified=False, domain_specified=False, domain_initial_dot=False, path_specified=False))
 requests.request(method='POST',
      url='http://127.0.0.1:8000/test/',
      data={'k1':'v1','k2':'v2'},
      cookies=obj)

def param_files():
 # Send a file
 # file_dict = {
 #  'f1': open('readme', 'rb')
 # }
 # requests.request(method='POST',
 #      url='http://127.0.0.1:8000/test/',
 #      files=file_dict)

 # Send a file with a custom file name
 # file_dict = {
 #  'f1': ('test.txt', open('readme', 'rb'))
 # }
 # requests.request(method='POST',
 #      url='http://127.0.0.1:8000/test/',
 #      files=file_dict)

 # Send string content as a file with a custom file name
 # file_dict = {
 #  'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf")
 # }
 # requests.request(method='POST',
 #      url='http://127.0.0.1:8000/test/',
 #      files=file_dict)

 # Send string content with a custom file name, content type and extra headers
 # file_dict = {
 #  'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf", 'application/text', {'k1': '0'})
 # }
 # requests.request(method='POST',
 #      url='http://127.0.0.1:8000/test/',
 #      files=file_dict)

 pass

def param_auth():
 from requests.auth import HTTPBasicAuth, HTTPDigestAuth

 ret = requests.get('https://api.github.com/user', auth=HTTPBasicAuth('wupeiqi', 'sdfasdfasdf'))
 print(ret.text)

 # ret = requests.get('http://192.168.1.1',
 # auth=HTTPBasicAuth('admin','admin'))
 # ret.encoding ='gbk'
 # print(ret.text)

 # ret = requests.get('http://httpbin.org/digest-auth/auth/user/pass', auth=HTTPDigestAuth('user','pass'))
 # print(ret)
    #

def param_timeout():
 # ret = requests.get('http://google.com/', timeout=1)
 # print(ret)

 # ret = requests.get('http://google.com/', timeout=(5,1))
 # print(ret)
 pass

def param_allow_redirects():
 ret = requests.get('http://127.0.0.1:8000/test/', allow_redirects=False)
 print(ret.text)

def param_proxies():
 # proxies = {
 #  "http": "61.172.249.96:80",
 #  "https": "http://61.185.219.126:3128",
 # }

 # proxies ={'http://10.20.1.128':'http://10.10.1.10:5323'}

 # ret = requests.get("http://www.proxy360.cn/Proxy", proxies=proxies)
 # print(ret.headers)

 # from requests.auth import HTTPProxyAuth
    #
 # proxyDict = {
 #  'http': '77.75.105.165',
 #  'https': '77.75.105.165'
 # }
 # auth =HTTPProxyAuth('username','mypassword')
    #
 # r = requests.get("http://www.google.com", proxies=proxyDict, auth=auth)
 # print(r.text)

 pass

def param_stream():
 ret = requests.get('http://127.0.0.1:8000/test/', stream=True)
 print(ret.content)
 ret.close()

 # from contextlib import closing
 # with closing(requests.get('http://httpbin.org/get', stream=True)) as r:
 #  # The response is processed here.
 #  for i in r.iter_content():
 #   print(i)

def requests_session():
 import requests

 session = requests.Session()

 ### 1、 First log in to any page to get the cookie

 i1 = session.get(url="http://dig.chouti.com/help/service")

 ### 2、 The user logs in, carries the last cookie, and the background authorizes the gpsd in the cookie
 i2 = session.post(
  url="http://dig.chouti.com/login",
  data={'phone':"8615131255089",'password':"xxxxxx",'oneMonth':""})

 i3 = session.post(
  url="http://dig.chouti.com/link/vote?linksId=8589623",)
 print(i3.text)

JSON request:

#! /usr/bin/python3
import requests
import json


class url_request():
    def __init__(self):
        ''' init '''


if __name__ == '__main__':
    headers = {'Content-Type': 'application/json'}
    payload = {'CountryName': 'China', 'ProvinceName': 'Sichuan Province', 'L1CityName': 'chengdu', 'L2CityName': 'yibing', 'TownName': '', 'Longitude': '107.33393', 'Latitude': '33.157131', 'Language': 'CN'}
    # The keyword is headers (a misspelled "heards" raises TypeError); the dict is serialized to match the JSON content type
    r = requests.post("http://www.xxxxxx.com/CityLocation/json/LBSLocateCity", headers=headers, data=json.dumps(payload))
    data = r.json()
    if r.status_code != 200:
        print('LBSLocateCity API Error' + str(r.status_code))
    print(data['CityEntities'][0]['CityID'])  # Print the value of a key in the returned JSON
    print(data['ResponseStatus']['Ack'])
    print(json.dumps(data, indent=4, sort_keys=True, ensure_ascii=False))  # Pretty-print the JSON; ensure_ascii must be False, otherwise Chinese is shown as \u escapes

XML request:

#! /usr/bin/python3
import requests


class url_request():
    def __init__(self):
        """init"""


if __name__ == '__main__':
    headers = {'Content-type': 'text/xml'}
    XML = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body><Request xmlns="http://tempuri.org/"><jme><JobClassFullName>WeChatJSTicket.JobWS.Job.JobRefreshTicket,WeChatJSTicket.JobWS</JobClassFullName><Action>RUN</Action><Param>1</Param><HostIP>127.0.0.1</HostIP><JobInfo>1</JobInfo><NeedParallel>false</NeedParallel></jme></Request></soap:Body></soap:Envelope>'
    url = 'http://jobws.push.mobile.xxxxxxxx.com/RefreshWeiXInTokenJob/RefreshService.asmx'
    r = requests.post(url=url, headers=headers, data=XML)  # The keyword is headers; a misspelled "heards" raises TypeError
    data = r.text
    print(data)

Status and exception handling

import requests

URL = 'http://ip.taobao.com/service/getIpInfo.php'  # Taobao IP address library API
try:
    r = requests.get(URL, params={'ip': '8.8.8.8'}, timeout=1)
    r.raise_for_status()  # Raise an exception if the response status code is 4xx or 5xx
except requests.RequestException as e:
    print(e)
else:
    result = r.json()
    print(type(result), result, sep='\n')
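Catching requests.RequestException covers every error the library raises, but the subclasses can be told apart when different failures need different handling. The helper below is a hypothetical sketch (the function name and labels are invented for illustration) that relies only on the exception classes requests itself defines.

```python
import requests


def classify(exc):
    """Return a short label for an exception (hypothetical helper for illustration)."""
    # Timeout is checked first because ConnectTimeout is also a ConnectionError.
    if isinstance(exc, requests.exceptions.Timeout):
        return "timeout"
    if isinstance(exc, requests.exceptions.HTTPError):
        return "bad status"
    if isinstance(exc, requests.exceptions.ConnectionError):
        return "connection failed"
    if isinstance(exc, requests.exceptions.RequestException):
        return "other requests error"
    return "not a requests error"


labels = [
    classify(requests.exceptions.ConnectTimeout()),
    classify(requests.exceptions.ReadTimeout()),
    classify(requests.exceptions.HTTPError()),
    classify(requests.exceptions.ConnectionError()),
    classify(requests.exceptions.TooManyRedirects()),
    classify(ValueError()),
]
print(labels)
```

In real code the same checks would usually be written as separate except clauses, ordered from the most specific class to RequestException.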

Upload files

With the requests module you can also upload files, and the file type is handled automatically:

import requests

url = 'http://127.0.0.1:8080/upload'
files = {'file': open('/home/rxf/test.jpg', 'rb')}
# files = {'file': ('report.jpg', open('/home/lyb/sjzl.mpg', 'rb'))}     # Explicitly set the file name

r = requests.post(url, files=files)
print(r.text)

Even more conveniently, you can upload a string as a file:

import requests

url = 'http://127.0.0.1:8080/upload'
files = {'file': ('test.txt', b'Hello Requests.')}     # The file name must be set explicitly

r = requests.post(url, files=files)
print(r.text)
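What the files argument actually sends can be inspected by preparing the request instead of posting it. This sketch reuses the upload URL from the example above but never connects to it.

```python
import requests

files = {"file": ("test.txt", b"Hello Requests.")}

# prepare() encodes the multipart body locally; nothing is sent.
prepared = requests.Request("POST", "http://127.0.0.1:8080/upload", files=files).prepare()

content_type = prepared.headers["Content-Type"]
body = prepared.body
print(content_type)
print(body.decode("utf-8"))
```

The Content-Type header carries a generated boundary, and the body contains the field name, the file name and the content between boundary lines.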

Authentication

Basic authentication (HTTP Basic Auth)

import requests
from requests.auth import HTTPBasicAuth
 
r = requests.get('https://httpbin.org/hidden-basic-auth/user/passwd', auth=HTTPBasicAuth('user','passwd'))
# r = requests.get('https://httpbin.org/hidden-basic-auth/user/passwd', auth=('user','passwd'))    #Shorthand
print(r.json())

Another very popular form of HTTP authentication is digest authentication, and Requests supports it out of the box:

from requests.auth import HTTPDigestAuth

requests.get(URL, auth=HTTPDigestAuth('user', 'pass'))
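For Basic authentication, the (user, password) tuple is turned into an Authorization header. The sketch below prepares the request offline to show that header; it uses the httpbin URL from the example above but does not contact it.

```python
import base64

import requests

prepared = requests.Request(
    "GET",
    "https://httpbin.org/hidden-basic-auth/user/passwd",
    auth=("user", "passwd"),
).prepare()

# The header is "Basic " followed by base64("user:passwd").
auth_header = prepared.headers["Authorization"]
decoded = base64.b64decode(auth_header.split(" ", 1)[1]).decode("utf-8")
print(auth_header)
print(decoded)
```

Because base64 is an encoding and not encryption, Basic authentication should only be used over HTTPS.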

Cookies and session objects

If a response contains some cookies, you can quickly access them:

import requests
 
r = requests.get('http://www.google.com.hk/')
print(r.cookies['NID'])
print(tuple(r.cookies))

To send your cookies to the server, you can use the cookies parameter:

import requests
 
url ='http://httpbin.org/cookies'
cookies ={'testCookies_1':'Hello_Python3','testCookies_2':'Hello_Requests'}
# In Cookie Version 0, special characters such as spaces, square brackets, parentheses, equals signs, commas, double quotes, slashes, question marks, @, colons and semicolons cannot be used in the cookie content.
r = requests.get(url, cookies=cookies)
print(r.json())
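The cookies argument is sent as a single Cookie request header. A minimal offline sketch, reusing the cookie names from the example above, prepares the request and parses that header back into a dictionary.

```python
import requests

cookies = {"testCookies_1": "Hello_Python3", "testCookies_2": "Hello_Requests"}

# prepare() builds the Cookie header locally; nothing is sent.
prepared = requests.Request("GET", "http://httpbin.org/cookies", cookies=cookies).prepare()

cookie_header = prepared.headers["Cookie"]
sent = dict(pair.split("=", 1) for pair in cookie_header.split("; "))
print(cookie_header)
```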

Session objects let you persist certain parameters across requests. Most conveniently, cookies are kept across all requests made from the same Session instance, and this is handled automatically.
Here is a real example, a sign-in script for Kuaipan (kuaipan.cn):

import requests
 
headers ={'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8','Accept-Encoding':'gzip, deflate, compress','Accept-Language':'en-us;q=0.5,en;q=0.3','Cache-Control':'max-age=0','Connection':'keep-alive','User-Agent':'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}
 
s = requests.Session()
s.headers.update(headers)
# s.auth =('superuser','123')
s.get('https://www.kuaipan.cn/account_login.htm')
 
_URL = 'http://www.kuaipan.cn/index.php'
s.post(_URL, params={'ac': 'account', 'op': 'login'},
       data={'username': '****@foxmail.com', 'userpwd': '********', 'isajax': 'yes'})
r = s.get(_URL, params={'ac': 'zone', 'op': 'taskdetail'})
print(r.json())
s.get(_URL, params={'ac': 'common', 'op': 'usersign'})

Example: fetching a web page's source with requests and saving it to a file

This is a basic file-saving operation, but a few points are worth noting:

  1. Install the requests package by running pip install requests on the command line. requests is widely recommended, although the built-in urllib.request can also fetch the source of a web page.

  2. Set the encoding parameter of open() to utf-8, otherwise the saved file may be garbled.

  3. Printing the fetched content directly in cmd can raise various encoding errors, so save it to a file and view it there.

  4. The with open(...) form is the better way to write this, because it releases the file automatically when the block ends.

#! /usr/bin/python3
import requests

''' The requests module fetches the source code of the web page and saves it to a file '''
html = requests.get("http://www.baidu.com")
with open('test.txt', 'w', encoding='utf-8') as f:
    f.write(html.text)

''' Example of reading a txt file one line at a time and saving it to another txt file '''
ff = open('testt.txt', 'w', encoding='utf-8')
with open('test.txt', encoding="utf-8") as f:
    for line in f:
        ff.write(line)
ff.close()  # Close the output file once, after the loop has finished

Printing the data one line at a time on the command line produces encoding errors for Chinese text, so each line is written to another file instead to check that reading works correctly. (Pay attention to the encoding when opening the files.)
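The encoding points above can be tried end to end without an external site. The sketch below is an illustration under stated assumptions: it starts a throwaway local HTTP server (the handler, page content and file name are invented for the demo) that serves UTF-8 text without declaring a charset, then fetches and saves the page.

```python
import http.server
import os
import tempfile
import threading

import requests

PAGE = "<html><body>你好, Requests</body></html>"  # demo content, not a real site


class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGE.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")  # no charset, on purpose
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables so the demo stays local
resp = session.get(f"http://127.0.0.1:{server.server_port}/", timeout=5)
resp.raise_for_status()

header_guess = resp.encoding  # what requests inferred from the headers alone
resp.encoding = "utf-8"       # override before reading resp.text

path = os.path.join(tempfile.mkdtemp(), "page.html")
with open(path, "w", encoding="utf-8") as f:
    f.write(resp.text)
with open(path, encoding="utf-8") as f:
    saved = f.read()

session.close()
server.shutdown()
server.server_close()
print(header_guess, saved == PAGE)
```

When a text response declares no charset, requests falls back to ISO-8859-1, which is why setting resp.encoding before reading resp.text matters for Chinese pages.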

Example of "Auto Login":

#! /usr/bin/env python
# -*- coding:utf-8 -*-
import requests

# ############## method one##############
"""
# ## 1、 First log in to any page to get the cookie
i1 = requests.get(url="http://dig.chouti.com/help/service")
i1_cookies = i1.cookies.get_dict()

# ## 2、 The user logs in, carries the last cookie, and the background authorizes the gpsd in the cookie
i2 = requests.post(
 url="http://dig.chouti.com/login",
 data={'phone':"8615131255089",'password':"xxooxxoo",'oneMonth':""},
 cookies=i1_cookies
)

# ## 3、 Like (just need to bring the authorized gpsd)
gpsd = i1_cookies['gpsd']
i3 = requests.post(
 url="http://dig.chouti.com/link/vote?linksId=8589523",
 cookies={'gpsd': gpsd})
print(i3.text)
"""

# ############## Way two##############
"""
import requests

session = requests.Session()
i1 = session.get(url="http://dig.chouti.com/help/service")
i2 = session.post(
 url="http://dig.chouti.com/login",
 data={'phone':"8615131255089",'password':"xxooxxoo",'oneMonth':""})
i3 = session.post(
 url="http://dig.chouti.com/link/vote?linksId=8589523")
print(i3.text)
"""

#! /usr/bin/env python
# -*- coding:utf-8 -*-
import requests
from bs4 import BeautifulSoup

# ############## method one##############
#
# # 1. Visit the landing page to get authenticity_token
# i1 = requests.get('https://github.com/login')
# soup1 =BeautifulSoup(i1.text, features='lxml')
# tag = soup1.find(name='input', attrs={'name':'authenticity_token'})
# authenticity_token = tag.get('value')
# c1 = i1.cookies.get_dict()
# i1.close()
#
# # 2. Send the authenticity_token, username, password and other form fields to log in
# form_data = {
#  "authenticity_token": authenticity_token,
#  "utf8": "",
#  "commit": "Sign in",
#  "login": "[email protected]",
#  'password': 'xxoo'
# }
#
# i2 = requests.post('https://github.com/session', data=form_data, cookies=c1)
# c2 = i2.cookies.get_dict()
# c1.update(c2)
# i3 = requests.get('https://github.com/settings/repositories', cookies=c1)
#
# soup3 =BeautifulSoup(i3.text, features='lxml')
# list_group = soup3.find(name='div', class_='listgroup')
#
# from bs4.element import Tag
#
# for child in list_group.children:
#  if isinstance(child, Tag):
#   project_tag = child.find(name='a', class_='mr-1')
#   size_tag = child.find(name='small')
#   temp = "project:%s(%s); project path:%s" % (project_tag.get('href'), size_tag.string, project_tag.string,)
#   print(temp)

# ############## Way two##############
# session = requests.Session()
# # 1. Visit the landing page to get authenticity_token
# i1 = session.get('https://github.com/login')
# soup1 =BeautifulSoup(i1.text, features='lxml')
# tag = soup1.find(name='input', attrs={'name':'authenticity_token'})
# authenticity_token = tag.get('value')
# c1 = i1.cookies.get_dict()
# i1.close()
#
# # 2. Send the authenticity_token, username, password and other form fields to log in
# form_data = {
#  "authenticity_token": authenticity_token,
#  "utf8": "",
#  "commit": "Sign in",
#  "login": "[email protected]",
#  'password': 'xxoo'
# }
#
# i2 = session.post('https://github.com/session', data=form_data)
# c2 = i2.cookies.get_dict()
# c1.update(c2)
# i3 = session.get('https://github.com/settings/repositories')
#
# soup3 =BeautifulSoup(i3.text, features='lxml')
# list_group = soup3.find(name='div', class_='listgroup')
#
# from bs4.element import Tag
#
# for child in list_group.children:
#  if isinstance(child, Tag):
#   project_tag = child.find(name='a', class_='mr-1')
#   size_tag = child.find(name='small')
#   temp = "project:%s(%s); project path:%s" % (project_tag.get('href'), size_tag.string, project_tag.string,)
#   print(temp)

#! /usr/bin/env python
# -*- coding:utf-8 -*-
import time

import requests
from bs4 import BeautifulSoup

session = requests.Session()

i1 = session.get(
 url='https://www.zhihu.com/#signin',
 headers={'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',})

soup1 =BeautifulSoup(i1.text,'lxml')
xsrf_tag = soup1.find(name='input', attrs={'name':'_xsrf'})
xsrf = xsrf_tag.get('value')

current_time = time.time()
i2 = session.get(
 url='https://www.zhihu.com/captcha.gif',
 params={'r': current_time, 'type': 'login'},
 headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',})
with open('zhihu.gif', 'wb') as f:
 f.write(i2.content)

captcha =input('Please open zhihu.gif file, view and enter the verification code:')
form_data = {"_xsrf": xsrf, 'password': 'xxooxxoo', "captcha": captcha, 'email': '[email protected]'}  # Send the code the user typed, not the literal string 'captcha'
i3 = session.post(
 url='https://www.zhihu.com/login/email',
 data=form_data,
 headers={'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',})

i4 = session.get(
 url='https://www.zhihu.com/settings/profile',
 headers={'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',})

soup4 =BeautifulSoup(i4.text,'lxml')
tag = soup4.find(id='rename-section')
nick_name = tag.find('span',class_='name').string
print(nick_name)

#! /usr/bin/env python
# -*- coding:utf-8 -*-
import re
import json
import base64

import rsa
import requests

def js_encrypt(text):
 b64der ='MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCp0wHYbg/NOPO3nzMD3dndwS0MccuMeXCHgVlGOoYyFwLdS24Im2e7YyhB0wrUsyYf0/nhzCzBK8ZC9eCWqd0aHbdgOQT6CuFQBMjbyGYvlVYU2ZP7kG9Ft6YV6oc9ambuO7nPZh+bvXH0zDKfi02prknrScAKC0XhadTHT3Al0QIDAQAB'
 der = base64.standard_b64decode(b64der)

 pk = rsa.PublicKey.load_pkcs1_openssl_der(der)
 v1 = rsa.encrypt(bytes(text,'utf8'), pk)
 value = base64.encodebytes(v1).replace(b'\n', b'')
 value = value.decode('utf8')
 return value

session = requests.Session()

i1 = session.get('https://passport.cnblogs.com/user/signin')
rep = re.compile("'VerificationToken': '(.*)'")
v = re.search(rep, i1.text)
verification_token = v.group(1)

form_data = {'input1': js_encrypt('wptawy'), 'input2': js_encrypt('asdfasdf'), 'remember': False}

i2 = session.post(url='https://passport.cnblogs.com/user/signin',
     data=json.dumps(form_data),
     headers={'Content-Type':'application/json; charset=UTF-8','X-Requested-With':'XMLHttpRequest','VerificationToken': verification_token})

i3 = session.get(url='https://i.cnblogs.com/EditDiary.aspx')
print(i3.text)

#! /usr/bin/env python
# -*- coding:utf-8 -*-
import re

import requests

# Step 1: Visit the landing page,Get X_Anti_Forge_Token,X_Anti_Forge_Code
# 1、 Request url:https://passport.lagou.com/login/login.html
# 2、 Request method:GET
# 3、 Request header:
# User-agent
r1 = requests.get('https://passport.lagou.com/login/login.html',
     headers={'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',},)

X_Anti_Forge_Token = re.findall("X_Anti_Forge_Token = '(.*?)'", r1.text, re.S)[0]
X_Anti_Forge_Code = re.findall("X_Anti_Forge_Code = '(.*?)'", r1.text, re.S)[0]
print(X_Anti_Forge_Token, X_Anti_Forge_Code)
# print(r1.cookies.get_dict())
# Step 2: Log in
# 1、 Request url:https://passport.lagou.com/login/login.json
# 2、 Request method:POST
# 3、 Request header:
# cookie
# User-agent
# Referer:https://passport.lagou.com/login/login.html
# X-Anit-Forge-Code:53165984
# X-Anit-Forge-Token:3b6a2f62-80f0-428b-8efb-ef72fc100d78
# X-Requested-With:XMLHttpRequest
# 4、 Request body:
# isValidate:true
# username:15131252215
# password:ab18d270d7126ea65915c50288c22c0d
# request_form_verifyCode:''
# submit:''
r2 = requests.post('https://passport.lagou.com/login/login.json',
 headers={'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36','Referer':'https://passport.lagou.com/login/login.html','X-Anit-Forge-Code': X_Anti_Forge_Code,'X-Anit-Forge-Token': X_Anti_Forge_Token,'X-Requested-With':'XMLHttpRequest'},
 data={"isValidate": True,'username':'15131255089','password':'ab18d270d7126ea65915c50288c22c0d','request_form_verifyCode':'','submit':''},
 cookies=r1.cookies.get_dict())
print(r2.text)

