Crawling the full set of Honor of Kings skins with Python

Author: toofelix

1. Analyze the website to be crawled

① Open the official Honor of Kings wallpaper website.

② Press F12 to open the browser developer tools and capture the network requests.

③ Find the request that returns the wallpaper list and analyze it.

④ Inspect the format of the returned data.

⑤ Decode the URL-encoded links in the response.

⑥ Check whether the URL points to the desired picture. It turns out to be only a thumbnail.

⑦ Go back to the website, click on any wallpaper, and look at the link for a specific resolution.

⑧ Find the target address.

⑨ Compare the target link with the thumbnail link. As the code below reflects, the thumbnail link ends in /200 while the full-size link ends in /0.
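
Step ⑨ can be checked offline. The sketch below assumes, as the crawler code in this post does, that the JSON fields hold percent-encoded URLs and that a thumbnail URL ends in `/200` while the full-size image ends in `/0`. The sample URL and the helper name `to_full_size` are made up for illustration.

```python
from urllib import parse


def to_full_size(encoded_thumbnail_url):
    """Decode a percent-encoded thumbnail URL and switch it to the full-size variant."""
    url = parse.unquote(encoded_thumbnail_url)
    # Only rewrite the trailing size segment, not a "/200" elsewhere in the path.
    if url.endswith('/200'):
        url = url[:-len('/200')] + '/0'
    return url


# Hypothetical sample value; real values come from the JSON response.
sample = 'http%3A%2F%2Fexample.invalid%2Fishow%2Fwallpaper_6.jpg%2F200'
print(to_full_size(sample))
```

Rewriting only the trailing segment is a slightly stricter variant of the plain `replace("/200", "/0")` used in the crawler, which would also touch a `/200` appearing earlier in the path.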

2. Crawler code

① At this point the analysis is complete. The complete crawler code is as follows:

#!/usr/bin/env python
# encoding: utf-8
#     @ Project Name : Honor of Kings wallpaper download
#     @ File Name    :
#     @ Programmer   : Felix
#     @ Start Date   : 2020/7/30 14:42
#     @ Last Update  : 2020/7/30 14:42
import os, time, requests, json, re
from retrying import retry
from urllib import parse


class HonorOfKings:
    '''
    Downloads Honor of Kings wallpapers and saves them in one folder per hero.
    '''

    def __init__(self, save_path='./heros'):
        self.save_path = save_path
        self.time = str(time.time()).split('.')
        # "{}" is filled with the page number later; "%s" is filled with the timestamp now
        self.url = '{}&iOrder=0&iSortNumClose=1&iAMSActivityId=51991&_everyRead=true&iTypeId=2&iFlowId=267733&iActId=2735&iModuleId=2735&_=%s' % self.time[0]

    def hello(self):
        '''
        This is a welcome speech
        :return: self
        '''
        print("*" * 50)
        print(' ' * 18 + 'Honor of Kings wallpaper download')
        print(' ' * 5 + 'Author: Felix  Date: 2020-05-20 13:14')
        print("*" * 50)
        return self

    def run(self):
        '''
        The program entry
        '''
        print('↓' * 20 + 'Format selection: ' + '↓' * 20)
        print('1.Thumbnail 2.1024x768 3.1280x720 4.1280x1024 5.1440x900 6.1920x1080 7.1920x1200 8.1920x1440')
        size = input('Please enter the serial number of the format you want to download, the default is 6:')
        size = size if size and int(size) in [1, 2, 3, 4, 5, 6, 7, 8] else 6

        print('---Download starts...')
        page = 0
        offset = 0
        # The first request is only used to read the total number of pages
        total_response = self.request(self.url.format(page)).text
        total_res = json.loads(total_response)
        total_page = int(total_res['iTotalPages'])
        print('---{} pages in total...'.format(total_page))
        while True:
            if offset > total_page:
                break
            url = self.url.format(offset)
            response = self.request(url).text
            result = json.loads(response)
            now = 0
            for item in result["List"]:
                now += 1
                # The hero name is the part of the product name before the first "-"
                hero_name = parse.unquote(item['sProdName']).split('-')[0]
                hero_name = re.sub(r'[【】:.<>|·@#$%^&() ]', '', hero_name)
                print('---Downloading page {}, hero {}, progress {}/{}...'.format(offset, hero_name, now, len(result["List"])))
                hero_url = parse.unquote(item['sProdImgNo_{}'.format(str(size))])
                save_path = self.save_path + '/' + hero_name
                save_name = save_path + '/' + hero_url.split('/')[-2]
                if not os.path.exists(save_path):
                    os.makedirs(save_path)
                if not os.path.exists(save_name):
                    # Fetch first so that a failed request does not leave an empty file behind
                    response_content = self.request(hero_url.replace("/200", "/0")).content
                    with open(save_name, 'wb') as f:
                        f.write(response_content)
            offset += 1
        print('---Download completed...')

    @retry(stop_max_attempt_number=3)
    def request(self, url):
        '''
        Send a request (retried up to 3 times on failure)
        :param url: the url of request
        :return: the result of request
        '''
        response = requests.get(url, timeout=10)
        assert response.status_code == 200
        return response


if __name__ == "__main__":
    HonorOfKings().hello().run()
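
The `retry(stop_max_attempt_number=3)` decorator comes from the third-party `retrying` package and re-runs `request` when it raises, for at most three attempts. To make that behaviour concrete, here is a rough standard-library sketch of the same idea. `retry_times` and `flaky` are hypothetical names, and this is not how `retrying` is implemented internally.

```python
import functools


def retry_times(max_attempts=3):
    """Re-run the wrapped function until it succeeds or max_attempts is reached."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Give up and re-raise once the last attempt has failed.
                    if attempt == max_attempts:
                        raise
        return wrapper
    return decorator


calls = []


@retry_times(max_attempts=3)
def flaky():
    """Hypothetical stand-in for a request that fails twice, then succeeds."""
    calls.append(1)
    assert len(calls) >= 3, 'simulated bad status code'
    return 'ok'


print(flaky(), len(calls))
```

Because the crawler signals a bad status code with `assert`, a non-200 response raises `AssertionError`, which is what triggers the retry.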

② Detailed analysis of the link

self.url = '{}&iOrder=0&iSortNumClose=1&iAMSActivityId=51991&_everyRead=true&iTypeId=2&iFlowId=267733&iActId=2735&iModuleId=2735&_=%s' % self.time[0]
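
Two placeholders in this line are filled at different times: `%s` receives the integer part of the current timestamp right away (the `_` parameter), while `{}` stays in the string so that `.format(page)` can insert the page number for each request. A small sketch of that two-stage formatting, using a made-up `example.invalid` prefix and a shortened query string in place of the real endpoint:

```python
import time

# Hypothetical prefix; the real endpoint is not reproduced here.
template = 'https://example.invalid/list?page={}&iTypeId=2&_=%s'

# Stage 1: %-formatting fills the timestamp and leaves "{}" untouched.
seconds = str(time.time()).split('.')[0]
url_template = template % seconds

# Stage 2: str.format fills the page number for each request.
first_page = url_template.format(0)
second_page = url_template.format(1)
print(first_page)
print(second_page)
```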

③ Format selection

print('↓' * 20 + 'Format selection: ' + '↓' * 20)
print('1.Thumbnail 2.1024x768 3.1280x720 4.1280x1024 5.1440x900 6.1920x1080 7.1920x1200 8.1920x1440')
size = input('Please enter the serial number of the format you want to download, the default is 6:')
size = size if size and int(size) in [1, 2, 3, 4, 5, 6, 7, 8] else 6
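
Because the check calls `int(size)` directly, non-numeric input such as `abc` raises `ValueError` instead of falling back to the default. One possible way to make the choice more forgiving is sketched below; `parse_size_choice` is a hypothetical helper, not part of the original script.

```python
def parse_size_choice(raw, default=6, valid=range(1, 9)):
    """Return the chosen format number, or the default for empty or invalid input."""
    try:
        choice = int(raw)
    except ValueError:
        # Empty strings and non-numeric text both end up here.
        return default
    return choice if choice in valid else default


for text in ('', '3', '9', 'abc', ' 8 '):
    print(repr(text), '->', parse_size_choice(text))
```

The helper always returns an `int`, so it can be passed straight into `'sProdImgNo_{}'.format(...)`.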

④ Analysis of the download code

print('---Download starts...')
page = 0
offset = 0
# The first request is only used to read the total number of pages
total_response = self.request(self.url.format(page)).text
total_res = json.loads(total_response)
total_page = int(total_res['iTotalPages'])
print('---{} pages in total...'.format(total_page))
while True:
    if offset > total_page:
        break
    url = self.url.format(offset)
    response = self.request(url).text
    result = json.loads(response)
    now = 0
    for item in result["List"]:
        now += 1
        # The hero name is the part of the product name before the first "-"
        hero_name = parse.unquote(item['sProdName']).split('-')[0]
        hero_name = re.sub(r'[【】:.<>|·@#$%^&() ]', '', hero_name)
        print('---Downloading page {}, hero {}, progress {}/{}...'.format(offset, hero_name, now, len(result["List"])))
        hero_url = parse.unquote(item['sProdImgNo_{}'.format(str(size))])
        save_path = self.save_path + '/' + hero_name
        save_name = save_path + '/' + hero_url.split('/')[-2]
        if not os.path.exists(save_path):
            os.makedirs(save_path)
        if not os.path.exists(save_name):
            # Fetch first so that a failed request does not leave an empty file behind
            response_content = self.request(hero_url.replace("/200", "/0")).content
            with open(save_name, 'wb') as f:
                f.write(response_content)
    offset += 1
print('---Download completed...')
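
The path handling in the loop can be tried without any network access. The sketch below mirrors the steps above: decode the name, keep the part before the first `-`, strip awkward characters, and use the second-to-last URL segment as the file name. `build_save_target` and the sample item are hypothetical, and `os.path.join` is swapped in for the manual `'/'` concatenation.

```python
import os
import re
from urllib import parse


def build_save_target(item, size=6, root='./heros'):
    """Return (folder, file_path, download_url) for one entry of result["List"]."""
    hero_name = parse.unquote(item['sProdName']).split('-')[0]
    hero_name = re.sub(r'[【】:.<>|·@#$%^&() ]', '', hero_name)
    hero_url = parse.unquote(item['sProdImgNo_{}'.format(size)])
    folder = os.path.join(root, hero_name)
    # The URL ends in ".../<file name>/200", so the file name is the second-to-last segment.
    file_path = os.path.join(folder, hero_url.split('/')[-2])
    return folder, file_path, hero_url.replace('/200', '/0')


# Hypothetical item shaped like the fields the crawler reads.
item = {
    'sProdName': parse.quote('Hero A-Skin B'),
    'sProdImgNo_6': parse.quote('http://example.invalid/ishow/a_6.jpg/200', safe=''),
}
print(build_save_target(item))
```

Since the folder is derived from the hero name alone, every skin of the same hero lands in the same directory.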

⑤ Result of running the crawler: wallpapers of the same hero are saved in the same folder.


