"Ling Cage" is a Chinese animated series produced by YHKT Entertainment (艺画开天) and broadcast exclusively on Bilibili (station B). By the standards of Chinese animation the production quality is excellent, but the plot has too many holes. Reviews are polarized: those who like it rate it very highly, and those who dislike it rate it very poorly. Let's see what the bullet comments (the "barrage") say.
Ideas
Libraries used
The idea for the code comes from Python learners on Bilibili.
# Scrape the data
# CSV reading and writing
import csv
# HTTP request library
import requests
# Regular expressions
import re
# Chinese word segmentation
import jieba
# Word cloud
import wordcloud
# 1. Locate the URL
url='https://api.bilibili.com/x/v2/dm/history?type=1&oid=129528808&date=2020-08-28'
# 2. Simulate a logged-in browser
# Set the request headers so the request is not blocked by anti-scraping checks
# Bilibili only serves historical bullet comments to logged-in users, so put your own browser cookie here
headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0',"Cookie":"_uuid=445F64D3-1530-41CF-09EC-C6029EF29EA659147infoc; buvid3=54281361-1A51-46A7-838B-7FB1214C81B953936infoc; CURRENT_FNVAL=16; LIVE_BUVID=AUTO9915845422486102; rpdid=|(u)~mYY~u0J'ul)RlRkkR); sid=4y1wx1oi; DedeUserID=229593267; DedeUserID__ckMd5=72ee797eb51fb8c3; SESSDATA=b7620543%2C1600240037%2Cd737d*31; bili_jct=03269466eb702a213723a0585db59cbe; bp_t_offset_229593267=428995725967649604; CURRENT_QUALITY=80; PVID=1; _ga=GA1.2.1605929815.1586006097; bp_video_offset_229593267=428995725967649604; blackside_state=1; bfe_id=fdfaf33a01b88dd4692ca80f00c2de7f"}
# Request data
resp = requests.get(url,headers=headers)
# Decode the response as UTF-8 to avoid garbled characters
html=resp.content.decode('utf-8')
# 3. Parse the response to extract the bullet comments
# Each comment is the text of a <d> element; extract it with a regular expression
res=re.compile('<d.*?>(.*?)</d>')
danmu=re.findall(res,html)
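To see what the pattern in step 3 captures, here is a minimal sketch that runs it on a hand-written sample instead of a live response. The sample XML and its `p` attribute values are made up for illustration. The `html.unescape` step is an addition that is not in the original script: a regular expression returns the raw text, so entities such as `&amp;` stay escaped unless they are decoded.

```python
import html
import re

# Hypothetical sample mimicking the XML structure the script expects;
# the attribute values are invented for illustration.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<i>'
    '<d p="1.0,1,25,16777215">first comment</d>'
    '<d p="2.5,1,25,16777215">fear &amp; surprise</d>'
    '</i>'
)

# Same pattern as in the script: capture the text of every <d> element.
pattern = re.compile('<d.*?>(.*?)</d>')
raw = pattern.findall(sample)

# Optional extra step (an assumption, not part of the original script):
# decode XML/HTML entities left in the captured text.
decoded = [html.unescape(text) for text in raw]

print(raw)
print(decoded)
```

If the real comments contain characters such as `&` or `<`, the decoded list is probably the one to save.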
# 4. Save the data
# Open the file once in append mode and write one bullet comment per row
with open(r'D:\360MoveData\Users\cmusunqi\Documents\GitHub\R_and_python\python\Word cloud and crawler\Barrage.csv','a',newline='',encoding='utf-8') as f:
    writer=csv.writer(f)
    for i in danmu:
        writer.writerow([i])
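The saving step can be checked without the network or the original path. The sketch below writes a few made-up comments to a temporary file, one comment per row, and reads them back. The sample comments and the temporary location are assumptions for illustration only.

```python
import csv
import os
import tempfile

# Made-up comments standing in for the scraped list `danmu`.
danmu = ['first comment', 'a comment, with a comma', '灵笼']

# Write to a temporary directory instead of the path used in the script.
path = os.path.join(tempfile.mkdtemp(), 'Barrage.csv')

# Open the file once and write one comment per row.
with open(path, 'a', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    for text in danmu:
        writer.writerow([text])

# Read the rows back to confirm the round trip.
with open(path, newline='', encoding='utf-8') as f:
    rows = [row[0] for row in csv.reader(f)]

print(rows)
```

Note that `csv.writer` wraps a field in quotes when it contains a comma, so reading the file back as plain text rather than through `csv.reader` will include those quote characters.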
# Draw the word cloud ============================================================
# Read the saved CSV file; the with block closes the file afterwards
with open(r'D:\360MoveData\Users\cmusunqi\Documents\GitHub\R_and_python\python\Word cloud and crawler\Barrage.csv',encoding='utf-8') as f:
    txt=f.read()
# Segment the text into words with jieba
txt_list=jieba.lcut(txt)
# Join the segmented words with spaces
string=' '.join(txt_list)
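# Configure the word cloud; see the official wordcloud documentation for the available parameters
w=wordcloud.WordCloud(
    width=1000,
    height=700,
    background_color='white',
    # A font with Chinese glyphs is required; msyh.ttc is Microsoft YaHei on Windows
    font_path="msyh.ttc",
    scale=15,
    stopwords={" "},
    # The contour settings only take effect when a mask image is supplied
    contour_width=5,
    contour_color='red')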
# Generate the word cloud and export it as a PNG
w.generate(string)
w.to_file(r'D:\360MoveData\Users\cmusunqi\Documents\GitHub\R_and_python\python\Word cloud and crawler\ciyun.png')
The largest words in the resulting word cloud are "fear" and "caught off guard". If you are afraid, look at P Yo.
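Word size in the cloud reflects how often a word occurs, so the largest words can be cross-checked by counting tokens directly. This is a minimal sketch that uses a short hand-made token list in place of the `jieba.lcut` output. The tokens are invented for illustration and do not come from the real data.

```python
from collections import Counter

# Hypothetical tokens standing in for `txt_list`
# (害怕 = "fear", 猝不及防 = "caught off guard").
tokens = ['害怕', '猝不及防', '害怕', ' ', '灵笼', '害怕', '猝不及防', '\n']

# Drop whitespace-only tokens, then count the rest.
counts = Counter(t for t in tokens if t.strip())

# The most frequent tokens should match the largest words in the cloud.
top_two = counts.most_common(2)
print(top_two)
```

On the real data, replacing `tokens` with `txt_list` should give a ranking that can be compared with the picture.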
love&peace