Regular expressions are a powerful tool for processing strings. The concept is not unique to Python, but regular expressions in Python do have some subtle differences in actual use.
(1) Match a number between 1 and 100
import re
s = '100'  # any number from 1 to 100
ret = re.match(r'(100|[1-9]\d{0,1})$', s)
print(ret.group())
The pattern is (100|[1-9]\d{0,1})$
The first alternative matches the literal 100, and | means "or". The second alternative matches one digit from [1-9]; \d is a digit, and the following {0,1} allows at most one more digit or none.
[1-9]\d requires the first digit to be 1-9, which rules out a leading 0 (so 01 is rejected), while the second digit may be 0.
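A range pattern like this is easy to check exhaustively. The following is a minimal sketch, assuming re.fullmatch is an acceptable stand-in for match plus a trailing $; the names PATTERN, in_range and accepted are made up for illustration.

```python
import re

# Same alternatives as above; fullmatch() anchors both ends of the string.
PATTERN = re.compile(r'100|[1-9]\d?')

def in_range(text):
    """Return True if text is a number from 1 to 100 with no leading zero."""
    return PATTERN.fullmatch(text) is not None

# Try every integer from 0 to 199 and keep the ones the pattern accepts.
accepted = [n for n in range(0, 200) if in_range(str(n))]
print(accepted == list(range(1, 101)))
print(in_range('01'), in_range('0'), in_range('101'))
```

Testing the whole candidate range catches off-by-one mistakes (such as accepting 0 or 101) that a single sample like '100' would not reveal.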
(2) Match a landline number
010-67132692, whose format is [3 digits][-][8 digits]
or
0516-8978981, whose format is [4 digits][-][7 digits]
import re
s = "010-67132692"
ret = re.search(r'^\d{3,4}-\d{7,8}$', s)
print(ret.group())
Note: print(ret.group(0)) has the same effect. In Python, group() defaults to group 0, the whole match, so the argument can be omitted. Captured groups are numbered from 1, as with \1 in PHP and JavaScript.
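To make the group numbering concrete, here is a small sketch that adds two capture groups to the landline pattern. The parentheses and the variable name phone are additions for illustration and are not part of the original exercise.

```python
import re

# Parentheses capture the area code and the subscriber number separately.
phone = re.compile(r'^(\d{3,4})-(\d{7,8})$')

m = phone.search("010-67132692")
if m is not None:
    print(m.group())    # whole match, same as m.group(0)
    print(m.group(1))   # first captured group: the area code
    print(m.group(2))   # second captured group: the subscriber number
    print(m.groups())   # tuple of all captured groups, starting from group 1
```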
(3) Match an entered QQ number (rule: 5 to 10 characters long, digits only, and it cannot start with 0)
import re
s = "1101111123"
ret = re.match(r'[1-9]\d{4,9}$', s)
if ret is not None:
    print(ret.group())
else:
    print('Match failed!')
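The trailing $ matters for the length rule. The sketch below shows what happens without it, and wraps the check in a helper; the function name is_valid_qq and the overlong sample are assumptions for illustration.

```python
import re

def is_valid_qq(text):
    # fullmatch() requires the entire string to match the pattern.
    return re.fullmatch(r'[1-9]\d{4,9}', text) is not None

# Without an end anchor, match() accepts a valid prefix of an overlong number.
too_long = "12345678901234"
unanchored = re.match(r'[1-9]\d{4,9}', too_long)
print(unanchored is not None)
print(is_valid_qq(too_long))
print(is_valid_qq("1101111123"), is_valid_qq("01234567"), is_valid_qq("1234"))
```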
(4) Count how many times "af" occurs in the string
import re
s = "asdfjvjadsffvaadfkfasaffdsasdffadsafafsafdadsfaafd"
ret = re.findall(r'(af)', s)
print(len(ret))
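Counting with findall builds a full list just to measure its length. As a sketch of two alternatives (the variable names are illustrative), finditer counts lazily and str.count is enough when the thing being counted is a fixed substring:

```python
import re

s = "asdfjvjadsffvaadfkfasaffdsasdffadsafafsafdadsfaafd"

by_findall = len(re.findall(r'af', s))               # builds the whole list first
by_finditer = sum(1 for _ in re.finditer(r'af', s))  # counts matches one at a time
by_str = s.count('af')                               # no regex needed for a literal

print(by_findall, by_finditer, by_str)
```

All three count non-overlapping occurrences, so they agree for this input.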
(5) Split on one or more whitespace characters
import re
s = "zhangsan lisi wangwu"
res = re.compile(r'\s+')
ret = res.split(s)
print(ret)
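(6) Split on the backslash (\) with a regex
import re
s = r"c:\abc\a.txt"  # raw string, so \a is not read as an escape sequence
res = re.compile(r'\\')  # a literal backslash must be escaped in the pattern
ret = res.split(s)
print(ret)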
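Backslashes need care at two levels: in the Python string literal and in the regex itself. The sketch below is one way to handle both; the variable names are illustrative.

```python
import re

path = r"c:\abc\a.txt"   # raw string: every backslash is kept literally

# In the pattern, a literal backslash is written as \\ even inside a raw string.
parts = re.split(r'\\', path)
print(parts)

# re.escape() builds the same pattern from the literal character.
same_with_escape = re.split(re.escape('\\'), path) == parts
print(same_with_escape)

# For a fixed separator, str.split avoids the regex entirely.
same_with_str = path.split('\\') == parts
print(same_with_str)
```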
(7) Replace runs of 5 or more consecutive digits with #
import re
s = "wer8934605juo123wa89320571f"
res = re.compile(r'\d{5,}')
ret = res.sub('#', s)
print(ret)
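Two standard variations on sub() are often useful with this kind of masking. This is a short sketch; the variable names are made up for illustration.

```python
import re

s = "wer8934605juo123wa89320571f"
digits = re.compile(r'\d{5,}')   # runs of five or more digits

# subn() returns the new string together with the number of replacements made.
masked, n = digits.subn('#', s)
print(masked, n)

# count limits how many runs are replaced, working from the left.
first_only = digits.sub('#', s, count=1)
print(first_only)
```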
(8) Extract all runs of letters from the string
import re
s = "abDEe23dJfd343dPOddfe4CdD5ccv!23rr"
res = re.compile(r'[a-zA-Z]+')
ret = res.findall(s)
print(ret)
(9) Find words ending in the letter e, ignoring case
import re
s = 'THREE people at HERE do some THING'
res = re.compile(r'\w+e\b', re.I)  # \b is a word boundary
ret = res.findall(s)
print(ret)
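To see what \b contributes, compare the pattern with and without it. The sample sentence below is an assumption chosen to contain a word ("seven") with an e in the middle.

```python
import re

text = "THREE seven HERE"

# Without \b the match may stop inside a word: "seve" is taken from "seven".
loose = re.findall(r'\w+e', text, re.I)
# With \b the final e must be the last character of a word.
strict = re.findall(r'\w+e\b', text, re.I)

print(loose)
print(strict)
```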
(10) Replace each run of a repeated letter with &
import re
s = "cudddbhuuujdddcaa"
res = re.compile(r'([a-zA-Z])\1+')  # \1 repeats whatever letter group 1 captured
ret = res.sub('&', s)
print(ret)
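(11) Replace each run of a repeated letter with a single letter (for example, replace ddd with d)
import re
s = "cudddbhuuujddd"
res = re.compile(r'([a-zA-Z])\1+')  # \1 repeats whatever letter group 1 captured
ret = res.sub(r'\1', s)  # keep one copy of the captured letter
print(ret)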
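Both of the last two exercises rely on a backreference: ([a-zA-Z]) captures one letter and \1+ requires one or more further copies of that same letter. A minimal sketch that shows the runs being matched and both replacements (variable names are illustrative):

```python
import re

s = "cudddbhuuujdddcaa"
repeated = re.compile(r'([a-zA-Z])\1+')   # a letter followed by copies of itself

runs = [m.group() for m in repeated.finditer(s)]
print(runs)

with_amp = repeated.sub('&', s)       # each run becomes &
collapsed = repeated.sub(r'\1', s)    # each run collapses to its first letter
print(with_amp)
print(collapsed)
```

Without the \1, the pattern ([a-zA-Z])+ matches any run of letters, repeated or not, so the whole string would be treated as one match.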
(12) Get the words that are exactly 3 letters long
import re
s = "min tian jiu yao fang jia le ,da jia"
ret = re.findall(r'\b\w{3}\b', s)
print(ret)
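(13) De-stutter a string (the exercise's target sentence is 'I want to learn programming'): remove the punctuation, then collapse repeated characters
import re
s = "me...me...I need to..Want to...Want to...Learn to learn...study...Editing..Programming..Cheng.Cheng...Cheng...Cheng"
res = re.sub(r'\W+', '', s)  # drop every run of non-word characters (dots, spaces)
ret = re.sub(r'(.)\1+', r'\1', res)  # collapse each run of a repeated character to one
print(ret)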
(14) Remove the div and b tags
Result: Regular<span>expression</span>Exercise
import re
s = "<div class='a'>Regular<span>expression</span><b style='color:red'>Exercise</b></div>"
ret = re.sub(r'(</?div.*?>|</?b.*?>)', '', s)
print(ret)
(15) Find the lines that contain exactly 3 digits
import re
s = '''121fefe
3 qsqse2
ded6d32
aaaaa1a
1234 adc
'''
ret = re.findall(r'^\D*\d\D*\d\D*\d\D*$', s, re.M)  # re.M lets ^ and $ match at each line
print(ret)
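One caveat with the single-pattern approach is that \D also matches a newline, so the pattern depends on backtracking to stay within one line. As an alternative sketch (not the original author's method), the digits can be counted line by line; the variable names are illustrative.

```python
import re

text = '''121fefe
3 qsqse2
ded6d32
aaaaa1a
1234 adc
'''

# Count the digits on each line instead of encoding the count in one pattern.
exactly_three = [line for line in text.splitlines()
                 if len(re.findall(r'\d', line)) == 3]
print(exactly_three)

# For this input the multi-line pattern gives the same lines.
by_pattern = re.findall(r'^\D*\d\D*\d\D*\d\D*$', text, re.M)
print(by_pattern == exactly_three)
```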
Supplement
A collection of commonly used Python regex exercises
import re
# Match a number between 0 and 99
print("---Match a number between 0 and 99---")
ret = re.match(r"^[1-9]?[0-9]$", "77")
print(ret.group())
# Password of 8 to 20 characters: uppercase or lowercase letters, digits, underscores
print("---Password of 8 to 20 characters: uppercase or lowercase letters, digits, underscores---")
ret = re.match(r"[\w_]{8,20}$", "1123dasf1")  # $ enforces the upper length limit
print(ret.group())
# Match a 163 email address with 4 to 20 characters before the @ symbol
print("---Match a 163 email address with 4 to 20 characters before the @ symbol---")
ret = re.match(r"[\w_]{4,20}@163\.com", "example@163.com")  # placeholder address, assumed for illustration
print(ret.group())
print("---b---")
ret = re.match(r".*\b163\b", "example@163.com")  # same placeholder address
print(ret.group())
# Match a number between 1 and 100
print("---Match a number between 1 and 100---")
ret = re.match(r"([1-9]\d?|100)$", "100")  # one anchor for both alternatives; rejects 0 and 1000
print(ret.group())
# Match a 163, 126, or qq mailbox
print("---Match a 163, 126, or qq mailbox---")
ret = re.match(r"[\w_]{4,20}@(163|126|qq)\.com", "example@qq.com")  # placeholder address, assumed for illustration
print(ret.group())
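Because the provider alternatives sit inside a capture group, the match can also report which provider was found. The sketch below assumes a few made-up sample addresses and adds a second group for the user name; neither is part of the original exercise.

```python
import re

# Hypothetical sample addresses, invented for illustration.
samples = ["user1234@163.com", "xiaoming@qq.com", "someone@gmail.com"]

# Group 1 is the user name, group 2 is the provider.
mailbox = re.compile(r"(\w{4,20})@(163|126|qq)\.com$")

results = {}
for addr in samples:
    m = mailbox.match(addr)
    results[addr] = m.groups() if m else None
    print(addr, results[addr])
```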
# Match <html>hello world</html>
print("---Match <html>hello world</html>---")
ret = re.match(r"<([a-zA-Z]*)>.*</\1>", "<html>hello world</html>")  # \1 repeats the opening tag name
print(ret.group())
# First way: match <html><h1>www.itcast.cn</h1></html> with numbered groups
print("---First way: match <html><h1>www.qblank.cn</h1></html>---")
ret = re.match(r"<(\w*)><(\w*)>.*</\2></\1>", "<html><h1>www.itcast.cn</h1></html>")
print(ret.group())
# Second way: match <html><h1>www.qblank.cn</h1></html> with named groups
print("---Second way: match <html><h1>www.qblank.cn</h1></html>---")
ret = re.match(r"<(?P<name1>\w*)><(?P<name2>\w*)>.*</(?P=name2)></(?P=name1)>", "<html><h1>www.qblank.cn</h1></html>")
print(ret.group())
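Named groups can be read back by name as well as used in backreferences. The following is a small sketch; the group names outer, inner and body are chosen here for illustration.

```python
import re

# (?P<name>...) names a group; (?P=name) must repeat exactly what that group captured.
pattern = re.compile(
    r"<(?P<outer>\w*)><(?P<inner>\w*)>(?P<body>.*)</(?P=inner)></(?P=outer)>"
)

m = pattern.match("<html><h1>www.qblank.cn</h1></html>")
if m is not None:
    print(m.group("outer"), m.group("inner"))
    print(m.groupdict())

# A mismatched closing tag fails, because the backreference no longer agrees.
mismatch = pattern.match("<html><h1>www.qblank.cn</h2></html>")
print(mismatch)
```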
# ******Advanced usage of re module*****
# Use search to find an article's read count
print("---Find an article's read count---")
ret = re.search(r"\d+", "The number of reads is 9999")
print(ret.group())
# Collect the read counts for each chapter (python, c, java)
print("---Collect the read counts for each chapter---")
ret = re.findall(r"\d+", "python = 2342,c = 7980,java = 9999")
print(ret)
# Update the matched read count (this version substitutes a fixed value)
print("---Update the matched read count---")
ret = re.sub(r"\d+", "999", "python = 997")
print(ret)
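To compute the new value from the match instead of hard-coding it, re.sub also accepts a function as the replacement. A minimal sketch, where the helper name add_one is an assumption:

```python
import re

def add_one(match):
    """Return the matched number plus one, as a string (sub() expects str)."""
    return str(int(match.group()) + 1)

single = re.sub(r"\d+", add_one, "python = 997")
several = re.sub(r"\d+", add_one, "python = 2342,c = 7980,java = 9999")
print(single)
print(several)
```

The function is called once per match, so every number in the string is incremented independently.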
data = """
<div>
<p>Job responsibilities:</p>
<p>Complete server-side related tasks such as recommendation algorithm, data statistics, connection, and background</p>
<p><br></p><p>Required requirements:</p><p>Good self-driven and professional quality, be proactive and result-oriented</p><p><br></p><p>Technical requirements:</p>
<p>1. More than one year of Python development experience, master object-oriented analysis and design, and understand design patterns</p>
<p>2. Master the HTTP protocol, familiar with concepts such as MVC, MVVM and related web development frameworks</p>
<p>3. Master the development and design of relational databases, master SQL, and be proficient in one of MySQL/PostgreSQL<br></p>
<p>4. Master NoSQL and MQ, and be proficient in using the corresponding technologies to solve problems</p>
<p>5. Familiar with Javascript/CSS/HTML5, JQuery, React, Vue.js</p>
<p><br></p><p>Bonus items:</p>
<p>Big data, mathematical statistics, machine learning, sklearn, high performance, high concurrency.</p>
</div>
"""
print("---Extract the job posting text---")
# Method one: turn off greedy mode
print("---Method one---")
ret = re.sub(r"<.+?>", "", data)
print(ret)
# Method two: match only an optional slash and a tag name between the angle brackets
print("---Method two---")
ret = re.sub(r"</?\w+>", "", data)
print(ret)
# Split the string "info:xiaoZhang 33 shandong"
print('---Split the string "info:xiaoZhang 33 shandong"---')
ret = re.split(r":| ", "info:xiaoZhang 33 shandong")  # split on a colon or a space
print(ret)
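# This is a number 234-235-22-423
data = "This is a number 234-235-22-423"
print("---Greedy and non-greedy---")
# Greedy: .+ takes as much as it can, leaving group 1 only "4-235-22-423"
ret = re.match(r".+(\d+-\d+-\d+-\d+)", data)
print(ret.group(1))
# Non-greedy: .+? takes as little as it can, so group 1 is "234-235-22-423"
ret = re.match(r".+?(\d+-\d+-\d+-\d+)", data)
print(ret.group(1))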
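Capturing the prefix as well makes it visible how much each quantifier consumed. This sketch adds an extra group around .+ and .+? purely for illustration.

```python
import re

data = "This is a number 234-235-22-423"

# Group 1 is what the quantifier consumed; group 2 is what was left for the number.
greedy = re.match(r"(.+)(\d+-\d+-\d+-\d+)", data)
lazy = re.match(r"(.+?)(\d+-\d+-\d+-\d+)", data)

print(repr(greedy.group(1)), greedy.group(2))
print(repr(lazy.group(1)), lazy.group(2))
```

The greedy version swallows the first two digits of the number because \d+ only needs one digit to succeed.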
# Extract the URL of the picture
data = """
<img data-original="https://rpic.douyucdn.cn/appCovers/2016/11/13/1213973_201611131917_small.jpg"
src="https://rpic.douyucdn.cn/appCovers/2016/11/13/1213973_201611131917_small.jpg"
style="display:inline;">"""
print("---Extract the URL of the picture---")
ret = re.search(r"https.+?\.jpg", data)
print(ret.group())
data ="""
http://www.interoem.com/messageinfo.asp?id=35
http://3995503.com/class/class09/news_show.asp?id=14
http://lib.wzmc.edu.cn/news/onews.asp?id=769
http://www.zy-ls.com/alfx.asp?newsid=377&id=6
http://www.fincm.com/newslist.asp?id=415"""
# Remove the suffix: keep each URL only up to the first slash after the host
print("---Remove the suffix---")
ret = re.sub(r"(http://.+?/).*", lambda x: x.group(1), data)
print(ret)
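The lambda only returns group 1, so a replacement template can do the same job. As a cross-check, the sketch below also rebuilds the prefix with urllib.parse.urlsplit from the standard library; that second approach is an alternative offered here, not something the original exercise uses, and only two of the sample URLs are repeated to keep it short.

```python
import re
from urllib.parse import urlsplit

urls = """http://www.interoem.com/messageinfo.asp?id=35
http://3995503.com/class/class09/news_show.asp?id=14"""

# A replacement template can reference group 1 directly, with no lambda.
trimmed = re.sub(r"(http://.+?/).*", r"\1", urls)
print(trimmed)

# urlsplit parses each URL; scheme and netloc rebuild the same prefix.
rebuilt = [f"{u.scheme}://{u.netloc}/" for u in map(urlsplit, urls.splitlines())]
print(rebuilt)
```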
# Find all words
data = "hello world ha ha"
print("---Find all words---")
print("--Method one--")
ret = re.split(r" +", data)
print(ret)
print("--Method two--")
ret = re.findall(r"\b[a-zA-Z]+\b", data)
print(ret)
That concludes this collection of small Python regular expression examples.