Python Notes, Part 4: Advanced Programming, Regular Expressions, 3. Regular Expressions in Depth

1. re.split

  Splits a string using a regular expression as the delimiter.
import re

str1 = "Thomas is a good man"
# " +" matches one or more consecutive spaces
print(re.split(r" +", str1))
# ['Thomas', 'is', 'a', 'good', 'man']
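re.split also accepts more general delimiters, a maxsplit limit, and capturing groups. The following is a minimal sketch of those three variations; the sample strings and variable names are made up for illustration and do not come from the original notes.

```python
import re

text = "Thomas  is,a good;man"

# Split on any run of spaces, commas, or semicolons.
words = re.split(r"[ ,;]+", text)
print(words)  # ['Thomas', 'is', 'a', 'good', 'man']

# maxsplit limits the number of splits; the remainder stays in one piece.
first_two = re.split(r"[ ,;]+", text, maxsplit=2)
print(first_two)  # ['Thomas', 'is', 'a good;man']

# A capturing group in the pattern keeps the delimiters in the result.
with_delims = re.split(r"([,;])", "a,b;c")
print(with_delims)  # ['a', ',', 'b', ';', 'c']
```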
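2. The finditer function:

  Prototype: re.finditer(pattern, string, flags=0)

  Parameters:

    pattern: the regular expression to match

    string: the string to search

    flags: flag bits that control how the pattern is matched. Several flags can be combined with |. The values are:

        re.I: ignore case

        re.L: locale-dependent matching (in Python 3 this is only valid with bytes patterns)

        re.M: multi-line matching; affects ^ and $

        re.S: makes . (dot) match every character, including a newline

        re.U: interpret characters according to the Unicode character set; affects \w, \W, \b, \B (already the default for str patterns in Python 3)

        re.X: verbose mode; whitespace in the pattern is ignored and # comments are allowed, so the pattern can be laid out more readably

  Function: similar to findall, it scans the whole string, but it returns an iterator of match objects instead of a list.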
import re

str2 = "Thomas is a good man! Thomas is a nice man! Thomas is a handsome man"
d = re.finditer(r"(Thomas)", str2)
# Each item produced by the iterator is a match object
for m in d:
    print(m)
# <re.Match object; span=(0, 6), match='Thomas'>
# <re.Match object; span=(22, 28), match='Thomas'>
# <re.Match object; span=(44, 50), match='Thomas'>
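To make the flags listed above more concrete, here is a small sketch that passes re.I, re.M and re.S to finditer and match. The two-line sample text is an assumption chosen for illustration.

```python
import re

text = "Thomas is here.\nthomas is there."

# re.I makes the match case-insensitive, so both spellings are found.
spans = [m.span() for m in re.finditer(r"thomas", text, flags=re.I)]
print(spans)  # [(0, 6), (16, 22)]

# re.M lets ^ match at the start of every line, not only at the start of the string.
line_starts = [m.group() for m in re.finditer(r"^\w+", text, flags=re.M)]
print(line_starts)  # ['Thomas', 'thomas']

# Without re.S the dot stops at the newline; with re.S it crosses it.
first_line = re.match(r".+", text).group()
whole_text = re.match(r".+", text, flags=re.S).group()
print(first_line == "Thomas is here.")  # True
print(whole_text == text)               # True
```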
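3. re.sub() / re.subn():

Prototype:
sub(pattern, repl, string, count=0, flags=0)
subn(pattern, repl, string, count=0, flags=0)
Parameters:
pattern: the regular expression to match
repl: the string used as the replacement
string: the target string
count: the maximum number of replacements; 0 (the default) replaces every match
flags: flag bits that control how the pattern is matched. The values are:
re.I: ignore case
re.L: locale-dependent matching (in Python 3 this is only valid with bytes patterns)
re.M: multi-line matching; affects ^ and $
re.S: makes . match every character, including a newline
re.U: interpret characters according to the Unicode character set; affects \w, \W, \b, \B (already the default for str patterns in Python 3)
re.X: verbose mode; whitespace in the pattern is ignored and # comments are allowed
Function: finds the parts of the target string that match the regular expression and replaces them with the given string. The number of replacements can be limited; if it is not specified, every match is replaced.
Difference: sub returns the string after replacement, while subn returns a tuple whose first element is the string after replacement and whose second element is the number of replacements made.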
import re

# Replace with sub
str4 = "Thomas is a good good good man"
res = re.sub(r"(good)", "nice", str4)
print(res)
print(type(res))
# Thomas is a nice nice nice man
# <class 'str'>

# Limit the number of replacements
str5 = "Thomas is a good good good man"
res1 = re.sub(r"(good)", "nice", str5, count=2)
print(res1)
print(type(res1))
# Thomas is a nice nice good man
# <class 'str'>

# Replace with subn
str6 = "Thomas is a good good good man"
res2 = re.subn(r"(good)", "nice", str6)
print(res2)
print(type(res2))
# ('Thomas is a nice nice nice man', 3)
# <class 'tuple'>
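The replacement does not have to be a fixed string. As a rough sketch (the helper name double and the sample strings are assumptions for illustration), the replacement can refer back to captured groups with \1, \2, or it can be a function that receives each match object:

```python
import re

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r"(\w+)-(\w+)", r"\2-\1", "good-man")
print(swapped)  # man-good

# The replacement may also be a function that receives each match object
# and returns the replacement text. "double" is a hypothetical helper.
def double(match):
    return str(int(match.group()) * 2)

doubled, count = re.subn(r"\d+", double, "3 apples and 10 pears")
print(doubled)  # 6 apples and 20 pears
print(count)    # 2
```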
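4. Groups:

  Besides simply testing whether a string matches, regular expressions can also extract substrings. Parentheses () define a group, and the text matched inside each pair of parentheses is what gets extracted.

  group(): extracts a single group by its position; group(0) is the entire match

  groups(): returns all groups as a tuple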
import re

# Inspect group information
str7 = "010-53247654"
m = re.match(r"(\d{3})-(\d{8})", str7)
# Fetch a group by index; group(0) is always the entire match
print(m.group(0))
print(m.group(1))
print(m.group(2))
# 010-53247654
# 010
# 53247654

# Show all groups; groups are numbered by the position of their
# opening parenthesis, so the outer group comes first
m1 = re.match(r"((\d{3})-(\d{8}))", str7)
print(m1.groups())
# ('010-53247654', '010', '53247654')

# Give the groups names
m2 = re.match(r"(?P<first>\d{3})-(?P<second>\d{8})", str7)
print(m2.group("first"))
print(m2.group("second"))
# 010
# 53247654
  Note: a group can also be given a name with the (?P<name>...) syntax and then retrieved by that name.
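Named groups can also be collected all at once. The sketch below reuses the phone-number format from this section with hypothetical group names (area, number), and shows that re.match returns None when nothing matches, so the result should be checked before calling group().

```python
import re

# "area" and "number" are illustrative group names.
pattern = r"(?P<area>\d{3})-(?P<number>\d{8})"

m = re.match(pattern, "010-53247654")
if m is not None:
    info = m.groupdict()            # all named groups as a dict
    pair = m.group("area", "number")  # several groups at once, as a tuple
    number_span = m.span("number")  # start and end index of one group
    print(info)         # {'area': '010', 'number': '53247654'}
    print(pair)         # ('010', '53247654')
    print(number_span)  # (4, 12)

# No match: the result is None, and calling .group() on it would raise AttributeError.
no_match = re.match(pattern, "10-5324")
print(no_match)  # None
```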

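5. Compilation:

  When we use a regular expression, the re module does two things:

  First: it compiles the regular expression, raising an error if the expression itself is invalid.

  Second: it uses the compiled regular expression to match the target string.

  When the same pattern is used many times, it can be compiled once in advance and the resulting pattern object reused:

  re.compile(pattern, flags=0)

  pattern: the regular expression to compile

  flags: same as above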
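import re

pat = r"^1(([34578]\d)|(47))\d{8}$"
# Module-level function: the pattern is compiled when match is called
print(re.match(pat, "13600000000"))
# Compile once, then call match on the compiled pattern object
re_telephon = re.compile(pat)
print(re_telephon.match("13600000000"))
# <re.Match object; span=(0, 11), match='13600000000'>
# <re.Match object; span=(0, 11), match='13600000000'>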
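A compiled pattern pays off when it is applied to many strings, and re.compile accepts the same flags as the other functions. The sketch below is a simplified variant of the pattern above (the (47) alternative is dropped because [34578]\d already covers it), written in verbose mode with re.X. The names phone_re and candidates are assumptions for illustration, and the pattern is not meant as a complete phone-number validator.

```python
import re

# re.X (verbose) ignores whitespace in the pattern and allows # comments.
phone_re = re.compile(
    r"""
    ^1            # leading 1
    [34578]\d     # second digit from the set, then any digit
    \d{8}         # remaining eight digits
    $
    """,
    re.X,
)

# The compiled object is reused for every candidate string.
candidates = ["13600000000", "12600000000", "1360000000"]
valid = [s for s in candidates if phone_re.match(s)]
print(valid)  # ['13600000000']
```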
Original article: https://www.cnblogs.com/noah0532/p/10907000.html