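Regular expressions are powerful, but they take some effort to learn. This post walks through the matching rules of match, search, findall, and sub in Python's re module.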
1. match only matches at the beginning of the string and only matches once; if the beginning does not match, it returns None.
Match the whole pattern:
import re
reg = r'\w+@(\w+)\.com'
pat = re.match(reg, 'hello@qq.com hello@qq.com')
print(pat.group())
>>>hello@qq.com
Match only the contents of the group:
import re
reg = r'\w+@(\w+)\.com'
pat = re.match(reg, 'hello@qq.com hello@qq.com')
print(pat.group(1))
>>>qq
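A pattern that does not match at the start fails: here the pattern begins with @ while the string begins with hello, so there is no match.
import re
reg = r'@(\w+)\.com'
pat = re.match(reg, 'hello@qq.com hello@qq.com')
print(pat)
>>>None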
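Because match returns None when the start of the string does not fit the pattern, calling group() on the result directly would raise an AttributeError. Here is a minimal sketch of guarding against that; the helper name first_domain is hypothetical and only serves the illustration.

```python
import re

def first_domain(text):
    """Return the domain if text starts with name@domain.com, else None."""
    # first_domain is a hypothetical helper, not part of the re module.
    pat = re.match(r'\w+@(\w+)\.com', text)
    if pat is None:
        # match found nothing at position 0, so there is no group to read.
        return None
    return pat.group(1)

print(first_domain('hello@qq.com hello@qq.com'))  # qq
print(first_domain('@qq.com'))                    # None
```

Checking for None first keeps the no-match case explicit instead of letting it surface as an exception later.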
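2. search is similar to match: it also matches only once, but it is not restricted to the beginning of the string.
It does not have to match from the start, and it supports groups:
import re
reg = r'@(\w+)\.com'
pat = re.search(reg, 'hello@qq.com hello@qq.com')
print(pat.group(1))
>>>qq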
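To make the difference between the two functions concrete, the sketch below runs the same pattern through both on the same input. The variable names m and s are just for this illustration.

```python
import re

text = 'hello@qq.com hello@qq.com'
reg = r'@(\w+)\.com'

m = re.match(reg, text)   # anchored at position 0, so this fails
s = re.search(reg, text)  # scans forward and stops at the first hit

print(m)           # None
print(s.group())   # @qq.com
print(s.group(1))  # qq
print(s.span())    # (5, 12)
```

The span shows that search stopped at the first address; the second one is never reported.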
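3. findall returns every non-overlapping match as a list, so the result has no group() method. It supports groups, but when the pattern contains a capturing group, the list holds only the contents of the group rather than the whole match.
With a group, only the contents of the parentheses are returned:
import re
reg = r'\w+@(\w+)\.com'
pat = re.findall(reg, 'hello@qq.com hello@qq.com')
print(pat)
>>>['qq', 'qq']
Without parentheses, the whole match is returned:
import re
reg = r'\w+@\w+\.com'
pat = re.findall(reg, 'hello@qq.com hello@qq.com')
print(pat)
>>>['hello@qq.com', 'hello@qq.com']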
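Two related cases are worth a quick sketch: a pattern with more than one group, and getting the whole match while still using a group. The sample text here is made up for the illustration and differs from the one above.

```python
import re

text = 'hello@qq.com world@gmail.com'  # assumed sample input

# Two capturing groups: each list item is a tuple of (user, domain).
pairs = re.findall(r'(\w+)@(\w+)\.com', text)
print(pairs)  # [('hello', 'qq'), ('world', 'gmail')]

# finditer yields match objects, so group() and group(1) are both available.
full = [m.group() for m in re.finditer(r'\w+@(\w+)\.com', text)]
print(full)   # ['hello@qq.com', 'world@gmail.com']
```

finditer is a reasonable choice when both the full match and the group contents are needed from one pass.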
4. sub replaces every match; subn does the same and additionally returns the number of replacements.
import re
reg = 'hello'  # pattern to match
sub = 'welcome to'  # replacement text
pat = re.sub(reg, sub, 'hello world, hello C')
print(pat)
>>>welcome to world, welcome to C
import re
reg = 'hello'
sub = 'welcome to'
pat = re.subn(reg, sub, 'hello world, hello C, hello java')
print(pat)
>>>('welcome to world, welcome to C, welcome to java', 3)  # a tuple: the new string, then the number of replacements
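sub also accepts a count argument and lets the replacement refer back to groups in the pattern. The following is a small sketch of both; the replacement text 'qq user hello' style is only an illustration.

```python
import re

text = 'hello world, hello C, hello java'

# count=1 limits the replacement to the first match.
once = re.sub('hello', 'welcome to', text, count=1)
print(once)  # welcome to world, hello C, hello java

# \1 and \2 in the replacement refer to the pattern's capturing groups.
swapped = re.sub(r'(\w+)@(\w+)\.com', r'\2 user \1', 'hello@qq.com')
print(swapped)  # qq user hello
```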
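5. Using the extension notations (?i), (?m), (?s)
(?i) = re.I/IGNORECASE
(?m) = re.M/MULTILINE
(?s) = re.S/DOTALL
What do they mean? They are ignore-case mode, multi-line mode (^ and $ match at every line boundary), and a mode in which . also matches newlines, which lets a pattern like .+ reach across lines. The i, m, and s flags can also be combined, as in (?im). Note that the inline flags are written in lowercase.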
(?i) = re.I/IGNORECASE: ignore case
import re
reg = r'ming'
pat = re.findall(reg, 'Hello ming, Ming, MING?', re.I)
print(pat)
>>>['ming', 'Ming', 'MING']
The inline form gives the same result:
import re
reg = r'(?i)ming'
pat = re.findall(reg,'Hello ming, Ming, MING?')
print(pat)
>>>['ming', 'Ming', 'MING']
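(?m) = re.M/MULTILINE: multi-line matching, where ^ and $ match at the start and end of every line. The example combines it with (?i) on a multi-line string:
import re
reg = r'(?im)ming'
s = '''Hello ming
Hello Ming
Hello MING
'''
pat = re.findall(reg, s)
print(pat)
>>>['ming', 'Ming', 'MING']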
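The effect of the m flag only shows up when the pattern uses ^ or $. A minimal sketch, assuming the same three-line string written with explicit \n:

```python
import re

s = 'Hello ming\nHello Ming\nHello MING\n'

# Without (?m), ^ matches only at the very start of the string.
first_only = re.findall(r'^Hello (\w+)', s)
print(first_only)  # ['ming']

# With (?m), ^ also matches right after each newline.
every_line = re.findall(r'(?m)^Hello (\w+)', s)
print(every_line)  # ['ming', 'Ming', 'MING']
```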
(?s) = re.S/DOTALL: . matches everything, including newlines
import re
reg = r'(?ims).+'
s = '''Hello ming
Hello Ming
Hello MING
'''
pat = re.findall(reg, s)
print(pat)
>>>['Hello ming\nHello Ming\nHello MING\n']
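For comparison, here is a short sketch of the same .+ pattern with and without the s flag, again assuming a three-line string written with explicit \n:

```python
import re

s = 'Hello ming\nHello Ming\nHello MING\n'

# By default . stops at a newline, so .+ yields one match per line.
per_line = re.findall(r'.+', s)
print(per_line)  # ['Hello ming', 'Hello Ming', 'Hello MING']

# With (?s), . also matches '\n', so .+ takes the whole string in one match.
whole = re.findall(r'(?s).+', s)
print(whole)     # ['Hello ming\nHello Ming\nHello MING\n']

# Passing re.S as the flags argument is equivalent to the inline form.
same = re.findall(r'.+', s, re.S) == whole
print(same)      # True
```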