Python 3 regular expressions: match simple web domain names that start with "www" and end with ".com", for example www.yahoo.com.
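import re
# 'www\.' requires the literal prefix "www.", '.+' matches one or more arbitrary characters,
# and '\.com$' requires a literal ".com" at the end (each dot is escaped with a backslash).
patt = r'www\..+\.com$'
m = re.match(patt, 'www.yahoo.com')
if m is not None:
    print(m.group())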
import re
# re.match only anchors at the start of the string, so the bare prefix 'www' also matches.
patt = 'www'
m = re.match(patt, 'www.yahoo.com')
print(m)
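The effect of escaping the dots can be checked directly. The sketch below is only one possible reading of the exercise: the helper name `is_simple_www_com` and the choice to restrict labels to letters, digits, and hyphens are assumptions, not part of the original notes.

```python
import re

# Assumed pattern: "www.", one or more dot-separated labels, then ".com".
WWW_COM = re.compile(r'www\.[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.com')

def is_simple_www_com(host):
    """Return True if the whole string looks like www.<labels>.com (hypothetical helper)."""
    # fullmatch requires the pattern to cover the entire string.
    return WWW_COM.fullmatch(host) is not None

print(is_simple_www_com('www.yahoo.com'))     # True
print(is_simple_www_com('www.yahoo.com.cn'))  # False: does not end with .com
print(is_simple_www_com('wwwXyahooXcom'))     # False: the dots must be literal

# With unescaped dots, '.' matches any character, so this string is accepted.
print(re.match('www.+.com', 'wwwXyahooXcom') is not None)  # True
```

Using `fullmatch` avoids having to add `^` and `$` by hand.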
Check whether a string consists entirely of lowercase letters. Given strings: s1 = 'adkkdk'
s2 = 'abc123efg'
In[2]: import re
In[3]: s1 = 'adkkdk'
In[4]: s2 = 'abc123efg'
In[6]: an = re.search('^[a-z]+$',s1)
In[7]: if an:
   ...:     print('s1:',an.group(),'is all lowercase')
   ...: else:
   ...:     print(s1,'is not all lowercase!')
   ...:
s1: adkkdk is all lowercase
In[8]: an = re.match('[a-z]+$',s2)
In[9]: if an:
   ...:     print('s2:',an.group(),'is all lowercase')
   ...: else:
   ...:     print(s2,"is not all lowercase")
   ...:
abc123efg is not all lowercase
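The same check can be wrapped in a small function. This is a minimal sketch, and the name `is_all_lowercase` is an assumption introduced for illustration. It also prints the result of the built-in `str.islower()`, which answers a slightly different question: it ignores characters that have no case, such as digits.

```python
import re

def is_all_lowercase(s):
    """Return True only if every character of s is in a-z (hypothetical helper)."""
    # fullmatch covers the whole string, so no explicit ^ or $ anchors are needed.
    return re.fullmatch(r'[a-z]+', s) is not None

for s in ('adkkdk', 'abc123efg'):
    # Columns: the string, the regex check, the built-in str.islower()
    print(s, is_all_lowercase(s), s.islower())
# adkkdk True True
# abc123efg False True
```

The regex version is the stricter one, which is what this exercise asks for.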
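In natural language processing, splitting text on punctuation causes a real problem for a number such as 123,000,000: the commas cut a perfectly good number into pieces. It therefore helps to clean up the numbers first by removing the commas inside them. Given the string sen = "abc,123,456,789,mnp"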
In[2]: import re
In[3]: sen = "abc,123,456,789,mnp"
In[4]: p = re.compile(r"\d+,\d+?")
In[5]: for com in p.finditer(sen):
   ...:     mm = com.group()
   ...:     print("hi:",mm)
   ...:     print("sen_before:",sen)
   ...:     sen = sen.replace(mm,mm.replace(",",""))
   ...:     print("sen_back:",sen,'\n')
   ...:
hi: 123,4
sen_before: abc,123,456,789,mnp
sen_back: abc,123456,789,mnp
hi: 56,7
sen_before: abc,123456,789,mnp
sen_back: abc,123456789,mnp
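The loop works because `finditer` keeps scanning the original string while `sen` is rebound to each new result. A more compact alternative, shown here as a sketch and not as the method used in these notes, is a single `re.sub` with lookarounds that deletes only commas that have a digit on both sides. The function name `strip_digit_commas` is an assumption.

```python
import re

def strip_digit_commas(text):
    """Remove commas that sit directly between two digits (hypothetical helper)."""
    # (?<=\d) requires a digit just before the comma, (?=\d) a digit just after it.
    # Lookarounds consume no characters, so only the comma itself is removed.
    return re.sub(r'(?<=\d),(?=\d)', '', text)

print(strip_digit_commas("abc,123,456,789,mnp"))  # abc,123456789,mnp
print(strip_digit_commas("123,000,000"))          # 123000000
```

Like the loop, this also merges comma-separated lists of numbers such as "1,2,3" into one number, so it only suits text where commas between digits are thousands separators.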