Metacharacters have special meanings of their own, so they must be escaped to be matched literally.
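grep(pattern = "[wW]", x = states, value = TRUE)  # character class: "w" or "W"
grep(pattern = "w", ignore.case = TRUE, x = states, value = TRUE)  # same result, ignoring case
strsplit("strsplit.also.uses", split = ".")  # unescaped "." matches every character
strsplit("strsplit.also.uses", split = "\\.")  # escaped: split on literal periods
library(stringr)
str_extract_all("me credit card: 334", pattern = "\\d")  # "\\d" matches a single digit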
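The backslash is itself an escape character in R string literals, so the regex `\.` has to be written `"\\."` in source code. The following is a minimal sketch in base R only (stringr is deliberately not assumed here); the variable names are illustrative, and `fixed = TRUE` is shown as an alternative that switches off regex interpretation altogether.

```r
s <- "strsplit.also.uses"

# Unescaped "." matches every character, so only empty strings are left.
unescaped <- strsplit(s, split = ".")[[1]]

# The regex \. is typed "\\." in R and splits on literal periods.
escaped <- strsplit(s, split = "\\.")[[1]]

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed.
literal <- strsplit(s, split = ".", fixed = TRUE)[[1]]

# "\\d" matches one digit; gregexpr() + regmatches() collects every match.
card <- "me credit card: 334"
digits <- regmatches(card, gregexpr("\\d", card))[[1]]

print(escaped)
print(digits)
```

When the separator is known to be a literal string, `fixed = TRUE` tends to be the less error-prone choice.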
^
Matches the start of a string. When ^ is placed first inside a character class, it negates the class instead: [^5] matches any character except "5".
test_vector <- c("123","456","321")
str_extract_all(test_vector, "3")
str_extract_all(test_vector, "^3")
str_extract_all(test_vector, "[^3]")
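The two roles of `^` can be compared side by side. This is a sketch in base R; `extract_all` is a hypothetical helper defined here to mimic the list-per-element shape of `str_extract_all()`, not a function from any package.

```r
test_vector <- c("123", "456", "321")

# Hypothetical helper: all matches for each element, returned as a list.
extract_all <- function(x, pattern) regmatches(x, gregexpr(pattern, x))

any_three <- extract_all(test_vector, "3")     # a "3" anywhere in the string
leading   <- extract_all(test_vector, "^3")    # a "3" only at the very start
not_three <- extract_all(test_vector, "[^3]")  # every character that is not "3"

print(lengths(any_three))
print(lengths(leading))
print(not_three)
```

Only "321" starts with "3", so `leading` has a match for the third element alone, while `not_three` keeps the remaining digits of every element.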
$
Matches the end of a string. Inside a character class it loses its special meaning: [akm$] matches a, k, m, or $.
str_extract_all(test_vector, "3$")
str_extract_all(test_vector, "[3$]")
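To make the difference concrete, the sketch below contrasts `$` as an anchor with `$` as a literal inside a character class. It uses base R only, and the sample string `s` is made up for illustration.

```r
test_vector <- c("123", "456", "321")

# As an anchor: TRUE only for strings whose last character is "3".
ends_with_3 <- grepl("3$", test_vector)

# Inside a class: "$" is an ordinary character, so [3$] matches "3" or "$".
s <- "cost: $3"  # illustrative sample string
class_hits <- regmatches(s, gregexpr("[3$]", s))[[1]]

print(ends_with_3)
print(class_hits)
```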
str_extract_all(string = c("regular.exp\n", "\n"), pattern = ".")  # "." matches any character except a newline
str_extract_all(string = "we23", pattern ="b|w|3")
?
The character (or group) preceding this symbol is optional and is matched at most once.
str_extract_all(string = c("abc","bc","ac"),pattern = "ab?c")
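Read left to right, `ab?c` means an "a", then an optional "b", then a "c". A small base R sketch of which inputs qualify (variable names are illustrative):

```r
x <- c("abc", "bc", "ac")

# "bc" fails because the leading "a" is mandatory; only "b" is optional.
has_match <- grepl("ab?c", x)

# regexpr() finds the first match per element; regmatches() drops non-matches.
matches <- regmatches(x, regexpr("ab?c", x))

print(has_match)
print(matches)
```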
( )
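Denotes a group: the string inside the parentheses is matched as a single unit, so a following quantifier applies to the whole group.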
str_extract_all(string = c("abc","ac","cde"),pattern = "(ab)c")
str_extract_all(string = c("abab","abc","ac"),pattern = "(ab)*")
str_extract_all(string = c("abbab","abc","ac"),pattern = "ab+")
str_extract_all(string = c("abababab","ababc","abc"),pattern = "(ab){2}")
str_extract_all(string = c("abababab","ababc","abc"),pattern = "(ab){2,}")
str_extract_all(string = c("abababab","ababc","abc"),pattern = "(ab){2,3}")
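The counted quantifiers `{n}`, `{n,}` and `{n,m}` repeat the whole group when they follow parentheses. The sketch below reproduces the three patterns in base R; `extract_all` is again a hypothetical helper standing in for `str_extract_all()`.

```r
x <- c("abababab", "ababc", "abc")

# Hypothetical helper: all matches for each element, returned as a list.
extract_all <- function(x, pattern) regmatches(x, gregexpr(pattern, x))

exactly_two  <- extract_all(x, "(ab){2}")    # "ab" repeated exactly twice
two_or_more  <- extract_all(x, "(ab){2,}")   # "ab" repeated at least twice
two_to_three <- extract_all(x, "(ab){2,3}")  # "ab" repeated two or three times

print(exactly_two)
print(two_or_more)
print(two_to_three)
```

Matching is greedy, so `(ab){2,3}` consumes "ababab" from "abababab", and the trailing "ab" is too short to match again. "abc" contains only one "ab" and never matches.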