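A regular expression is a logical formula for operating on strings: a "pattern string" built from predefined special characters and their combinations, which expresses a filtering rule for strings. Used appropriately, regular expressions can make work more efficient. Regular expression help document link: https://pan.baidu.com/s/1Sws9HBQSR4XSJQZ1dm9G0w password: 178u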
The file regular_express.txt used here is shown below (22 lines; the last one is empty):

    "Open Source" is a good mechanism to develop programs.
    apple is my favorite food.
    Football game is not use feet only.
    this dress doesn't fit me.
    However, this dress is about $ 3183 dollars.
    GNU is free air not free beer.
    Her hair is very beauty.
    I can't finish the test.
    Oh! The soup taste good.
    motorcycle is cheap than car.
    This window is clear.
    the symbol '*' is represented as start.
    Oh! My god!
    The gd software is a library for drafting programs.
    You are the best is mean you are the no. 1.
    The world <Happy> is the same with "glad".
    I like dog.
    google is the best tools for search keyword.
    goooooogle yes!
    go! go! Let's go.
    # I am VBird

Running cat -An regular_express.txt gives the following (for why some lines end in ^M$, see the Windows vs. Linux line-ending rules). In this output, $ marks the end of each line, ^M is a carriage return, and ^I is a tab:

    vbird@Ubuntu1604:~$ cat -An regular_express.txt.bak
         1  "Open Source" is a good mechanism to develop programs.$
         2  apple is my favorite food.$
         3  Football game is not use feet only.$
         4  this dress doesn't fit me.$
         5  However, this dress is about $ 3183 dollars.^M$
         6  GNU is free air not free beer.^M$
         7  Her hair is very beauty.^M$
         8  I can't finish the test.^M$
         9  Oh! The soup taste good.^M$
        10  motorcycle is cheap than car.$
        11  This window is clear.$
        12  the symbol '*' is represented as start.$
        13  Oh!^IMy god!$
        14  The gd software is a library for drafting programs.^M$
        15  You are the best is mean you are the no. 1.$
        16  The world <Happy> is the same with "glad".$
        17  I like dog.$
        18  google is the best tools for search keyword.$
        19  goooooogle yes!$
        20  go! go! Let's go.$
        21  # I am VBird$
        22  $
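The ^M markers matter for patterns anchored at the end of a line. The following is a minimal sketch, assuming GNU grep and a POSIX shell; the file sample.txt and its two lines are made up for illustration and are not part of regular_express.txt.

```shell
# Build a tiny sample: line 1 ends in LF only, line 2 ends in CRLF (Windows style).
tmpdir=$(mktemp -d)
printf 'unix line.\nwindows line.\r\n' > "$tmpdir/sample.txt"

# A literal carriage return to use inside patterns.
cr=$(printf '\r')

# Count the lines that end with a carriage return.
crlf_count=$(grep -c "${cr}\$" "$tmpdir/sample.txt")

# '\.$' needs the period to be the very last character of the line,
# so the CRLF line is missed until the carriage returns are stripped.
before=$(grep -c '\.$' "$tmpdir/sample.txt")
after=$(tr -d '\r' < "$tmpdir/sample.txt" | grep -c '\.$')

echo "CRLF lines: $crlf_count, matches before: $before, after: $after"
rm -r "$tmpdir"
```

Stripping the carriage returns with tr is only one option; the point is that a hidden ^M sits between the visible last character and the end of the line.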
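Basic regular expression notation
Ordinary characters: letters, digits, Chinese characters, underscores, and any punctuation that is not given a special meaning in later sections are all "ordinary characters". An ordinary character in an expression matches one identical character in the string. See Example 1.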
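Simple escape characters: characters that are awkward to write are expressed by putting "\" in front of them. Punctuation marks that have special uses in later sections stand for themselves when preceded by "\". For example, ^ and $ both have special meanings, so to match the literal "^" and "$" characters in a string, the expression must be written as "\^" and "\$".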
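Expression | Matches | Expression | Matches |
\r, \n | carriage return or newline | \^ | the ^ character itself |
\t | tab | \$ | the $ character itself |
\\ | the \ character itself | \. | the . character itself |
\d | one digit, equivalent to [0-9] | \D | one non-digit, equivalent to [^0-9] |
\w | any word character, including the underscore; equivalent to [A-Za-z0-9_] |
\W | any non-word character; equivalent to [^A-Za-z0-9_] |
Note that \r, \n, \t, \d, and \D come from Perl-style regular expressions. GNU grep accepts \w and \W in its default mode, but the others need the -P option.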
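The effect of escaping can be checked directly with grep. This is a small sketch under the assumption of a POSIX shell and grep; sample.txt is a hypothetical file holding three lines borrowed from regular_express.txt.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' \
  "the symbol '*' is represented as start." \
  'However, this dress is about $ 3183 dollars.' \
  'goooooogle yes!' > "$tmpdir/sample.txt"

# An escaped '$' matches a literal dollar sign instead of the end of the line.
dollar_lines=$(grep -n '\$' "$tmpdir/sample.txt")

# An escaped '*' matches a literal asterisk instead of acting as a repeat operator.
star_count=$(grep -c '\*' "$tmpdir/sample.txt")

# An escaped '.' matches only a real period; an unescaped '.' matches any character.
period_count=$(grep -c '\.' "$tmpdir/sample.txt")
anychar_count=$(grep -c '.' "$tmpdir/sample.txt")

echo "$dollar_lines"
echo "star: $star_count, period: $period_count, any char: $anychar_count"
rm -r "$tmpdir"
```

Single quotes are used around each pattern so that the shell passes the backslash and the dollar sign to grep unchanged.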
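Custom expressions that match "several possible characters": square brackets [ ] enclose a set of characters and match any one of them. [^ ] encloses a set of characters and matches any one character that is not in the set. In both cases the expression can match any member of the set, but only one character, never several. See Examples 2, 3, 4, and 5.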
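Expression | Matches |
[abcd] | exactly one of a, b, c, or d |
[^abc] | any one character other than a, b, or c |
[f-k] | any one character from "f" to "k" |
[^A-F0-5] | any one character outside "A"~"F" and "0"~"5" |
x|y | x or y. For example, "z|food" matches "z" or "food", and "(z|f)ood" matches "zood" or "food" (in grep this needs the -E option) |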
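The examples in this article do not cover alternation, so here is a minimal sketch of it. It assumes a grep that supports the -E and -o options (GNU and BSD grep both do); the file sample.txt and its contents are invented for illustration.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' \
  'apple is my favorite food.' \
  'I like dog.' \
  'zood is not a word.' > "$tmpdir/sample.txt"

# -E enables extended regular expressions, where | and ( ) work unescaped.
# -o prints only the matched text, one match per line.
matches=$(grep -E -o '(z|f)ood' "$tmpdir/sample.txt")

# The bracket expression [zf] expresses the same single-character choice
# and works in grep's default basic syntax.
bracket_matches=$(grep -o '[zf]ood' "$tmpdir/sample.txt")

echo "$matches"
rm -r "$tmpdir"
```

When the alternatives are single characters, the bracket form is usually the simpler choice; alternation is needed once the alternatives are longer strings.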
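Special symbols with abstract meaning: see Example 6.
Expression | Meaning |
^ | start of line; matches no character (Examples 6 and 7) |
$ | end of line; matches no character (Example 7) |
. | the dot matches any single character except the newline (\n), exactly one (Example 8) |
* | repeats the preceding character zero or more times (Example 9) |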
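Special symbols that modify the number of repetitions:
Expression | Meaning |
{n} | repeats the expression n times; "A{2}" is equivalent to "AA" (Example 10) |
{m,n} | repeats the expression m to n times; "AB{2,4}" is equivalent to "ABB", "ABBB", or "ABBBB" (Example 12) |
{n,} | repeats the expression at least n times; "AB{2,}" is equivalent to "ABB", "ABBB", and so on (Example 11) |
+ | one or more of the preceding RE character; "go+d" matches "god", "good", "goood", and so on. Same as "go{1,}d" |
? | zero or one of the preceding RE character; "go?d" matches "gd" or "god". Same as "go{0,1}d" |
In grep's default basic syntax the braces must be written as \{ and \}; unescaped braces, + and ? are available with grep -E.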
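Since + and ? have no example of their own in this article, the following sketch shows them next to the equivalent interval form. It assumes grep with the -E and -x options; words.txt is a hypothetical word list, one word per line.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'gd' 'god' 'good' 'goood' 'glad' > "$tmpdir/words.txt"

# -x requires the whole line to match, so each word is tested in full.
plus_matches=$(grep -E -x 'go+d' "$tmpdir/words.txt")      # one or more 'o'
optional_matches=$(grep -E -x 'go?d' "$tmpdir/words.txt")  # zero or one 'o'

# In basic syntax the same "one or more" interval uses escaped braces.
interval_matches=$(grep -x 'go\{1,\}d' "$tmpdir/words.txt")

echo "go+d:  $(echo $plus_matches)"
echo "go?d:  $(echo $optional_matches)"
rm -r "$tmpdir"
```

Without -x, grep would report any line that contains a match anywhere, which makes the difference between the quantifiers harder to see.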
Example 1.

    vbird@Ubuntu1604:~$ grep -n "the" regular_express.txt
    8:I can't finish the test.
    12:the symbol '*' is represented as start.
    15:You are the best is mean you are the no. 1.
    16:The world <Happy> is the same with "glad".
    18:google is the best tools for search keyword.
Example 2.

    vbird@Ubuntu1604:~$ grep -n "t[ae]st" regular_express.txt
    8:I can't finish the test.
    9:Oh! The soup taste good.
Example 3.

    vbird@Ubuntu1604:~$ grep -n "[^g]oo" regular_express.txt
    2:apple is my favorite food.
    3:Football game is not use feet only.
    18:google is the best tools for search keyword.
    19:goooooogle yes!

In line 18, "google" itself does not match, but "tools" does; grep works on whole lines. In line 19, "goooooogle" contains several "oo" pairs, some of them preceded by an "o", so the line matches.
Example 4. Find "oo" that is not preceded by a lowercase letter.

    vbird@Ubuntu1604:~$ grep -n "[^a-z]oo" regular_express.txt
    3:Football game is not use feet only.
Example 5.

    vbird@Ubuntu1604:~$ grep -n "[0-9]" regular_express.txt
    5:However, this dress is about $ 3183 dollars.
    15:You are the best is mean you are the no. 1.
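Example 6. The ^ outside [] means start of line, while the ^ inside [] negates the set. This expression finds lines whose first character is neither a letter nor a digit.

    vbird@Ubuntu1604:~$ grep -n "^[^a-zA-Z0-9]" regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    21:# I am VBird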
Example 7. Find blank lines.

    vbird@Ubuntu1604:~$ grep -n "^$" regular_express.txt
    22:
Example 8.

    vbird@Ubuntu1604:~$ grep -n "g..d" regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    9:Oh! The soup taste good.
    16:The world <Happy> is the same with "glad".
Example 9.

    vbird@Ubuntu1604:~$ grep -n "ooo*" regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    9:Oh! The soup taste good.
    18:google is the best tools for search keyword.
    19:goooooogle yes!
Example 10.

    vbird@Ubuntu1604:~$ grep -n "go\{2\}d" regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    9:Oh! The soup taste good.
Example 11.

    vbird@Ubuntu1604:~$ grep -n "go\{2,\}" regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    9:Oh! The soup taste good.
    18:google is the best tools for search keyword.
    19:goooooogle yes!
Example 12.

    vbird@Ubuntu1604:~$ grep -n "go\{2,3\}g" regular_express.txt
    18:google is the best tools for search keyword.
Combined example. In the file /etc/manpath.config, remove blank lines and lines that start with #, then search the remaining data for lines containing the string "opt".

    vbird@Ubuntu1604:~$ grep -v "^$" /etc/manpath.config | grep -v "^#" | grep "opt"
    MANPATH_MAP /opt/bin /opt/man
    MANPATH_MAP /opt/sbin /opt/man
    MANDB_MAP /opt/man /var/cache/man/opt
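The two grep -v stages can also be folded into one call by passing several patterns with -e. The sketch below is one possible variant, not the author's method; it runs against a made-up sample.config rather than the real /etc/manpath.config so that it works on any machine.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.config" <<'EOF'
# comment mentioning /opt

MANPATH_MAP /opt/bin /opt/man
MANPATH_MAP /usr/bin /usr/share/man
EOF

# Three-stage pipeline: drop blank lines, drop comment lines, keep "opt" lines.
piped=$(grep -v '^$' "$tmpdir/sample.config" | grep -v '^#' | grep 'opt')

# Two -e patterns with -v drop blank lines and comment lines in a single pass.
combined=$(grep -v -e '^$' -e '^#' "$tmpdir/sample.config" | grep 'opt')

echo "$combined"
rm -r "$tmpdir"
```

Both forms give the same result here; the comment line is dropped even though it contains "opt", because the filtering happens before the final search.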