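Regular Expressions
I. Motivation
  1. Text processing is one of the most common tasks computers perform.
  2. Searching, locating, and extracting text content involves fairly complex logic.
  3. Regular expressions were developed to solve these problems quickly and conveniently.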
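II. Introduction
  Definition: a regular expression is an advanced pattern for matching text that supports operations such as searching and replacing. In essence it is a string made up of ordinary characters and special symbols.
  How matching works: ordinary characters are combined with characters that carry a special meaning (repetition, position, and so on) into a string that describes a rule; any text that follows the rule is matched.
  Goals:
    1. Know the regular expression symbols well.
    2. Be able to read common regular expressions and write basic ones that match given content.
    3. Be fluent in using the re module to work with regular expressions.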
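III. Using Metacharacters
  1. Ordinary characters
    Rule: each ordinary character matches itself.
        In [14]: re.findall('ab','abcdefabcda')
        Out[14]: ['ab', 'ab']

    Note: regular expressions can also match Chinese characters.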
     
  2. Alternation
    Metacharacter: |
    Rule: matches the regular expression on either side of |.
        In [17]: re.findall('ab|ef','abcdefabcda')
        Out[17]: ['ab', 'ef', 'ab']
     
  3. Matching the start position
    Metacharacter: ^
    Rule: matches the start of the target string.
        In [21]: re.findall('^Jame','Jame,hello')
        Out[21]: ['Jame']
     ![_thumb1[4] _thumb1[4]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091004262-1369148854.png)
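  4. Matching the end position
    Metacharacter: $
    Rule: matches the end of the target string.
        In [24]: re.findall('Jame$','Hi,Jame')
        Out[24]: ['Jame']

    Note: ^ normally appears at the start of a regular expression and $ at the end. When both are present, the expression has to match the entire target string.
![_thumb1[5] _thumb1[5]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091006426-916168480.png)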
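The effect of using ^ and $ separately and together can be checked with a small sketch. It reuses the target strings from the examples above; the variable names are only illustrative.

```python
import re

targets = ['Jame', 'Jame,hello', 'Hi,Jame']

# For each pattern, collect the target strings in which it finds a match.
results = {}
for pattern in ['^Jame', 'Jame$', '^Jame$']:
    results[pattern] = [t for t in targets if re.findall(pattern, t)]
    print(pattern, results[pattern])
```

Only the string that consists of nothing but Jame survives the pattern that carries both anchors.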
  5. Matching any character
    Metacharacter: .
    Rule: matches any single character except a newline.
        In [32]: re.findall('小.',"小红说小明的成绩不如小王。")
        Out[32]: ['小红', '小明', '小王']
       
   
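  6. Matching a character from a set
    Metacharacter: [character set]
    Rule: matches any single character in the set.
    Forms: [#abc好坏]        --> any one of the characters listed inside [], e.g. a, b, c
           [0-9] [a-z] [A-Z] --> any one character in the range
           [_#%a-z0-9]       --> mixed form; ranges are usually written last
        In [37]: re.findall('[aeiou0-9]',"hello 119")
        Out[37]: ['e', 'o', '1', '1', '9']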
     ![_thumb1[2] _thumb1[2]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091009289-1465389738.png)
  7. Matching the complement of a character set
    Metacharacter: [^...]
    Rule: matches any single character that is not in the set.
        In [40]: re.findall('[^0-9]',"hello 007")
        Out[40]: ['h', 'e', 'l', 'l', 'o', ' ']
   ![_thumb1[3] _thumb1[3]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091010346-1062869987.png)
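  8. Matching repetition
    Metacharacter: *
    Rule: matches the preceding character 0 or more times.
        In [42]: re.findall('ab*',"abbcadefabbbbbb")
        Out[42]: ['abb', 'a', 'abbbbbb']
    Note: when b occurs 0 times, ab* matches a, not ab. The * and the character in front of it form a single unit.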
     ![_thumb1[1] _thumb1[1]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091011451-1048004851.png)
  9. Matching repetition
    Metacharacter: +
    Rule: matches the preceding character 1 or more times.
        In [44]: re.findall('ab+',"abbcadefabbbbbb")
        Out[44]: ['abb', 'abbbbbb']
     
  10. Matching repetition
    Metacharacter: ?
    Rule: matches the preceding character 0 or 1 times.
        In [59]: re.findall('https?',"http,https://abce")
        Out[59]: ['http', 'https']
  11. Matching repetition
    Metacharacter: {n}
    Rule: matches the preceding character exactly n times.
        In [62]: re.findall('[0-9]{3}',"100,12306,10086,10010")
        Out[62]: ['100', '123', '100', '100']
   
  12. Matching repetition
    Metacharacter: {m,n}
    Rule: matches the preceding character between m and n times.
        In [63]: re.findall('[_0-9a-zA-Z]{6,8}',"123abcdef")
        Out[63]: ['123abcde']
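To compare the five repetition metacharacters side by side, the following sketch runs each of them against the same made-up target string (the string and variable names are assumptions chosen for illustration).

```python
import re

text = "a ab abb abbb"

# Each pattern repeats the character b a different number of times.
results = {
    'ab*': re.findall('ab*', text),          # b zero or more times
    'ab+': re.findall('ab+', text),          # b one or more times
    'ab?': re.findall('ab?', text),          # b zero or one time
    'ab{2}': re.findall('ab{2}', text),      # b exactly twice
    'ab{2,3}': re.findall('ab{2,3}', text),  # b two to three times
}
for pattern, found in results.items():
    print(pattern, found)
```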
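  13. Matching (non-)digit characters
    Metacharacters: \d  \D
    Rule: \d matches any digit character, [0-9] for ASCII text
          \D matches any non-digit character, [^0-9] for ASCII text
        In [66]: re.findall(r'1\d{10}',"13866495723")
        Out[66]: ['13866495723']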
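  14. Matching (non-)word characters

    Metacharacters: \w  \W
    Rule: \w matches a word character
          \W matches a non-word character
    Explanation: word characters are digits, letters, underscores, and ordinary Chinese characters.
        In [69]: re.findall(r'\w+','PORT_1065,Error 44%,下降')
        Out[69]: ['PORT_1065', 'Error', '44', '下降']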
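  15. Matching (non-)whitespace characters

    Metacharacters: \s  \S
    Rule: \s matches any single whitespace character
          \S matches any single non-whitespace character
    Explanation: whitespace characters are the space and the \r \n \t \v \f characters.
        In [72]: re.findall(r'\w+\s+\w+','hello   world')
        Out[72]: ['hello   world']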
     
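  16. Matching the start and end of the string
    Metacharacters: \A  \Z
    Rule: \A matches the start of the string, like ^
          \Z matches the end of the string, like $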
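  17. Matching (non-)word boundaries
    Metacharacters: \b  \B
    Rule: \b matches a word boundary position
          \B matches a position that is not a word boundary
    Explanation: a word boundary is the position where a word character (digit, letter, underscore, or Chinese character) meets any other character.
        In [77]: re.findall(r'\Bis','This is a test')
        Out[77]: ['is']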
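Because both patterns return the text is, the result list alone does not show which occurrence was matched. A minimal sketch that prints the match positions instead (variable names are illustrative) makes the difference visible, and also shows why the pattern should be a raw string:

```python
import re

text = 'This is a test'

# \b...\b: "is" as a standalone word (starts at index 5)
whole_word = [m.start() for m in re.finditer(r'\bis\b', text)]
# \B: "is" that does not start at a word boundary (inside "This", index 2)
inside_word = [m.start() for m in re.finditer(r'\Bis', text)]
# Without the r prefix, '\b' is a backspace character, so nothing matches.
not_raw = re.findall('\bis', text)

print(whole_word, inside_word, not_raw)
```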

  Summary

    Matching a single character: .  [...]  [^...]  \d  \D  \w  \W  \s  \S
    Matching repetition: *  +  ?  {n}  {m,n}
    Matching a position: ^  $  \A  \Z  \b  \B
    Other: |  ()
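IV. Escaping in Regular Expressions
  1. Special symbols: . * + ? ^ $ [] () {} | \
  2. To match one of these special characters literally, escape it with a backslash.

        In [87]: re.findall('\\$\\d+','$100')
        Out[87]: ['$100']
  3. Using raw strings
        Python string   -->   regex   -->   target string
        "\\$\\d+"             \$\d+         "$100"
        r"\$\d+"

    * To avoid the hassle of escaping special characters a second time inside a Python string, regular expressions are usually written as raw strings.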
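The two spellings in the table can be compared directly. The sketch below also uses re.escape from the standard library, which adds the backslashes to a literal string automatically; the sample text is made up for illustration.

```python
import re

plain = "\\$\\d+"   # ordinary string: every backslash is doubled
raw = r"\$\d+"      # raw string: written exactly as the regex reads
same = (plain == raw)

prices = re.findall(raw, "price: $100, discount: $25")

# re.escape builds an escaped pattern from a literal string.
escaped = re.escape("$100")
literal_hits = re.findall(escaped, "total $100")

print(same, prices, escaped, literal_hits)
```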
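V. Greedy and Non-Greedy Matching
  Greedy mode: by default a repetition matches as much text as possible. This applies to * + ? {m,n}.
  Non-greedy (lazy) mode: the repetition stops as soon as the repeat condition is satisfied and matches as little as possible.
  Greedy --> non-greedy: *? +? ?? {m,n}?
        In [109]: re.findall(r'a.*?b','acdb,aiob,aedb')
        Out[109]: ['acdb', 'aiob', 'aedb']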
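Running the greedy and the non-greedy form against the same target string shows the contrast. This sketch reuses the string from the example above and adds a made-up digit string for {m,n}.

```python
import re

text = 'acdb,aiob,aedb'

greedy = re.findall(r'a.*b', text)   # runs on to the last b
lazy = re.findall(r'a.*?b', text)    # stops at the first b it can

digits_greedy = re.findall(r'\d{2,4}', '123456')   # takes 4 digits when possible
digits_lazy = re.findall(r'\d{2,4}?', '123456')    # settles for 2 digits

print(greedy, lazy, digits_greedy, digits_lazy)
```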
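VI. Grouping in Regular Expressions
  Definition: parentheses () create a group (subgroup) inside a regular expression; the group is treated as a single unit within the expression.
  Uses:
    1. A group can be operated on as a whole, which changes what some metacharacters apply to.

        In [115]: re.search(r'\w+\.(Green|Lei)','Jame.Lei').group()
        Out[115]: 'Jame.Lei'
        In [112]: re.search(r'(ab)+','ababababababab').group()
        Out[112]: 'ababababababab'

    2. The text matched by a group can be retrieved on its own.
        In [118]: re.search(r'(http|https|ftp|file)://\S+','file://xxxxxx').group(1)
        Out[118]: 'file'
    3. Naming a group (named capture group)
        Format: (?P<name>pattern)

        re.search(r'(?P<pig>ab)cd(ef)','abcdefgh').group('pig')

        Purpose: a name can convey meaning, and the group's content can be retrieved by that name.

    4. Notes
        [1] A regular expression can contain several groups.
        [2] Groups only take effect if the regular expression as a whole matches.
        [3] Groups are generally numbered from outside to inside and from left to right.
        [4] Do not overlap groups, and avoid nesting them too deeply.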
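The numbering rule and the named group can be illustrated with a nested pattern. The pattern, the group name user, and the sample strings below are assumptions made up for this sketch.

```python
import re

# Group 1 is named "user"; group 2 wraps group 3 (outer before inner).
pattern = r'(?P<user>\w+)@((\w+)\.com)'
m = re.search(pattern, 'mail: jame@example.com')

whole = m.group()          # the entire match
by_index = (m.group(1), m.group(2), m.group(3))
by_name = m.group('user')  # same content as group 1
all_groups = m.groups()
named = m.groupdict()

# If the whole pattern does not match, there is no match object at all.
no_match = re.search(pattern, 'no address here')

print(whole, by_index, by_name, all_groups, named, no_match)
```

Checking for None before calling group() avoids an AttributeError when nothing matches.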
         
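VII. Principles for Writing Regular Expressions
  1. Correctness: the expression matches the target strings correctly.
  2. Exclusivity: apart from the target content, it matches as little else as possible.
  3. Completeness: it considers the possible forms of the target string as fully as possible, so nothing is missed.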
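VIII. Using the Python re Module
  regex = re.compile(pattern,flags=0)
    Purpose: creates a regular expression object.
    Parameters: pattern  the regular expression
                flags    flags that extend the matching behavior
    Returns: a regular expression object
  re.findall(pattern,string,flags=0)
    Purpose: matches the regular expression against the target string.
    Parameters: pattern  the regular expression
                string   the target string
    Returns: a list of the matched content; if the regular expression contains groups, only the content of the groups is returned.
  regex.findall(string,pos,endpos)
    Purpose: matches the regular expression against the target string.
    Parameters: string  the target string
                pos     position where the search starts, by default the start of the string
                endpos  position where the search ends, by default the end of the string
    Returns: a list of the matched content; if the regular expression contains groups, only the content of the groups is returned.
  re.split(pattern,string,flags=0)
    Purpose: splits the string at every place the regular expression matches.
    Parameters: pattern  the regular expression
                string   the target string
    Returns: the list of strings after splitting
  re.sub(pattern,repl,string,count=0,flags=0)
    Purpose: replaces the content matched by the regular expression with the given string.
    Parameters: pattern  the regular expression
                repl     the replacement string
                string   the target string
                count    maximum number of replacements; the default 0 replaces every match
    Returns: the string after replacement
  re.subn(): same purpose and parameters as sub; the return value additionally includes the number of replacements actually made.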
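A short sketch of split, sub, and subn on made-up input strings follows. Passing count as a keyword argument keeps the call unambiguous.

```python
import re

# Split on any run of commas, semicolons, or whitespace.
parts = re.split(r'[,;\s]+', 'a, b;c  d')

# Replace only the first two runs of digits.
limited = re.sub(r'\d+', '#', 'a1b22c333', count=2)

# subn returns the new string together with the number of replacements.
replaced, how_many = re.subn(r'\d+', '#', 'a1b22c333')

print(parts, limited, replaced, how_many)
```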
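  re.finditer(pattern,string,flags=0)
    Purpose: matches the regular expression against the target string.
    Parameters: pattern  the regular expression
                string   the target string
    Returns: an iterator that yields one match object per match
  re.fullmatch(pattern,string,flags=0)
    Purpose: matches only if the regular expression matches the entire target string.
    Parameters: pattern  the regular expression
                string   the target string
    Returns: a match object, or None if there is no match

  re.match(pattern,string,flags=0)
    Purpose: matches at the start of the target string.
    Parameters: pattern  the regular expression
                string   the target string
    Returns: a match object, or None if there is no match

  re.search(pattern,string,flags=0)
    Purpose: finds the first place in the target string that matches.
    Parameters: pattern  the regular expression
                string   the target string
    Returns: a match object, or None if there is no match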
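The difference between match, search, and fullmatch shows up when the same pattern is tried with each of them. The sketch below reuses the string Hi,Jame from the earlier examples; the variable names are illustrative.

```python
import re

text = 'Hi,Jame'

at_start = re.match(r'Jame', text)       # None: text does not start with Jame
anywhere = re.search(r'Jame', text)      # finds Jame at index 3
entire = re.fullmatch(r'Jame', text)     # None: Jame is not the whole string
entire_ok = re.fullmatch(r'\w+,\w+', text)

print(at_start, anywhere.span(), entire, entire_ok.group())
```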
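  Attributes of compiled pattern objects

    [1] flags : the flags value
    [2] pattern : the regular expression
    [3] groups : the number of groups
    [4] groupindex : a dictionary that maps group names to group numbers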
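These attributes, together with the pos and endpos arguments of regex.findall, can be inspected on a small compiled pattern. The pattern and target string below are made up for illustration.

```python
import re

regex = re.compile(r'(?P<name>\w+):(\d+)', flags=re.I)
text = 'zhang:1994 li:1993'

source = regex.pattern                 # the pattern string
group_count = regex.groups             # number of groups
name_map = dict(regex.groupindex)      # group name -> group number
ignores_case = bool(regex.flags & re.I)

# With groups in the pattern, findall returns tuples of group contents.
everything = regex.findall(text)
# pos / endpos restrict the part of the string that is searched.
from_11 = regex.findall(text, 11)
up_to_10 = regex.findall(text, 0, 10)

print(source, group_count, name_map, ignores_case)
print(everything, from_11, up_to_10)
```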
 
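Homework:
  1. Get comfortable with the regular expression metacharacters.
  2. Practice redoing the re module function calls through a regex object.
  3. Pick a document and do the following:
    [1] Find all words that start with an uppercase letter.
    [2] Find all numbers in it, including integers, decimals, fractions, percentages, and negative numbers (123 1.23 -1.5 -6 45% 1/2).
    [3] Change every date of the form 2019-1-23 to 2019.1.23.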
Sample document (the file test):

    Regular Expression Syntax
    A regular expression (or RE) specifies a set of strings that matches it; the functions in this module let you 1/2 check 123 if a particular string matches a given regular expression (or if a given regular 45% expression 45.5%matches a particular string, which comes down to the same thing). testMar

    Regular expressions No.1 can be 1.234 concatenated to form new regular expressions; if A and B are both regular expressions, -3 then AB is also a regular expression. In general, if a -1.2 string p matches A and another string q matches B, the string pq will match AB. This holds unless A or B contain low precedence operations; boundary conditions between A and B; or have numbered group references. Thus, complex expressions can easily be constructed from simpler primitive expressions like the ones described here. For details of the theory and implementation of regular expressions, consult the Friedl book [Frie09], or almost any textbook about compiler construction.

    A brief explanation of the format of regular expressions follows. For further information and a gentler presentation, consult the Regular Expression HOWTO. Python3
    20190206
    2019-02-06
 
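    import re

    # Read the sample document
    with open('test') as f:
        data = f.read()

    # Words that start with an uppercase letter
    # (\b keeps the match at the start of a word)
    pattern1 = r'\b[A-Z]\S*'
    # Numbers: integers, decimals, fractions, percentages, negatives
    pattern2 = r"-?\d+\.?/?\d*%?"
    # Dates to be reformatted
    pattern3 = r'\d{4}-\d{1,2}-\d{1,2}'

    regex = re.compile(pattern3)
    for i in regex.finditer(data):
        # print(i.group())
        s = i.group()
        # Replace the hyphens inside the matched date with dots
        print(re.sub(r'-', '.', s))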
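As an alternative sketch for the date task (not the approach used in the script above), the year, month, and day can be captured as groups and re-inserted with the backreferences \1, \2, \3, so the whole text is rewritten in a single sub call. The sample text is taken from the end of the test document.

```python
import re

text = 'Python3 20190206 2019-02-06 and 2019-1-23'

# Capture year, month, and day, then rebuild the date with dots.
date = re.compile(r'(\d{4})-(\d{1,2})-(\d{1,2})')
converted = date.sub(r'\1.\2.\3', text)

print(converted)
```

The string 20190206 is left alone because it contains no hyphens.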
****************************************************************************
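Review:
    1. What a regular expression is
    2. Metacharacters
    3. Escaping, greedy matching, grouping
    4. Matching principles: correctness, exclusivity
    5. Using the re module
            calls through the re module
            calls through compiled pattern objects
            calls through match objects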
****************************************************************************
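I. Attributes and Methods of match Objects
    1. Attributes
        pos       : position in the target string where the search starts
        endpos    : position in the target string where the search ends
        re        : the regular expression
        string    : the target string
        lastgroup : name of the last matched group
        lastindex : number of the last matched group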
     
 
    import re

    pattern = r"(ab)cd(?P<pig>ef)"

    regex = re.compile(pattern)

    # Get a match object
    # obj = regex.search('abcdefgh')
    obj = regex.search('abcdefgh', pos=0, endpos=6)

    # ---- match object attributes ----
    print(obj.pos)
    print(obj.endpos)
    print(obj.re)
    print(obj.string)

    print(obj.lastgroup)
    print(obj.lastindex)

    # ---- match object methods ----
    print("******** match object methods ********")
    print(obj.span())
    print(obj.start())
    print(obj.end())

    print(obj.groupdict())
    print(obj.groups())

    print(obj.group())
    print(obj.group(0))
    print(obj.group(1))      # content of the first group
    print(obj.group('pig'))  # content of the group named pig
![match属性_thumb[1] match属性_thumb[1]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091022346-206898241.png) 
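    2. Methods
        span()       start and end positions of the matched content
        start()      start position of the matched content
        end()        end position of the matched content
        groupdict()  dictionary of the named groups
        groups()     content of all groups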
![获取子组内容_thumb[1] 获取子组内容_thumb[1]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091023552-1591891030.png) 
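        group(n = 0)
            Purpose: gets the content matched by the match object.
            Parameter: the default 0 returns the whole match; a group number or group name returns the content of that group.
            Returns: the matched string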
 
 1 RP/0/RSP0/CPU0:2_c-leaf-1# show interfaces Thu Sep 7 15:17:18.514 UTC BVI1 is down, line protocol is down Interface state transitions: 0 Hardware is Bridge-Group Virtual Interface, address is 10f3.116c.e6a7 Internet address is 192.168.100.254/24 MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, loopback not set, ARP type ARPA, ARP timeout 04:00:00 Last input never, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets BVI100 is down, line protocol is down Interface state transitions: 6 Hardware is Bridge-Group Virtual Interface, address is 10f3.116c.e6a7 Internet address is 192.168.1.100/24 MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, loopback not set, Last link flapped 3w1d ARP type ARPA, ARP timeout 04:00:00 Last input 3w1d, output 3w1d Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 3196 packets input, 151340 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 65 broadcast packets, 12 multicast packets 3133 packets output, 132666 bytes, 0 total output drops Output 9 broadcast packets, 0 multicast packets BVI101 is down, line protocol is down Interface state transitions: 2 Hardware is Bridge-Group Virtual Interface, address is 10f3.116c.e6a7 Internet address is 1.1.1.2/24 MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, loopback not set, Last link flapped 8w0d ARP type ARPA, ARP timeout 
04:00:00 Last input never, output 8w2d Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 1 packets output, 42 bytes, 0 total output drops Output 1 broadcast packets, 0 multicast packets BVI201 is up, line protocol is up Interface state transitions: 1 Hardware is Bridge-Group Virtual Interface, address is 10f3.116c.e6a7 Internet address is 192.168.1.2/24 MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, loopback not set, Last link flapped 1w0d ARP type ARPA, ARP timeout 04:00:00 Last input 00:19:00, output 00:19:00 Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 48 packets input, 2880 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 2 broadcast packets, 0 multicast packets 47 packets output, 1974 bytes, 0 total output drops Output 1 broadcast packets, 0 multicast packets Loopback0 is up, line protocol is up Interface state transitions: 1 Hardware is Loopback interface(s) Internet address is 192.168.0.2/32 MTU 1500 bytes, BW 0 Kbit reliability Unknown, txload Unknown, rxload Unknown Encapsulation Loopback, loopback not set, Last link flapped 8w2d Last input Unknown, output Unknown Last clearing of "show interface" counters Unknown Input/output data rate is disabled. 
Loopback100 is up, line protocol is up Interface state transitions: 1 Hardware is Loopback interface(s) Description: finalBVI1 address is 10f3.116c.e6a7 Internet address is Unknown MTU 1500 bytes, BW 0 Kbit reliability Unknown, txload Unknown, rxload Unknown Encapsulation Loopback, loopback not set, Last link flapped 8w2d Last input Unknown, output Unknown Last clearing of "show interface" counters Unknown Input/output data rate is disabled. Null0 is up, line protocol is up Interface state transitions: 1 Hardware is Null interface Internet address is Unknown MTU 1500 bytes, BW 0 Kbit reliability 255/255, txload Unknown, rxload Unknown Encapsulation Null, loopback not set, Last link flapped BVI1 address is 10f3.116c.e6a7 nve1 is up, line protocol is not ready Interface state transitions: 1 Hardware is Overlay Internet address is Unknown MTU 1500 bytes, BW 0 Kbit reliability Unknown, txload Unknown, rxload Unknown Encapsulation Unknown(0), loopback not set, Last link flapped 8w2d Last input Unknown, output Unknown Last clearing of "show interface" counters Unknown Input/output data rate is disabled. 
tunnel-te11 is up, line protocol is up Interface state transitions: 3 Hardware is Tunnel-TE Internet address is 192.168.0.2/32 MTU 1500 bytes, BW 0 Kbit reliability 255/255, txload Unknown, rxload Unknown Encapsulation TUNNEL, loopback not set, Last link flapped 8w0d Last input never, output 00:00:00 Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 17000 bits/sec, 29 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 23593868 packets output, 1835278695 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets tunnel-te24 is administratively down, line protocol is administratively down Interface state transitions: 0 Hardware is Tunnel-TE Internet address is 192.168.0.2/32 MTU 1500 bytes, BW 0 Kbit reliability 255/255, txload Unknown, rxload Unknown Encapsulation TUNNEL, loopback not set, Last input never, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets MgmtEth0/RSP0/CPU0/0 is up, line protocol is up Interface state transitions: 21 Hardware is Management Ethernet, address is 10f3.114b.2539 (bia 10f3.114b.2539) Internet address is 10.124.3.85/24 MTU 1514 bytes, BW 1000000 Kbit (Max: 1000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 1000Mb/s, TFD, link type is autonegotiation output flow control is off, input flow control is off loopback not set, Last link flapped 1w4d ARP type ARPA, ARP timeout 04:00:00 Last input 00:00:10, output 00:00:10 Last clearing of "show interface" counters never 5 minute 
input rate 9000 bits/sec, 9 packets/sec 5 minute output rate 27000 bits/sec, 4 packets/sec 13161606523 packets input, 40516491337 bytes, 882227 total input drops 25547570 drops for unrecognized upper-level protocol Received 12930208959 broadcast packets, 16022186 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 4294967294 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 4310505650 packets output, 562967788049761 bytes, 0 total output drops Output 27 broadcast packets, 0 multicast packets 4294967293 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 21 carrier transitions MgmtEth0/RSP0/CPU0/1 is administratively down, line protocol is administratively down Interface state transitions: 0 Hardware is Management Ethernet, address is 10f3.114b.253a (bia 10f3.114b.253a) Internet address is Unknown MTU 1514 bytes, BW 1000000 Kbit (Max: 1000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Duplex unknown, 1000Mb/s, THD, link type is autonegotiation output flow control is off, input flow control is off loopback not set, Last input 6w6d, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 4294967295 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 4294967295 broadcast packets, 0 multicast packets 0 runts, 4294967295 giants, 0 throttles, 0 parity 4294967291 input errors, 4294967295 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets 4294967292 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 0 carrier transitions TenGigE0/0/1/0 is administratively down, line protocol is administratively down Interface state transitions: 0 Hardware is TenGigE, address is 10f3.114b.9790 (bia 
10f3.114b.9790) Layer 1 Transport Mode is LAN Internet address is Unknown MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 10000Mb/s, link type is force-up output flow control is off, input flow control is off Carrier delay (up) is 10 msec loopback not set, Last input never, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets 0 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 0 carrier transitions TenGigE0/0/1/1 is administratively down, line protocol is administratively down Interface state transitions: 0 Hardware is TenGigE, address is 10f3.114b.9791 (bia 10f3.114b.9791) Layer 1 Transport Mode is LAN Internet address is Unknown MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 10000Mb/s, link type is force-up output flow control is off, input flow control is off Carrier delay (up) is 10 msec loopback not set, Last input never, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast 
packets 0 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 0 carrier transitions TenGigE0/0/1/2 is administratively down, line protocol is administratively down Interface state transitions: 0 Hardware is TenGigE, address is 10f3.114b.9792 (bia 10f3.114b.9792) Layer 1 Transport Mode is LAN Internet address is Unknown MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 10000Mb/s, link type is force-up output flow control is off, input flow control is off Carrier delay (up) is 10 msec loopback not set, Last input never, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets 0 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 0 carrier transitions TenGigE0/0/1/3 is administratively down, line protocol is administratively down Interface state transitions: 0 Hardware is TenGigE, address is 10f3.114b.9793 (bia 10f3.114b.9793) Layer 1 Transport Mode is LAN Internet address is Unknown MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 10000Mb/s, link type is force-up output flow control is off, input flow control is off Carrier delay (up) is 10 msec loopback not set, Last input never, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total 
input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets 0 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 0 carrier transitions TenGigE0/0/2/0 is down, line protocol is down Interface state transitions: 0 Hardware is TenGigE, address is 10f3.114b.9778 (bia 10f3.114b.9778) Layer 1 Transport Mode is LAN Description: to sl-102-n9k-e1/8 Internet address is 1.0.0.1/30 MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 10000Mb/s, SR, link type is force-up output flow control is off, input flow control is off Carrier delay (up) is 10 msec loopback not set, ARP type ARPA, ARP timeout 04:00:00 Last input never, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets 0 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 0 carrier transitions TenGigE0/0/2/0.49 is down, line protocol is down Interface state transitions: 0 Hardware is VLAN sub-interface(s), address is 10f3.114b.9778 Description: To 4_pa-leaf l2transport Layer 2 Transport Mode MTU 1518 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability Unknown, txload Unknown, rxload Unknown Encapsulation 802.1Q Virtual LAN, Outer Match: Dot1Q VLAN 49 
Ethertype Any, MAC Match src any, dest any loopback not set, Last input never, output never Last clearing of "show interface" counters never 0 packets input, 0 bytes 0 input drops, 0 queue drops, 0 input errors 0 packets output, 0 bytes 0 output drops, 0 queue drops, 0 output errors TenGigE0/0/2/1 is up, line protocol is up Interface state transitions: 1 Hardware is TenGigE, address is 10f3.114b.9779 (bia 10f3.114b.9779) Layer 1 Transport Mode is LAN Description: To ucs vmnic2 Internet address is Unknown MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 10000Mb/s, link type is force-up output flow control is off, input flow control is off Carrier delay (up) is 10 msec loopback not set, Last link flapped 8w2d Last input 00:00:00, output 00:00:00 Last clearing of "show interface" counters never 5 minute input rate 252000 bits/sec, 175 packets/sec 5 minute output rate 247000 bits/sec, 175 packets/sec 141805642 packets input, 26104548967 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 234 broadcast packets, 1202796 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 142079782 packets output, 25645082088 bytes, 0 total output drops Output 97684 broadcast packets, 1371751 multicast packets 0 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 1 carrier transitions TenGigE0/0/2/1.12 is up, line protocol is up Interface state transitions: 1 Hardware is VLAN sub-interface(s), address is 10f3.114b.9779 Description: To 1_spine Internet address is 10.1.2.2/30 MTU 1518 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation 802.1Q Virtual LAN, VLAN Id 12, loopback not set, Last link flapped 8w2d ARP type ARPA, ARP timeout 04:00:00 Last input 00:00:00, output 00:00:00 Last clearing of "show 
interface" counters never 5 minute input rate 131000 bits/sec, 88 packets/sec 5 minute output rate 128000 bits/sec, 87 packets/sec 71010702 packets input, 13465886097 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 2 broadcast packets, 601452 multicast packets 71066920 packets output, 13267653721 bytes, 0 total output drops Output 1 broadcast packets, 601453 multicast packets TenGigE0/0/2/1.25 is up, line protocol is up Interface state transitions: 1 Hardware is VLAN sub-interface(s), address is 10f3.114b.9779 Description: To 5_vpe-1 Internet address is 10.2.5.1/30 MTU 1518 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation 802.1Q Virtual LAN, VLAN Id 25, loopback not set, Last link flapped 8w2d ARP type ARPA, ARP timeout 04:00:00 Last input 00:00:00, output 00:00:00 Last clearing of "show interface" counters never 5 minute input rate 105000 bits/sec, 59 packets/sec 5 minute output rate 19000 bits/sec, 29 packets/sec 47250949 packets input, 10899616965 bytes, 1 total input drops 0 drops for unrecognized upper-level protocol Received 2 broadcast packets, 601338 multicast packets 24480721 packets output, 2707473085 bytes, 0 total output drops Output 1 broadcast packets, 601492 multicast packets TenGigE0/0/2/1.100 is up, line protocol is up Interface state transitions: 1 Hardware is VLAN sub-interface(s), address is 10f3.114b.9779 Description: To 4_pa-leaf l2transport Layer 2 Transport Mode MTU 1518 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability Unknown, txload Unknown, rxload Unknown Encapsulation 802.1Q Virtual LAN, Outer Match: Dot1Q VLAN 100 Ethertype Any, MAC Match src any, dest any loopback not set, Last link flapped 8w0d Last input 00:00:00, output 00:00:00 Last clearing of "show interface" counters never 23542879 packets input, 1738375683 bytes 148 input drops, 0 queue drops, 0 input errors 46414343 packets output, 9636017874 bytes 0 output drops, 0 queue drops, 
0 output errors TenGigE0/0/2/1.200 is up, line protocol is up Interface state transitions: 1 Hardware is VLAN sub-interface(s), address is 10f3.114b.9779 Description: To 4_pa-leaf l2transport Layer 2 Transport Mode MTU 1518 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability Unknown, txload Unknown, rxload Unknown Encapsulation 802.1Q Virtual LAN, Outer Match: Dot1Q VLAN 200 Ethertype Any, MAC Match src any, dest any loopback not set, Last link flapped 8w2d Last input never, output 1w1d Last clearing of "show interface" counters never 0 packets input, 0 bytes 0 input drops, 0 queue drops, 0 input errors 69 packets output, 3454 bytes 0 output drops, 0 queue drops, 0 output errors TenGigE0/0/2/2 is administratively down, line protocol is administratively down Interface state transitions: 0 Hardware is TenGigE, address is 10f3.114b.978e (bia 10f3.114b.978e) Layer 1 Transport Mode is LAN Internet address is Unknown MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 10000Mb/s, link type is force-up output flow control is off, input flow control is off Carrier delay (up) is 10 msec loopback not set, Last input never, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 0 packets input, 0 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 0 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets 0 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 0 carrier transitions TenGigE0/0/2/3 is administratively down, line protocol is administratively down Interface state transitions: 0 Hardware is TenGigE, address is 10f3.114b.978f 
(bia 10f3.114b.978f) Layer 1 Transport Mode is LAN Internet address is Unknown MTU 1514 bytes, BW 10000000 Kbit (Max: 10000000 Kbit) reliability 255/255, txload 0/255, rxload 0/255 Encapsulation ARPA, Full-duplex, 10000Mb/s, SR, link type is force-up output flow control is off, input flow control is off Carrier delay (up) is 10 msec loopback not set, Last input 8w2d, output never Last clearing of "show interface" counters never 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 51 packets input, 3372 bytes, 0 total input drops 0 drops for unrecognized upper-level protocol Received 0 broadcast packets, 51 multicast packets 0 runts, 0 giants, 0 throttles, 0 parity 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort 0 packets output, 0 bytes, 0 total output drops Output 0 broadcast packets, 0 multicast packets 0 output errors, 0 underruns, 0 applique, 0 resets 0 output buffer failures, 0 output buffers swapped out 0 carrier transitions
 
    import re
    import sys

    port = sys.argv[1]

    f = open('1.txt')

    # Find the paragraph that describes the requested port
    while True:
        data = ''
        # Paragraphs are assumed to be separated by blank lines
        for line in f:
            if line != '\n':
                data += line
            else:
                break
        if not data:
            print("No PORT")
            break

        # Compare the first word of the paragraph with the requested port
        try:
            PORT = re.match(r'\S+', data).group()
        except Exception:
            continue
        if port == PORT:
            # pattern = r"[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}"
            pattern = r"address is ((\d{1,3}\.){3}\d{1,3}/\d+|Unknown)"
            address = re.search(pattern, data).group(1)
            print(address)
            break

    f.close()
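II. Extending Matching with the flags Parameter
    1. Where it is used: the matching functions called through the re module, such as re.compile, re.findall, re.search ...
    2. Purpose: extends what a regular expression can match.
    3. Common flags
        A == ASCII       metacharacters such as \w match ASCII characters only
        I == IGNORECASE  ignores letter case when matching
        S == DOTALL      affects the . metacharacter so that it also matches \n
        M == MULTILINE   affects ^ and $ so that they match at the start and end of every line
        X == VERBOSE     allows # comments on each line of the regular expression
    4. Using several flags
        Method: join them with bitwise OR
        e.g. flags = re.I | re.A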
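The effect of individual and combined flags can be seen by running the same patterns with and without them. The target strings in this sketch are made up for illustration.

```python
import re

text = 'Hello\nhello world\nHELLO'

no_flags = re.findall(r'^hello', text)                         # case-sensitive, string start only
ignore_case = re.findall(r'^hello', text, flags=re.I)          # still only the string start
every_line = re.findall(r'^hello', text, flags=re.I | re.M)    # start of every line

one_line = re.findall(r'H.*O', text)                # . stops at a newline
across_lines = re.findall(r'H.*O', text, flags=re.S)  # . also crosses newlines

ascii_only = re.findall(r'\w+', 'Welcome to 北京', flags=re.A)

print(no_flags, ignore_case, every_line)
print(one_line, across_lines, ascii_only)
```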
      
 
    import re

    pattern = r"(\w+):(\d+)"
    s = "zhang:1994 li:1993"

    # The pattern has groups, so findall returns the group contents
    l = re.findall(pattern, s)
    print(l)

    # Split a string
    l = re.split(r'\s+', 'Hello world nihao chain')
    print(l)
    # Replace matched content, at most 2 replacements
    s = re.sub(r'垃圾', '**', '张三垃圾,垃圾,垃圾', count=2)
    print(s)
![registr_thumb[1] registr_thumb[1]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091028838-1035819420.png) 
    
    import re

    # regex = re.compile(r'\w+')
    # Match ASCII characters only
    # regex = re.compile(r'\w+', flags=re.A)

    # Ignore letter case
    # regex = re.compile(r'[a-z]+')
    # regex = re.compile(r'[a-z]+', re.I)

    # Affects the . metacharacter so that it also matches \n
    # regex = re.compile(r'.+', flags=re.S)

    # l = regex.findall('Welcome to 北京')

    # Match at the start of every line
    # regex = re.compile(r'^北京', flags=re.M)

    # Add comments to a regular expression
    pattern = r'''[A-Z][a-z]*  # match the first word
    \s+\w+\s+                  # match whitespace and the second word
    \w+                        # match the Chinese characters
    '''
    regex = re.compile(pattern, flags=re.X)
    s = '''Welcome to
    北京
    '''
    l = regex.findall(s)
    print(l)
![flag_thumb[1] flag_thumb[1]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091030751-1480962478.png) 





 
 ![2019-03-08_8-55-46_thumb[2] 2019-03-08_8-55-46_thumb[2]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091020573-2096523080.png)
![group_thumb[1] group_thumb[1]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091024987-47081331.png)
![2_thumb[1] 2_thumb[1]](https://img2018.cnblogs.com/blog/1448556/201903/1448556-20190308091027061-198712825.png)