  • Regex Notes

    }
    </script>
    <script language=javascript>
    ati('#', 'http://www.cnblogs.com/../UpLoadFile/Product/20101010162153846.jpg', '加厚青色围脖');
    To match the line break between the two lines above, use the following:
    <script language=javascript>\s\sati
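    Two \s are needed here, presumably because the line break is \r\n (two whitespace characters).
    Using \s*? instead gives a more general pattern that does not depend on the line-ending style.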
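The difference between a fixed \s\s and a variable-length whitespace run can be checked with a small sketch. The `html` string below is a hypothetical fragment modeled on the snippet above, and the variable names are illustrative only.

```python
import re

# Hypothetical page fragment with Windows-style line endings (\r\n).
html = "<script language=javascript>\r\nati('#', 'a.jpg', 'scarf');"
# The same fragment with Unix-style line endings (\n).
unix_html = html.replace("\r\n", "\n")

# Exactly two whitespace characters: fits \r\n but not a bare \n.
strict = re.compile(r"<script language=javascript>\s\sati")
# Any run of whitespace: fits \r\n, \n, and indented lines alike.
loose = re.compile(r"<script language=javascript>\s*ati")

print(bool(strict.search(html)))       # True
print(bool(strict.search(unix_html)))  # False
print(bool(loose.search(html)))        # True
print(bool(loose.search(unix_html)))   # True
```

Since the whitespace run is immediately followed by the literal `ati`, the greedy `\s*` and the lazy `\s*?` match the same text here.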
    <tr><td><a href='(?P<link>/Product/Detail_\d*.html)'[\s\S]*?><img src='(?P<img>[^']*)' width='130' height='130'
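    Relying too heavily on [\s\S]* causes excessive backtracking and can make the program hang. The pattern above is my improved version; the earlier one simply hung. I did not keep that original, which used [\s\S]*? everywhere. I recommend replacing it with a negated character class such as [^']*.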
    Use
    <div class="goodsItem">[\s\s]*?<a href="(?P<link>[^"]*?)" target="_blank"><img src="(?P<img>[^"]*?)"
    rather than
    <div class="goodsItem">[\s\s]*?<a href="(?P<link>[\s\S]*?)" target="_blank"><img src="(?P<img>[\s\S]*?)"
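As a minimal sketch of the recommended form, the snippet below runs the negated-character-class pattern over a hypothetical two-item page. The markup and paths are made up for illustration, and `\s*?` is written in place of `[\s\s]*?`, which is equivalent.

```python
import re

# Hypothetical markup modeled on the goodsItem pattern above.
html = (
    '<div class="goodsItem">\n'
    '  <a href="/goods/1.html" target="_blank"><img src="/img/1.jpg" /></a>\n'
    '</div>\n'
    '<div class="goodsItem">\n'
    '  <a href="/goods/2.html" target="_blank"><img src="/img/2.jpg" /></a>\n'
    '</div>\n'
)

# [^"]*? can never run past the closing quote of the attribute, so the
# engine has far fewer positions to retry than with [\s\S]*?.
pattern = re.compile(
    r'<div class="goodsItem">\s*?'
    r'<a href="(?P<link>[^"]*?)" target="_blank">'
    r'<img src="(?P<img>[^"]*?)"'
)

items = [(m.group("link"), m.group("img")) for m in pattern.finditer(html)]
print(items)
```

The negated class bounds each capture to a single attribute value, which is also what the capture is meant to contain, so the tighter pattern is both faster and more precise.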
    re.finditer(pattern, string[, flags])

    Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result unless they touch the beginning of another match.
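A short usage sketch of `re.finditer`, using a made-up input string: each yielded match object exposes the captured group and its position.

```python
import re

# Made-up input; the id values are illustrative only.
text = "id=2032 and id=17, then id=5"

ids = []
for m in re.finditer(r"id=(\d+)", text):
    # m.start() is the offset of the match, m.group(1) the captured digits.
    ids.append(m.group(1))

print(ids)  # ['2032', '17', '5']
```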

    7.2.6.9. Raw String Notation

    Raw string notation (r"text") keeps regular expressions sane. Without it, every backslash ('\') in a regular expression would have to be prefixed with another one to escape it. For example, the two following lines of code are functionally identical:

    >>> re.match(r"\W(.)\1\W", " ff ")
    <_sre.SRE_Match object at ...>
    >>> re.match("\\W(.)\\1\\W", " ff ")
    <_sre.SRE_Match object at ...>
    

    When one wants to match a literal backslash, it must be escaped in the regular expression. With raw string notation, this means r"\\". Without raw string notation, one must use "\\\\", making the following lines of code functionally identical:

    >>> re.match(r"\\", r"\\")
    <_sre.SRE_Match object at ...>
    >>> re.match("\\\\", r"\\")
    <_sre.SRE_Match object at ...>
    


    Update, 2010-10-15
    For markup such as
    <div class="listPic"><a href="/?mod=goods&amp;do=display&amp;id=2032&amp;sid=f11ee838a106889a37abf4e9227a03fe" target="_blank"><img src='/upload/photobase/2010-09/100924112121_s.jpg' border="0" title="新款 银色小雏菊三叶草满钻白色珍珠开口戒指" /></a>
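    we can use backreferences, as in the pattern below, so that the closing quote must be the same character as the opening one. Note that every parenthesized expression is a group: link and img are named groups, while ("|') is not explicitly named, but all of them are numbered starting from 1. In the pattern below the groups are therefore, from left to right: 1 = ("|'), 2 = link, 3 = ("|'), 4 = img. The parts in between take no number, which is why the second backreference is \3.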
    <div class="listPic"><a[\s\S]*?href=("|')(?P<link>[^"]*?)\1[\s\S]*?<img[\s\S]*?src=("|')(?P<img>[^"]*?)\3
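The group numbering can be verified with a small sketch. The `html` string is a shortened, hypothetical version of the sample markup above (the query string and attributes are trimmed); the pattern is the one from the text.

```python
import re

# Shortened, hypothetical version of the sample markup:
# href uses double quotes, src uses single quotes.
html = """<div class="listPic"><a href="/?mod=goods&amp;id=2032" target="_blank"><img src='/upload/photobase/2010-09/100924112121_s.jpg' border="0" /></a>"""

pattern = re.compile(
    r"""<div class="listPic"><a[\s\S]*?href=("|')(?P<link>[^"]*?)\1[\s\S]*?<img[\s\S]*?src=("|')(?P<img>[^"]*?)\3"""
)

m = pattern.search(html)
print(m.group(1), m.group(3))          # the quote characters: " and '
print(m.group("link"), m.group(2))     # named group link is also group 2
print(m.group("img"), m.group(4))      # named group img is also group 4
print(dict(pattern.groupindex))        # {'link': 2, 'img': 4}
```

`pattern.groupindex` maps each group name to its number, which confirms that named groups share the same numbering sequence as unnamed ones.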
  • Original post: https://www.cnblogs.com/lexus/p/1847797.html