  • Unicode issues in Python regex substitution

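    In [17]-[19], conclusion: to search a unicode string, use a unicode regex pattern. A unicode pattern has two requirements: first, the pattern itself is made up of unicode characters (for example \uXXXX escapes); second, it carries the 'u' prefix (Python 2).
    In [20]-[23], conclusion: to search a str (byte string), use a byte-string regex pattern with the 'r' prefix. If the 'u' prefix is used instead, it has no real effect, because the pattern contains no unicode characters, so it behaves the same as 'r'.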
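    The sessions below are Python 2. As a rough sketch of the same rule in Python 3 (an assumption, since the original only covers Python 2), the pattern type must match the subject type: a str pattern for str input, a bytes pattern for bytes input. The variable names here are illustrative only.

```python
import re

text = "abcd 32人喜欢 efgh"      # Python 3 str (Unicode text)
data = text.encode("utf-8")      # the same content as UTF-8 bytes

# Text pattern for text input: re resolves the \uXXXX escapes itself (Python 3.3+).
text_result = re.sub(r"(\d+)\u4eba\u559c\u6b22", r'<div class="like">\1</div>', text)

# Bytes pattern for bytes input: the \xNN escapes spell out the UTF-8 bytes.
bytes_result = re.sub(
    rb"(\d+)\xe4\xba\xba\xe5\x96\x9c\xe6\xac\xa2",
    rb'<div class="like">\1</div>',
    data,
)

# Mixing the two kinds raises TypeError in Python 3 instead of silently not matching.
try:
    re.sub(rb"(\d+)\xe4\xba\xba", rb"\1", text)
    mixed_raises = False
except TypeError:
    mixed_raises = True

print(text_result)
print(bytes_result.decode("utf-8"))
print(mixed_raises)
```

    In Python 2 the mismatched cases simply fail to match, which is why In [18]-[21] return the input unchanged.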

    In [17]: %paste
    import re
    html = u"abcd 32人喜欢 efgh"
    html = re.sub(ur'(\d+)\u4eba\u559c\u6b22', r'<div class="like">\1</div>', html)
    print html
    
    ## -- End pasted text --
    abcd <div class="like">32</div> efgh
    
    In [18]: %paste
    import re
    html = u"abcd 32人喜欢 efgh"
    html = re.sub(r'(\d+)\u4eba\u559c\u6b22', r'<div class="like">\1</div>', html)
    print html
    
    ## -- End pasted text --
    abcd 32人喜欢 efgh
    
    In [19]: %paste
    import re
    html = u"abcd 32人喜欢 efgh"
    html = re.sub(r'(\d+)\xe4\xba\xba\xe5\x96\x9c\xe6\xac\xa2', r'<div class="like">\1</div>', html)
    print html
    
    ## -- End pasted text --
    abcd 32人喜欢 efgh
    
    In [20]: %paste
    import re
    html = "abcd 32人喜欢 efgh"
    html = re.sub(r'(\d+)\u4eba\u559c\u6b22', r'<div class="like">\1</div>', html)
    print html
    
    ## -- End pasted text --
    abcd 32浜哄枩娆?efgh
    
    In [21]: %paste
    import re
    html = "abcd 32人喜欢 efgh"
    html = re.sub(ur'(\d+)\u4eba\u559c\u6b22', r'<div class="like">\1</div>', html)
    print html
    
    ## -- End pasted text --
    abcd 32浜哄枩娆?efgh
    
    In [22]: %paste
    import re
    html = "abcd 32人喜欢 efgh"
    html = re.sub(r'(\d+)\xe4\xba\xba\xe5\x96\x9c\xe6\xac\xa2', r'<div class="like">\1</div>', html)
    print html
    
    ## -- End pasted text --
    abcd <div class="like">32</div> efgh
    
    In [23]: %paste
    import re
    html = "abcd 32人喜欢 efgh"
    html = re.sub(ur'(\d+)\xe4\xba\xba\xe5\x96\x9c\xe6\xac\xa2', r'<div class="like">\1</div>', html)
    print html
    
    ## -- End pasted text --
    abcd <div class="like">32</div> efgh
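    One way to avoid keeping two patterns in step is to decode byte input to text first and apply a single unicode pattern. This is a minimal Python 3 sketch, not something the sessions above do: the helper name wrap_like_count, the LIKE_PATTERN constant, and the UTF-8 default are all assumptions made for illustration.

```python
import re

# Hypothetical module-level pattern: digits followed by the literal text "人喜欢".
LIKE_PATTERN = re.compile(r"(\d+)人喜欢")


def wrap_like_count(html, encoding="utf-8"):
    """Return html as text, with 'N人喜欢' replaced by a like-count div.

    Accepts either str or bytes; bytes are decoded with the given
    encoding (assumed UTF-8 here) before the substitution runs.
    """
    if isinstance(html, bytes):
        html = html.decode(encoding)
    return LIKE_PATTERN.sub(r'<div class="like">\1</div>', html)


print(wrap_like_count("abcd 32人喜欢 efgh"))
print(wrap_like_count("abcd 32人喜欢 efgh".encode("utf-8")))
```

    Both calls go through the same text pattern, so the byte-escape form used in In [22] is not needed once the input has been decoded.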
    
    This article was originally published at http://www.cnblogs.com/qijj; please keep this notice when reposting.
  • Original article: https://www.cnblogs.com/qijj/p/6383535.html