  • Python XPath Basics 01

    from lxml import etree
    
    text = '''
    <div>
        <ul>
             <li class="item-0"><a href="link1.html">first item</a></li>
             <li class="item-1"><a href="link2.html">second item</a></li>
             <li class="item-inactive"><a href="link3.html">third item</a></li>
             <li class="item-1"><a href="link4.html">fourth item</a></li>
             <li class="item-0"><a href="link5.html">fifth item</a>
         </ul>
     </div>
    '''
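    # Parse the HTML text into an element tree that can be queried with XPath;
    # lxml automatically repairs malformed markup (here, the unclosed last <li>).
    html = etree.HTML(text)
    # Serialize the repaired tree; the result is a bytes object.
    result = etree.tostring(html)
    # Decode the bytes as UTF-8 and print the repaired markup.
    print(result.decode('utf-8'))
    # The repaired markup printed above looks like this: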
    test_data='''<html><body><div>
        <ul>
             <li class="item-0"><a href="link1.html">first item</a></li>
             <li class="item-1"><a href="link2.html">second item</a></li>
             <li class="item-inactive"><a href="link3.html">third item</a></li>
             <li class="item-1"><a href="link4.html">fourth item</a></li>
             <li class="item-0"><a href="link5.html">fifth item</a>
         </li></ul>
     </div>
    </body></html>'''
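
The snippet above only parses and repairs the markup; it does not run an XPath query yet. As a minimal sketch of the next step, assuming the same `text` sample, the tree returned by `etree.HTML` can be queried with its `xpath()` method. The variable names `link_texts`, `hrefs`, `item1_links`, and `li_count` are illustrative and do not come from the original post.

```python
from lxml import etree

# Same sample markup as above; the last <li> is intentionally left unclosed.
text = '''
<div>
    <ul>
         <li class="item-0"><a href="link1.html">first item</a></li>
         <li class="item-1"><a href="link2.html">second item</a></li>
         <li class="item-inactive"><a href="link3.html">third item</a></li>
         <li class="item-1"><a href="link4.html">fourth item</a></li>
         <li class="item-0"><a href="link5.html">fifth item</a>
     </ul>
 </div>
'''

html = etree.HTML(text)

# //li/a/text() selects the text node of every <a> that is a child of an <li>.
link_texts = html.xpath('//li/a/text()')

# @href selects the attribute value instead of the element itself.
hrefs = html.xpath('//li/a/@href')

# A predicate in square brackets filters elements by attribute value.
item1_links = html.xpath('//li[@class="item-1"]/a/@href')

# Selecting elements returns a list of element objects, so len() counts them.
li_count = len(html.xpath('//li'))

print(link_texts)
print(hrefs)
print(item1_links)
print(li_count)
```

Each `xpath()` call returns a plain Python list, so the results can be indexed or looped over directly. Because the parser has already closed the last `<li>`, the fifth item is matched like the others.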
    

      

  • Original article: https://www.cnblogs.com/liangliangzz/p/10175622.html