  • Python PyQuery: Basic Usage

    Importing PyQuery

    from pyquery import PyQuery as pq

    Basic CSS selectors

    from pyquery import PyQuery as pq
    html = '''
        <div id="wrap">
            <ul class="s_from">
                asdasd
                <link href="http://asda.com">asdadasdad12312</link>
                <link href="http://asda1.com">asdadasdad12312</link>
                <link href="http://asda2.com">asdadasdad12312</link>
            </ul>
        </div>
    '''
    doc = pq(html)
    # Select every <link> nested inside .s_from, which is itself inside #wrap
    print(doc("#wrap .s_from link"))

    Output

    <link href="http://asda.com">asdadasdad12312</link>
                <link href="http://asda1.com">asdadasdad12312</link>
                <link href="http://asda2.com">asdadasdad12312</link>

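    # matches an element by id, . matches by class, and link matches <link> tags. A space between selectors means the element on the right is nested inside the one on the left.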

    Iterating over the results

    from pyquery import PyQuery as pq
    html = '''
        <div href="wrap">
            hello nihao
            <ul class="s_from">
                asdasd
                <link class='active1 a123' href="http://asda.com">asdadasdad12312</link>
                <link class='active2' href="http://asda1.com">asdadasdad12312</link>
                <link class='movie1' href="http://asda2.com">asdadasdad12312</link>
            </ul>
        </div>
    '''
    
    doc = pq(html)
    its=doc("link").items()
    for it in its:
        print(it)

    Output

    <link class="active1 a123" href="http://asda.com">asdadasdad12312</link>
                
    <link class="active2" href="http://asda1.com">asdadasdad12312</link>
                
    <link class="movie1" href="http://asda2.com">asdadasdad12312</link>

    Getting attribute values

    from pyquery import PyQuery as pq
    html = '''
        <div href="wrap">
            hello nihao
            <ul class="s_from">
                asdasd
                <link class='active1 a123' href="http://asda.com">asdadasdad12312</link>
                <link class='active2' href="http://asda1.com">asdadasdad12312</link>
                <link class='movie1' href="http://asda2.com">asdadasdad12312</link>
            </ul>
        </div>
    '''
    
    doc = pq(html)
    its=doc("link").items()
    for it in its:
        print(it.attr('href'))
        print(it.attr.href)

    Output

    http://asda.com
    http://asda.com
    http://asda1.com
    http://asda1.com
    http://asda2.com
    http://asda2.com

    Getting text

    from pyquery import PyQuery as pq
    html = '''
        <div href="wrap">
            hello nihao
            <ul class="s_from">
                asdasd
                <link class='active1 a123' href="http://asda.com">asdadasdad12312</link>
                <link class='active2' href="http://asda1.com">asdadasdad12312</link>
                <link class='movie1' href="http://asda2.com">asdadasdad12312</link>
            </ul>
        </div>
    '''
    
    doc = pq(html)
    its=doc("link").items()
    for it in its:
        print(it.text())

    Output

    asdadasdad12312
    asdadasdad12312
    asdadasdad12312
  • Original article: https://www.cnblogs.com/xlsxls/p/9724715.html