
    Using the .children and .descendants attributes in BeautifulSoup

    Example site: http://www.pythonscraping.com/pages/page3.html

    The relevant part of the page source:

    <table id="giftList">
    <tr><th>
    Item Title
    </th><th>
    Description
    </th><th>
    Cost
    </th><th>
    Image
    </th></tr>
    
    <tr id="gift1" class="gift"><td>
    Vegetable Basket
    </td><td>
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    </td><td>
    $15.00
    </td><td>
    <img src="../img/gifts/img1.jpg">
    </td></tr>
    
    <tr id="gift2" class="gift"><td>
    Russian Nesting Dolls
    </td><td>
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"! <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    </td><td>
    $10,000.52
    </td><td>
    <img src="../img/gifts/img2.jpg">
    </td></tr>
    
    <tr id="gift3" class="gift"><td>
    Fish Painting
    </td><td>
    If something seems fishy about this painting, it's because it's a fish! <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    </td><td>
    $10,005.00
    </td><td>
    <img src="../img/gifts/img3.jpg">
    </td></tr>
    
    <tr id="gift4" class="gift"><td>
    Dead Parrot
    </td><td>
    This is an ex-parrot! <span class="excitingNote">Or maybe he's only resting?</span>
    </td><td>
    $0.50
    </td><td>
    <img src="../img/gifts/img4.jpg">
    </td></tr>
    
    <tr id="gift5" class="gift"><td>
    Mystery Box
    </td><td>
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining. <span class="excitingNote">Keep your friends guessing!</span>
    </td><td>
    $1.50
    </td><td>
    <img src="../img/gifts/img6.jpg">
    </td></tr>
    </table>
    

    1. Using the .children attribute

     

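    # -*- coding: utf-8 -*-
    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    # Fetch the example page and parse it (the "lxml" parser must be installed).
    html = urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html, "lxml")

    # .children yields only the direct children of the table.
    for child in bsObj.find("table", {"id": "giftList"}).children:
        print(child)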
    

     

    Output:

    <tr><th>
    Item Title
    </th><th>
    Description
    </th><th>
    Cost
    </th><th>
    Image
    </th></tr>
    
    
    <tr class="gift" id="gift1"><td>
    Vegetable Basket
    </td><td>
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    </td><td>
    $15.00
    </td><td>
    <img src="../img/gifts/img1.jpg"/>
    </td></tr>
    
    
    <tr class="gift" id="gift2"><td>
    Russian Nesting Dolls
    </td><td>
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"! <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    </td><td>
    $10,000.52
    </td><td>
    <img src="../img/gifts/img2.jpg"/>
    </td></tr>
    
    
    <tr class="gift" id="gift3"><td>
    Fish Painting
    </td><td>
    If something seems fishy about this painting, it's because it's a fish! <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    </td><td>
    $10,005.00
    </td><td>
    <img src="../img/gifts/img3.jpg"/>
    </td></tr>
    
    
    <tr class="gift" id="gift4"><td>
    Dead Parrot
    </td><td>
    This is an ex-parrot! <span class="excitingNote">Or maybe he's only resting?</span>
    </td><td>
    $0.50
    </td><td>
    <img src="../img/gifts/img4.jpg"/>
    </td></tr>
    
    
    <tr class="gift" id="gift5"><td>
    Mystery Box
    </td><td>
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining. <span class="excitingNote">Keep your friends guessing!</span>
    </td><td>
    $1.50
    </td><td>
    <img src="../img/gifts/img6.jpg"/>
    </td></tr>
    

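    Going by the literal meaning of the term:

    .children is an attribute, not a method, so it is written without parentheses. It refers to the nodes exactly one level below the parent, that is, its direct children. In this program the children of the table are the tr tags, plus the whitespace text nodes between them, which show up as the blank lines in the output.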
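
The point about whitespace text nodes can be checked offline. The following is a minimal sketch, assuming a trimmed-down stand-in for the giftList table (`SAMPLE_HTML` is made up for illustration) and the built-in "html.parser" instead of "lxml", so that nothing has to be downloaded or installed beyond bs4.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical, shortened stand-in for the giftList table on the example page.
SAMPLE_HTML = """<table id="giftList">
<tr><th>Item Title</th><th>Cost</th></tr>
<tr id="gift1" class="gift"><td>Vegetable Basket</td><td>$15.00</td></tr>
</table>"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"id": "giftList"})

# .children is an iterator over direct child nodes: tags and text nodes alike.
all_children = list(table.children)

# Keep only Tag objects to drop the newline text nodes between the rows.
row_tags = [child for child in all_children if isinstance(child, Tag)]

print(len(all_children))              # rows plus the newlines around them
print([tag.name for tag in row_tags])
```

Filtering with `isinstance(child, Tag)` is one way to skip the blank entries; it leaves the tree itself unchanged.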

    Experiment: does replacing .children with .tr give the same result?

    # -*- coding: utf-8 -*-
    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    html = urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html, "lxml")

    # .tr returns only the first <tr>; iterating over it yields that row's children.
    for child in bsObj.find("table", {"id": "giftList"}).tr:
        print(child)
    

    Output:

    <th>
    Item Title
    </th>
    <th>
    Description
    </th>
    <th>
    Cost
    </th>
    <th>
    Image
    </th>
    

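    Conclusion from this experiment: .children lists all direct children of the table, whereas naming a tag directly does not. table.tr returns only the first tr tag, and iterating over a tag walks its own children, so the loop prints the th cells of the header row instead of the rows of the table.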
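
The difference can be reproduced without network access. This is a small sketch under the same assumptions as before: `SAMPLE_HTML` is a made-up, shortened table and "html.parser" replaces "lxml". It also shows `find_all("tr", recursive=False)`, one option for when only the direct tr children are wanted.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened stand-in for the giftList table on the example page.
SAMPLE_HTML = """<table id="giftList">
<tr><th>Item Title</th><th>Cost</th></tr>
<tr id="gift1" class="gift"><td>Vegetable Basket</td><td>$15.00</td></tr>
<tr id="gift4" class="gift"><td>Dead Parrot</td><td>$0.50</td></tr>
</table>"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"id": "giftList"})

# table.tr is shorthand for table.find("tr"): a single Tag, the first match.
first_row = table.tr

# Iterating over a Tag walks its direct children, here the two header cells.
header_cells = [cell.get_text() for cell in first_row]

# To collect every direct <tr> child, search one level deep only.
rows = table.find_all("tr", recursive=False)

print(header_cells)
print([row.get("id") for row in rows])
```

Unlike `.children`, `find_all` returns only tags, so no whitespace entries need to be filtered out.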

    2. Using the .descendants attribute

    # -*- coding: utf-8 -*-
    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    html = urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html, "lxml")

    # .descendants yields every node below the table, at any depth.
    for child in bsObj.find("table", {"id": "giftList"}).descendants:
        print(child)
    

    Output:

    <tr><th>
    Item Title
    </th><th>
    Description
    </th><th>
    Cost
    </th><th>
    Image
    </th></tr>
    <th>
    Item Title
    </th>
    
    Item Title
    
    <th>
    Description
    </th>
    
    Description
    
    <th>
    Cost
    </th>
    
    Cost
    
    <th>
    Image
    </th>
    
    Image
    
    
    
    <tr class="gift" id="gift1"><td>
    Vegetable Basket
    </td><td>
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    </td><td>
    $15.00
    </td><td>
    <img src="../img/gifts/img1.jpg"/>
    </td></tr>
    <td>
    Vegetable Basket
    </td>
    
    Vegetable Basket
    
    <td>
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    </td>
    
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    Now with super-colorful bell peppers!
    
    
    <td>
    $15.00
    </td>
    
    $15.00
    
    <td>
    <img src="../img/gifts/img1.jpg"/>
    </td>
    
    
    <img src="../img/gifts/img1.jpg"/>
    
    
    
    
    <tr class="gift" id="gift2"><td>
    Russian Nesting Dolls
    </td><td>
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"! <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    </td><td>
    $10,000.52
    </td><td>
    <img src="../img/gifts/img2.jpg"/>
    </td></tr>
    <td>
    Russian Nesting Dolls
    </td>
    
    Russian Nesting Dolls
    
    <td>
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"! <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    </td>
    
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"! 
    <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    8 entire dolls per set! Octuple the presents!
    
    
    <td>
    $10,000.52
    </td>
    
    $10,000.52
    
    <td>
    <img src="../img/gifts/img2.jpg"/>
    </td>
    
    
    <img src="../img/gifts/img2.jpg"/>
    
    
    
    
    <tr class="gift" id="gift3"><td>
    Fish Painting
    </td><td>
    If something seems fishy about this painting, it's because it's a fish! <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    </td><td>
    $10,005.00
    </td><td>
    <img src="../img/gifts/img3.jpg"/>
    </td></tr>
    <td>
    Fish Painting
    </td>
    
    Fish Painting
    
    <td>
    If something seems fishy about this painting, it's because it's a fish! <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    </td>
    
    If something seems fishy about this painting, it's because it's a fish! 
    <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    Also hand-painted by trained monkeys!
    
    
    <td>
    $10,005.00
    </td>
    
    $10,005.00
    
    <td>
    <img src="../img/gifts/img3.jpg"/>
    </td>
    
    
    <img src="../img/gifts/img3.jpg"/>
    
    
    
    
    <tr class="gift" id="gift4"><td>
    Dead Parrot
    </td><td>
    This is an ex-parrot! <span class="excitingNote">Or maybe he's only resting?</span>
    </td><td>
    $0.50
    </td><td>
    <img src="../img/gifts/img4.jpg"/>
    </td></tr>
    <td>
    Dead Parrot
    </td>
    
    Dead Parrot
    
    <td>
    This is an ex-parrot! <span class="excitingNote">Or maybe he's only resting?</span>
    </td>
    
    This is an ex-parrot! 
    <span class="excitingNote">Or maybe he's only resting?</span>
    Or maybe he's only resting?
    
    
    <td>
    $0.50
    </td>
    
    $0.50
    
    <td>
    <img src="../img/gifts/img4.jpg"/>
    </td>
    
    
    <img src="../img/gifts/img4.jpg"/>
    
    
    
    
    <tr class="gift" id="gift5"><td>
    Mystery Box
    </td><td>
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining. <span class="excitingNote">Keep your friends guessing!</span>
    </td><td>
    $1.50
    </td><td>
    <img src="../img/gifts/img6.jpg"/>
    </td></tr>
    <td>
    Mystery Box
    </td>
    
    Mystery Box
    
    <td>
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining. <span class="excitingNote">Keep your friends guessing!</span>
    </td>
    
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining. 
    <span class="excitingNote">Keep your friends guessing!</span>
    Keep your friends guessing!
    
    
    <td>
    $1.50
    </td>
    
    $1.50
    
    <td>
    <img src="../img/gifts/img6.jpg"/>
    </td>
    
    
    <img src="../img/gifts/img6.jpg"/>
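
The output above repeats the same text several times because .descendants visits every node in document order: each tag is printed in full, then each of its own children is visited and printed again. The sketch below compares the two attributes on a made-up one-row table (`SAMPLE_HTML` is an illustration, not the real page), again with "html.parser" in place of "lxml".

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical one-row stand-in for the giftList table on the example page.
SAMPLE_HTML = """<table id="giftList">
<tr id="gift1" class="gift"><td>Vegetable Basket</td><td>Fresh! <span class="excitingNote">Now with bell peppers!</span></td></tr>
</table>"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"id": "giftList"})

# Direct children only: one level below the table.
child_tags = [node.name for node in table.children if isinstance(node, Tag)]

# All levels below the table, in document order.
descendant_tags = [node.name for node in table.descendants if isinstance(node, Tag)]

# Text nodes are descendants too; skip the whitespace-only ones.
texts = [
    str(node).strip()
    for node in table.descendants
    if isinstance(node, NavigableString) and node.strip()
]

print(child_tags)
print(descendant_tags)
print(texts)
```

Here .children reaches only the tr, while .descendants also reaches the td and span tags nested inside it, along with their text.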
    
  • Original post: https://www.cnblogs.com/chensimin1990/p/6725803.html