  • BeautifulSoup Practice

    html1="""
    <!DOCTYPE html>
    <html lang="en" xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta charset="utf-8" />
    <title>我的第一个网页</title>
    <meta name="generator" content="EverEdit" />
    <meta name="author" content="" />
    <meta name="keywords" content="" />
    <meta name="description" content="" />
    </head>
    <body>
    <div class="rows">
    <a href="http://www.baidu.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color1">
    <span class="vfsd_a_title">百度</span>
    </div>
    </a>
    <a href="http://www.google.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color3">
    <span class="vfsd_a_title">Google</span>
    </div>
    </a>
    <a href="http://www.oschina.net/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">Stack Overflow</span>
    </div>
    </a>
    </div>
    <p class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">你好
    <span class="vfsd_a_title">CSDN</span>
    </p>
    <p class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">FaceBook</span>
    </p>
    <p class="nmn" id="nmn1">
    <span class="vfsd_a_title">开源中国</span>
    </p>
    </body>
    </html>
    """

    from bs4 import BeautifulSoup

    # Parse the sample markup with the lxml parser (the lxml package must be installed)
    soup = BeautifulSoup(html1, 'lxml')

    print(soup.title)

    #################### Output:

    <title>我的第一个网页</title>

    print(soup.title.string)

    #################### Output:

    我的第一个网页
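
`.string` returns text only when a tag has a single child; for a tag that mixes text with nested tags it returns `None`, and `get_text()` is the usual alternative. A minimal sketch of that difference, assuming a trimmed copy of the sample markup (the `snippet` variable and the shortened `class` value are made up for illustration) and the built-in `html.parser` instead of `lxml`, so it runs without extra dependencies:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample markup; `snippet` is a hypothetical name for this sketch.
snippet = """
<html><head><title>我的第一个网页</title></head>
<body>
<p class="vfsd-div">你好
<span class="vfsd_a_title">CSDN</span>
</p>
</body></html>
"""

# html.parser ships with Python; the post itself uses 'lxml'.
soup = BeautifulSoup(snippet, "html.parser")

# <title> has exactly one text child, so .string returns that text.
title_text = soup.title.string

# The <p> holds text, a <span> and trailing whitespace, so .string is None.
p_string = soup.p.string

# get_text() joins all nested text; strip=True trims each piece and drops empty ones.
p_text = soup.p.get_text(" ", strip=True)

print(title_text)
print(p_string)
print(p_text)
```

When a tag may contain nested markup, `get_text()` is the safer choice, since it never returns `None`.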

    print(soup.head)

    #################### Output:

    <head>
    <meta charset="utf-8"/>
    <title>我的第一个网页</title>
    <meta content="EverEdit" name="generator"/>
    <meta content="" name="author"/>
    <meta content="" name="keywords"/>
    <meta content="" name="description"/>
    </head>
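
The `<meta>` tags in the head can also be read one attribute at a time. The following is a rough sketch rather than part of the original exercise: `head_html` is a shortened, hypothetical copy of the head section, and `html.parser` stands in for `lxml`. Because `name` is already the first parameter of `find()`, an attribute that is itself called `name` has to be passed through `attrs`.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <head> section; `head_html` is a hypothetical name.
head_html = """
<head>
<meta charset="utf-8" />
<title>我的第一个网页</title>
<meta name="generator" content="EverEdit" />
<meta name="author" content="" />
</head>
"""

soup = BeautifulSoup(head_html, "html.parser")

# find() returns the first matching tag; attrs filters on attribute values.
generator = soup.find("meta", attrs={"name": "generator"})
generator_content = generator["content"]

# Tag.get() returns None instead of raising KeyError for a missing attribute.
first_meta = soup.find("meta")
first_meta_name = first_meta.get("name")

# attrs={"name": True} matches every <meta> that has a name attribute at all.
meta_by_name = {
    m["name"]: m["content"]
    for m in soup.find_all("meta", attrs={"name": True})
}

print(generator_content)
print(first_meta_name)
print(meta_by_name)
```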

    # .contents returns the direct children of the first <div> as a list
    print(soup.div.contents)

    #################### Output:

    ['\n', <a href="http://www.baidu.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color1">
    <span class="vfsd_a_title">百度</span>
    </div>
    </a>, '\n', <a href="http://www.google.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color3">
    <span class="vfsd_a_title">Google</span>
    </div>
    </a>, '\n', <a href="http://www.oschina.net/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">Stack Overflow</span>
    </div>
    </a>, '\n']
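
The direct children of the `<div>` include the whitespace-only strings that sit between the `<a>` tags, which is usually not what a scraper wants. One possible way to walk `.children` and keep only the tags is sketched below; it assumes a reduced copy of the markup (`rows_html` is a hypothetical name, and the long `class` lists are shortened) and uses `html.parser` rather than `lxml`.

```python
from bs4 import BeautifulSoup, Tag

# Reduced copy of the "rows" block; `rows_html` is a hypothetical name.
rows_html = """
<div class="rows">
<a href="http://www.baidu.com/" target="_blank">
<div class="vfsd-div"><span class="vfsd_a_title">百度</span></div>
</a>
<a href="http://www.google.com/" target="_blank">
<div class="vfsd-div"><span class="vfsd_a_title">Google</span></div>
</a>
</div>
"""

soup = BeautifulSoup(rows_html, "html.parser")

# .children iterates over direct children, whitespace strings included.
all_children = list(soup.div.children)

# Keep only Tag objects to drop the newline text nodes.
tag_children = [c for c in soup.div.children if isinstance(c, Tag)]

# Pair each link target with its visible label.
links = [(a["href"], a.get_text(strip=True)) for a in tag_children]

for i, (href, text) in enumerate(links):
    print(i, href, text)
```

If the position in the tree does not matter, `soup.find_all("a")` collects the same `<a>` tags without any filtering, because it only ever returns tags.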

  • Original article: https://www.cnblogs.com/herd/p/9570983.html