  • BeautifulSoup practice

    html1="""
    <!DOCTYPE html>
    <html lang="en" xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta charset="utf-8" />
    <title>我的第一个网页</title>
    <meta name="generator" content="EverEdit" />
    <meta name="author" content="" />
    <meta name="keywords" content="" />
    <meta name="description" content="" />
    </head>
    <body>
    <div class="rows">
    <a href="http://www.baidu.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color1">
    <span class="vfsd_a_title">百度</span>
    </div>
    </a>
    <a href="http://www.google.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color3">
    <span class="vfsd_a_title">Google</span>
    </div>
    </a>
    <a href="http://www.oschina.net/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">Stack Overflow</span>
    </div>
    </a>
    </div>
    <p class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">你好
    <span class="vfsd_a_title">CSDN</span>
    </p>
    <p class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">FaceBook</span>
    </p>
    <p class="nmn" id="nmn1">
    <span class="vfsd_a_title">开源中国</span>
    </p>
    </body>
    </html>
    """

    from bs4 import BeautifulSoup
    soup = BeautifulSoup(html1,'lxml')

    print(soup.title)

    #################### Output:

    <title>我的第一个网页</title>

    print(soup.title.string)

    #################### Output:

    我的第一个网页
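
A note on `.string`: it only returns text when a tag has exactly one child. A minimal sketch, assuming a shortened copy of the first `<p>` from the page above (the `snippet` variable and the `intro` class are made up for this example, and `html.parser` is used here only so the example runs without lxml installed):

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened copy of the first <p> in html1.
snippet = """
<p class="intro">你好
<span class="vfsd_a_title">CSDN</span>
</p>
"""

soup = BeautifulSoup(snippet, "html.parser")
p = soup.p

# <p> holds a text node plus a <span>, so .string cannot pick one and gives None.
print(p.string)

# <span> has a single text child, so .string works here.
print(p.span.string)

# get_text() collects all nested text; strip=True trims each piece.
print(p.get_text(" ", strip=True))
```

So `.string` is fine for `soup.title`, but for tags with mixed content `get_text()` is usually the safer choice.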

    print(soup.head)

    #################### Output:

    <head>
    <meta charset="utf-8"/>
    <title>我的第一个网页</title>
    <meta content="EverEdit" name="generator"/>
    <meta content="" name="author"/>
    <meta content="" name="keywords"/>
    <meta content="" name="description"/>
    </head>

    for i,child in enumerate(soup.div.children):
      print(i,child)

    #################### Output:

    0 

    1 <a href="http://www.baidu.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color1">
    <span class="vfsd_a_title">百度</span>
    </div>
    </a>
    2 

    3 <a href="http://www.google.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color3">
    <span class="vfsd_a_title">Google</span>
    </div>
    </a>
    4 

    5 <a href="http://www.oschina.net/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">Stack Overflow</span>
    </div>
    </a>
    6 

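
The children of `soup.div` include the whitespace-only strings between the tags, which is usually not what you want. A minimal sketch of filtering them out and pulling the link data, assuming a shortened copy of the `rows` div (the `snippet`, `tags`, and `links` names are made up for this example, and `html.parser` is used only so it runs without lxml installed):

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical, shortened copy of the "rows" div from html1.
snippet = """
<div class="rows">
<a href="http://www.baidu.com/" target="_blank"><span class="vfsd_a_title">百度</span></a>
<a href="http://www.google.com/" target="_blank"><span class="vfsd_a_title">Google</span></a>
</div>
"""

soup = BeautifulSoup(snippet, "html.parser")

# Keep only real tags; the "\n" strings between them are NavigableString objects.
tags = [child for child in soup.div.children if isinstance(child, Tag)]

# Collect (href, visible text) for each remaining <a> tag.
links = [(a["href"], a.get_text(strip=True)) for a in tags]

for href, text in links:
    print(href, text)
```

If only the `<a>` tags matter, `soup.div.find_all("a")` would likely be shorter; the `isinstance` filter is shown because it works on the same `.children` iterator used above.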

  • Original article: https://www.cnblogs.com/herd/p/9570983.html