  • BeautifulSoup practice

    html1="""
    <!DOCTYPE html>
    <html lang="en" xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta charset="utf-8" />
    <title>我的第一个网页</title>
    <meta name="generator" content="EverEdit" />
    <meta name="author" content="" />
    <meta name="keywords" content="" />
    <meta name="description" content="" />
    </head>
    <body>
    <div class="rows">
    <a href="http://www.baidu.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color1">
    <span class="vfsd_a_title">百度</span>
    </div>
    </a>
    <a href="http://www.google.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color3">
    <span class="vfsd_a_title">Google</span>
    </div>
    </a>
    <a href="http://www.oschina.net/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">Stack Overflow</span>
    </div>
    </a>
    </div>
    <p class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">你好
    <span class="vfsd_a_title">CSDN</span>
    </p>
    <p class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">FaceBook</span>
    </p>
    <p class="nmn" id="nmn1">
    <span class="vfsd_a_title">开源中国</span>
    </p>
    </body>
    </html>
    """

    from bs4 import BeautifulSoup

    # Parse the page with the lxml parser (requires the lxml package).
    soup = BeautifulSoup(html1, 'lxml')

    print(soup.title)

    #################### Output:

    <title>我的第一个网页</title>

    print(soup.title.string)

    #################### Output:

    我的第一个网页
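
`.string` only returns text when a tag has exactly one child. For a tag with mixed content, such as the first `<p>` in the sample page, it returns `None`, and `get_text()` is the usual fallback. A minimal sketch of the difference, assuming a trimmed-down copy of the sample markup (the `snippet` variable is illustrative, not part of the original) and the built-in `html.parser` so it runs without lxml:

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the sample page; `snippet` is an illustrative name.
snippet = """
<title>我的第一个网页</title>
<p class="vfsd-div">你好
<span class="vfsd_a_title">CSDN</span>
</p>
"""

# Assumption: html.parser (ships with Python) instead of the article's 'lxml'.
soup = BeautifulSoup(snippet, "html.parser")

title_text = soup.title.string         # one text child, so the text comes back
p_string = soup.p.string               # text + <span> + text, so this is None
p_text = soup.p.get_text(strip=True)   # joins the stripped text of all descendants

print(title_text)
print(p_string)
print(p_text)
```

Checking for `None` before using `.string` avoids surprises on tags that contain nested markup.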

    print(soup.head)

    #################### Output:

    <head>
    <meta charset="utf-8"/>
    <title>我的第一个网页</title>
    <meta content="EverEdit" name="generator"/>
    <meta content="" name="author"/>
    <meta content="" name="keywords"/>
    <meta content="" name="description"/>
    </head>
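
Printing `soup.head` shows the whole element; to get at individual values, each tag's attributes can be read like a dictionary. The sketch below assumes a shortened copy of the `<head>` above (`head_html` and `meta_info` are illustrative names) and uses the built-in `html.parser` rather than lxml:

```python
from bs4 import BeautifulSoup

# Only part of the <head> from the sample page; `head_html` is an illustrative name.
head_html = """
<head>
<meta charset="utf-8" />
<title>我的第一个网页</title>
<meta name="generator" content="EverEdit" />
<meta name="author" content="" />
</head>
"""

# Assumption: html.parser (ships with Python) instead of the article's 'lxml'.
soup = BeautifulSoup(head_html, "html.parser")

# Collect name -> content for every <meta> tag that carries a name attribute.
meta_info = {}
for meta in soup.head.find_all("meta"):
    name = meta.get("name")            # .get() returns None when the attribute is absent
    if name is not None:
        meta_info[name] = meta.get("content", "")

print(meta_info)
print(soup.head.meta["charset"])       # subscript access raises KeyError when absent
```

`.get()` is the safer choice when an attribute may be missing, as with the `charset` tag that has no `name`.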

    # .children also yields the '\n' text nodes that sit between the <a> tags.
    for i, child in enumerate(soup.div.children):
        print(i, child)

    #################### Output:

    0 

    1 <a href="http://www.baidu.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color1">
    <span class="vfsd_a_title">百度</span>
    </div>
    </a>
    2 

    3 <a href="http://www.google.com/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color3">
    <span class="vfsd_a_title">Google</span>
    </div>
    </a>
    4 

    5 <a href="http://www.oschina.net/" target="_blank">
    <div class="col-xs-12 col-sm-6 col-md-4 col-lg-2 vfsd-div vfsd-div-color2">
    <span class="vfsd_a_title">Stack Overflow</span>
    </div>
    </a>
    6 

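
The line breaks between the `<a>` tags appear among the children as `'\n'` text nodes. When only the tags matter, they can be filtered out with an `isinstance` check against `Tag`. A minimal sketch, assuming a shortened copy of the link list above (`rows_html` and `links` are illustrative names) and the built-in `html.parser` rather than lxml:

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Shortened copy of the link list from the sample page; `rows_html` is illustrative.
rows_html = """
<div class="rows">
<a href="http://www.baidu.com/" target="_blank">
<div class="vfsd-div"><span class="vfsd_a_title">百度</span></div>
</a>
<a href="http://www.google.com/" target="_blank">
<div class="vfsd-div"><span class="vfsd_a_title">Google</span></div>
</a>
</div>
"""

# Assumption: html.parser (ships with Python) instead of the article's 'lxml'.
soup = BeautifulSoup(rows_html, "html.parser")

links = []
for child in soup.div.children:
    # Skip the '\n' NavigableString nodes between the <a> tags.
    if not isinstance(child, Tag):
        continue
    label = child.find("span", class_="vfsd_a_title").get_text(strip=True)
    links.append((label, child["href"]))

for i, (label, href) in enumerate(links):
    print(i, label, href)
```

When the position in the tree does not matter, `soup.find_all("a")` returns the same tags directly and never includes the whitespace nodes.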

  • Original article: https://www.cnblogs.com/herd/p/9570983.html