  • BeautifulSoup (1)

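    import re
    from bs4 import BeautifulSoup

    # Sample document; note that the <p> and <body> tags are never closed.
    doc = ['<html><head><title>Page title</title></head>',
           '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
           '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
           '</html>']
    # Name the parser explicitly; omitting it triggers a "no parser specified" warning.
    soup = BeautifulSoup(''.join(doc), 'html.parser')
    print(soup.prettify())

    # Navigate the tree by tag name.
    title = soup.html.head.title
    print(title)
    print(title.string)

    # Calling soup(...) is shorthand for soup.find_all(...).
    print(len(soup('p')))
    print(soup.find_all('p', align='center'))  # list of every match
    print(soup.find('p', align='center'))      # first match only
    print(soup('p', align='center')[0]['id'])
    # Attribute filters also accept a compiled regular expression.
    print(soup.find('p', align=re.compile('^b.*'))['id'])
    print(soup.find('p').b.string)
    print(soup('p')[1].b.string)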
    -----------------------------------------------------------------------------------

    <html>
     <head>
      <title>
       Page title
      </title>
     </head>
     <body>
      <p align="center" id="firstpara">
       This is paragraph
       <b>
        one
       </b>
       .
       <p align="blah" id="secondpara">
        This is paragraph
        <b>
         two
        </b>
        .
       </p>
      </p>
     </body>
    </html>
    <title>Page title</title>
    Page title
    2
    [<p align="center" id="firstpara">This is paragraph <b>one</b>.<p align="blah" id="secondpara">This is paragraph <b>two</b>.</p></p>]
    <p align="center" id="firstpara">This is paragraph <b>one</b>.<p align="blah" id="secondpara">This is paragraph <b>two</b>.</p></p>
    firstpara
    secondpara
    one
    two
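In the output above, the second paragraph ends up nested inside the first. A likely reason is that the sample markup never closes its `<p>` tags, so the parser is never told that the first paragraph has ended. The following is a minimal sketch, using only the standard library's `html.parser.HTMLParser` (the class Beautiful Soup's `html.parser` backend builds on), that lists the tag events for the same markup. The class name `TagLogger` and the variables `p_starts` and `p_ends` are hypothetical names introduced for this illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records every start and end tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


# Same markup as in the post above.
doc = ['<html><head><title>Page title</title></head>',
       '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
       '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
       '</html>']

logger = TagLogger()
logger.feed(''.join(doc))
logger.close()

# Count how many times a <p> is opened and how many times one is closed.
p_starts = logger.events.count(("start", "p"))
p_ends = logger.events.count(("end", "p"))
print(p_starts, p_ends)
```

Two `<p>` start events and no `</p>` end events are reported, which is consistent with the nested tree shown in the output. Adding explicit `</p>` tags to the sample markup should make the two paragraphs siblings instead.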

  • Original source: https://www.cnblogs.com/lfqcode/p/6129556.html