  • beautifulSoup(1)

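    import re
    from bs4 import BeautifulSoup

    # Sample document; note that neither <p> tag is ever closed.
    doc = ['<html><head><title>Page title</title></head>',
           '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
           '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
           '</html>']
    # Name the parser explicitly so the result does not depend on what is installed.
    soup = BeautifulSoup(''.join(doc), 'html.parser')
    print(soup.prettify())
    # Navigate by tag name.
    title = soup.html.head.title
    print(title)
    print(title.string)
    # Calling soup(...) is shorthand for soup.find_all(...).
    print(len(soup('p')))
    print(soup.find_all('p', align='center'))   # all matches, as a list
    print(soup.find('p', align='center'))       # first match only
    print(soup('p', align='center')[0]['id'])
    # Attribute filters also accept a compiled regular expression.
    print(soup.find('p', align=re.compile('^b.*'))['id'])
    print(soup.find('p').b.string)
    print(soup('p')[1].b.string)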
    -----------------------------------------------------------------------------------

    <html>
     <head>
      <title>
       Page title
      </title>
     </head>
     <body>
      <p align="center" id="firstpara">
       This is paragraph
       <b>
        one
       </b>
       .
       <p align="blah" id="secondpara">
        This is paragraph
        <b>
         two
        </b>
        .
       </p>
      </p>
     </body>
    </html>
    <title>Page title</title>
    Page title
    2
    [<p align="center" id="firstpara">This is paragraph <b>one</b>.<p align="blah" id="secondpara">This is paragraph <b>two</b>.</p></p>]
    <p align="center" id="firstpara">This is paragraph <b>one</b>.<p align="blah" id="secondpara">This is paragraph <b>two</b>.</p></p>
    firstpara
    secondpara
    one
    two
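
    In the output above the second `<p>` ends up nested inside the first, because the sample markup never closes its `<p>` tags. The following is a minimal sketch, assuming the `html.parser` backend and a variant of the same document with explicit `</p>` and `</body>` tags (the `html_text` variable and the closing tags are additions for illustration, not part of the original listing). With the tags closed, the two paragraphs should come out as siblings under `<body>`:

```python
import re
from bs4 import BeautifulSoup

# Same document as above, but with each <p> explicitly closed. The closing
# tags are an assumption added for illustration; the original leaves them open.
html_text = (
    '<html><head><title>Page title</title></head>'
    '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.</p>'
    '<p id="secondpara" align="blah">This is paragraph <b>two</b>.</p>'
    '</body></html>'
)

soup = BeautifulSoup(html_text, 'html.parser')
paras = soup.find_all('p')

# Each paragraph's parent is now <body>, so neither is nested in the other.
parent_names = [p.parent.name for p in paras]
ids = [p['id'] for p in paras]

# get_text() joins the text of the paragraph and its <b> child.
first_text = paras[0].get_text()

# The regex filter on the align attribute still selects the second paragraph.
blah_id = soup.find('p', align=re.compile('^b'))['id']

print(parent_names, ids, first_text, blah_id)
```

    With this variant, `soup.find('p', align='center')` would return only the first paragraph rather than a paragraph that also contains the second one.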

  • Original article: https://www.cnblogs.com/lfqcode/p/6129556.html