  • Installing and Using the BeautifulSoup Library on macOS

    Introduction to BeautifulSoup


    BeautifulSoup is a powerful third-party Python library that parses HTML and extracts information from it.

    Installing BeautifulSoup


    • Open a terminal and run:
    pip3 install beautifulsoup4
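
To confirm that the installation worked, a quick check along these lines can help. It is only a sketch: it assumes the package was installed for the same Python 3 interpreter that runs the script. Note that the package is installed as `beautifulsoup4` but imported as `bs4`.

```python
# Minimal post-install check (assumes bs4 was installed for this interpreter).
import bs4
from bs4 import BeautifulSoup

# The bs4 package exposes its version string as bs4.__version__.
version = bs4.__version__
print("beautifulsoup4 version:", version)

# Smoke test: parse a trivial document with Python's built-in parser.
soup = BeautifulSoup("<p>data</p>", "html.parser")
text = soup.p.string
print(text)
```

If the import fails with `ModuleNotFoundError`, `pip3` probably installed the package for a different interpreter than the one running the script.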
    

    A Quick Test of BeautifulSoup


    • View the source code of the demo page:

    • Fetch the source with the requests library (stored in the variable demo):
    >>> import requests
    >>> r = requests.get("http://python123.io/ws/demo.html")
    >>> print(r.text)
    <html><head><title>This is a python demo page</title></head>
    <body>
    <p class="title"><b>The demo python introduces several python courses.</b></p>
    <p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
    <a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>
    </body></html>
    >>> demo = r.text
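
The interactive session above assumes the request succeeds. In a script, a slightly more defensive version might look like the sketch below. The helper name `fetch_html` and the 10-second timeout are illustrative choices, not part of the original article.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text; raise requests.RequestException on failure.

    fetch_html is a hypothetical helper name used only for this sketch.
    """
    r = requests.get(url, timeout=timeout)
    r.raise_for_status()              # raise an exception on 4xx/5xx responses
    r.encoding = r.apparent_encoding  # guess the text encoding from the content
    return r.text


# Example call (needs network access), using the demo URL from this article:
# demo = fetch_html("http://python123.io/ws/demo.html")
```

Setting `r.encoding` from `r.apparent_encoding` is optional. It mainly helps with pages whose headers do not declare a character set.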
    
    • Import the BeautifulSoup library:
    >>> from bs4 import BeautifulSoup
    >>> 
    
    • Parse the HTML with BeautifulSoup:
    >>> demo = r.text
    >>> soup = BeautifulSoup(demo, 'html.parser')
    >>> # prettify is a method: print(soup.prettify) only shows "<bound method Tag.prettify of ...>".
    >>> # Call soup.prettify() for an indented version; print(soup) shows the parsed document as is.
    >>> print(soup)
    <html><head><title>This is a python demo page</title></head>
    <body>
    <p class="title"><b>The demo python introduces several python courses.</b></p>
    <p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
    <a class="py1" href="http://www.icourse163.org/course/BIT-268001" id="link1">Basic Python</a> and <a class="py2" href="http://www.icourse163.org/course/BIT-1001870001" id="link2">Advanced Python</a>.</p>
    </body></html>
    >>> 
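
The difference between referencing `soup.prettify` and calling `soup.prettify()` is easy to miss, so here is a small offline sketch of it. The markup is a shortened stand-in for the demo page, chosen so the example runs without network access.

```python
from bs4 import BeautifulSoup

# A shortened stand-in for the demo page, so the example runs offline.
html = '<p class="title"><b>The demo python introduces several python courses.</b></p>'
soup = BeautifulSoup(html, "html.parser")

method = soup.prettify    # no parentheses: just a reference to the method
pretty = soup.prettify()  # with parentheses: returns the indented markup as a str

print(callable(method))   # the bare attribute is a callable, not the markup
print(pretty)             # one tag or text node per line, indented by nesting depth
```

Printing `method` directly would show a `<bound method ...>` representation rather than formatted HTML.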
    

    How to Use BeautifulSoup

    • Basic code skeleton:
    from bs4 import BeautifulSoup
    soup = BeautifulSoup('<p>data</p>','html.parser')
    
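    • The two arguments to BeautifulSoup:
      • The first is the HTML markup to parse.
      • The second is the name of the parser to use ('html.parser' is the parser built into Python's standard library).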
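
Once the soup object exists, information can be pulled out of it. The sketch below applies the two-argument constructor to the demo markup shown earlier (pasted inline so it runs offline) and extracts the page title and the two course links. The variable names `title` and `links` are illustrative.

```python
from bs4 import BeautifulSoup

# Demo markup copied from the page shown earlier in this article.
demo = """<html><head><title>This is a python demo page</title></head>
<body>
<p class="title"><b>The demo python introduces several python courses.</b></p>
<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>
</body></html>"""

# Argument 1: the markup to parse. Argument 2: the parser name.
soup = BeautifulSoup(demo, "html.parser")

# Text of the <title> tag.
title = soup.title.string

# (link text, href) for every <a> tag, in document order.
links = [(a.get_text(), a.get("href")) for a in soup.find_all("a")]

print(title)
for text, href in links:
    print(text, "->", href)
```

Other parser names such as `'lxml'` or `'html5lib'` can be passed as the second argument, but those parsers must be installed separately.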
  • Original source: https://www.cnblogs.com/031602523liu/p/9824907.html