  • Installing and Using the BeautifulSoup Library on macOS

    Introduction to BeautifulSoup


    BeautifulSoup is a powerful third-party Python library that parses HTML and extracts information from it.

    Installing BeautifulSoup


    • Open a terminal and enter the command:
    pip3 install beautifulsoup4
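
    To confirm that the installation worked, one option is to import the package and print its version. This is a minimal sketch; it assumes that pip3 installed into the same Python 3 interpreter that runs the snippet. Note that the package is installed as beautifulsoup4 but imported as bs4.

```python
# Check that beautifulsoup4 is importable; the import name is `bs4`.
import bs4

version = bs4.__version__  # version string of the installed package
print("beautifulsoup4 version:", version)
```

    If the import fails with ModuleNotFoundError, the package most likely went into a different interpreter than the one being run.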
    

    A Quick Test of BeautifulSoup


    • View the source code of the demo page by fetching it with the requests library (the result is stored in the variable demo):
    >>> import requests
    >>> r = requests.get("http://python123.io/ws/demo.html")
    >>> r.text
    '<html><head><title>This is a python demo page</title></head>
    <body>
    <p class="title"><b>The demo python introduces several python courses.</b></p>
    <p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
    <a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>
    </body></html>'
    >>> demo = r.text
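
    In a script, the same fetch is usually wrapped with a timeout and a status check so that a slow or failing server does not go unnoticed. The sketch below is one way to do that; the helper name fetch_html and the 10-second default are assumptions for illustration and do not come from the original post.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the HTML text of `url` (hypothetical helper, not part of requests)."""
    r = requests.get(url, timeout=timeout)  # stop waiting after `timeout` seconds
    r.raise_for_status()                    # raise requests.HTTPError on 4xx/5xx responses
    return r.text


# Example call (requires network access):
# demo = fetch_html("http://python123.io/ws/demo.html")
```

    The call is left commented out so the snippet can be loaded without touching the network.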
    
    • Import the BeautifulSoup library
    >>> from bs4 import BeautifulSoup
    >>> 
    
    • Parse the HTML with BeautifulSoup
    >>> demo = r.text
    >>> soup = BeautifulSoup(demo,'html.parser')
    >>> print(soup.prettify())
    <html>
     <head>
      <title>
       This is a python demo page
      </title>
     </head>
     <body>
      <p class="title">
       <b>
        The demo python introduces several python courses.
       </b>
      </p>
    ... (remaining output omitted)
    >>> 
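
    Once the page is parsed, the resulting object can be queried for specific elements, which is the "extract information" part mentioned in the introduction. The following sketch pulls out the page title and every link. It embeds a copy of the demo page's HTML so that it runs without network access; the variable names page_title and links are illustrative only.

```python
from bs4 import BeautifulSoup

# Inline copy of the demo page so the example runs offline.
demo = (
    '<html><head><title>This is a python demo page</title></head>\n'
    '<body>\n'
    '<p class="title"><b>The demo python introduces several python courses.</b></p>\n'
    '<p class="course">Python is a wonderful general-purpose programming language. '
    'You can learn Python from novice to professional by tracking the following courses:\n'
    '<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and '
    '<a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\n'
    '</body></html>'
)

soup = BeautifulSoup(demo, 'html.parser')

# Text inside the <title> tag.
page_title = soup.title.string

# (link text, target URL) for every <a> tag, in document order.
links = [(a.get_text(), a.get('href')) for a in soup.find_all('a')]

print(page_title)
for text, href in links:
    print(text, '->', href)
```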
    

    How to Use the BeautifulSoup Library

    • Code skeleton:
    from bs4 import BeautifulSoup
    soup = BeautifulSoup('<p>data</p>','html.parser')
    
    • BeautifulSoup takes two arguments here:
      • The first is the HTML content to parse.
      • The second is the name of the parser to use.
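
    The two arguments can be illustrated with a short sketch. The first argument may be a string of markup or an open file-like object; here io.StringIO stands in for a real file, which is an assumption made to keep the example self-contained. The second argument, 'html.parser', selects the parser that ships with Python; other parser names such as 'lxml' or 'html5lib' are also accepted, but only after those packages have been installed separately.

```python
import io

from bs4 import BeautifulSoup

html = '<p>data</p>'

# First argument given as a string of markup.
soup_from_str = BeautifulSoup(html, 'html.parser')

# First argument given as a file-like object (io.StringIO simulates an open file).
soup_from_file = BeautifulSoup(io.StringIO(html), 'html.parser')

# Both objects expose the same parsed tree.
print(soup_from_str.p.string)
print(soup_from_file.p.string)
```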
  • Original article: https://www.cnblogs.com/031602523liu/p/9824907.html