  • Installing Beautiful Soup

    Why use Beautiful Soup

    For the full details, see Cui Qingcai's tutorial: https://cuiqingcai.com/1319.html

    In short, Beautiful Soup is a Python library whose main job is extracting data from web pages. The official description reads as follows:

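    Crawlers often rely on regular expressions to match the parts of a page they need, but a pattern that is even slightly off can leave the program stuck in what looks like an endless loop. Beautiful Soup is a more powerful tool for this job: it makes it easy to extract the contents of HTML or XML tags.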

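    Beautiful Soup provides a few simple, Pythonic functions for navigating, searching, and modifying the parse tree. It is a toolkit that parses a document and hands you the data you want to extract; because it is simple, a complete application takes very little code.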
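
    As a rough illustration of the navigating and searching described above, here is a minimal sketch. The HTML snippet, the URLs, and the variable names are made up for this example, and it uses Python's built-in "html.parser" so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

# A small, made-up document used only for illustration.
html = """
<html><head><title>Demo page</title></head>
<body>
<p class="intro">Hello, <b>world</b>!</p>
<a href="https://example.com/a">First</a>
<a href="https://example.com/b">Second</a>
</body></html>
"""

# Build the parse tree with the parser that ships with Python.
soup = BeautifulSoup(html, "html.parser")

# Navigate: reach a tag by name and read its text.
title = soup.title.string

# Search: collect the href attribute of every <a> tag.
links = [a["href"] for a in soup.find_all("a")]

# Search by CSS class, then flatten nested tags into plain text.
intro_text = soup.find("p", class_="intro").get_text()

print(title)
print(links)
print(intro_text)
```

    The same three steps (build the tree, find the tags, read text or attributes) cover most simple scraping tasks.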

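    Beautiful Soup automatically converts incoming documents to Unicode and outgoing documents to UTF-8. You do not need to think about encodings unless the document does not declare one and Beautiful Soup cannot detect it; in that case you only need to specify the original encoding.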
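
    The encoding behaviour above might look like this in practice. This is a sketch under an assumption: the page bytes are GBK and we happen to know it, so the encoding is passed through the from_encoding argument. The sample text is invented for the example.

```python
from bs4 import BeautifulSoup

# Bytes as they might arrive from a server that does not declare an encoding.
# The content and the GBK encoding are assumptions made for this example.
raw = "<p>你好</p>".encode("gbk")

# from_encoding tells Beautiful Soup how to decode the incoming bytes.
soup = BeautifulSoup(raw, "html.parser", from_encoding="gbk")

# Inside the tree, text is ordinary Unicode (a Python str).
text = soup.p.get_text()

# On output, encode() serializes the document as UTF-8 bytes.
utf8_bytes = soup.encode("utf-8")

print(text)
print(utf8_bytes)
```

    When the document does declare its encoding, or detection succeeds, the from_encoding argument can simply be left out.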

    Beautiful Soup sits on top of popular Python parsers such as lxml and html5lib, so you can try out different parsing strategies or trade speed for flexibility.
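
    Since the parser is chosen by name when the soup is built, one possible way to use whichever parser happens to be installed is sketched below. The helper make_soup and its preference order are hypothetical and not part of Beautiful Soup; the sketch relies only on the fact that Beautiful Soup raises FeatureNotFound when the requested parser is missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred=("lxml", "html5lib", "html.parser")):
    """Hypothetical helper: build a soup with the first parser that is installed.

    Returns a (soup, parser_name) pair.
    """
    for name in preferred:
        try:
            return BeautifulSoup(markup, name), name
        except FeatureNotFound:
            # This parser is not installed; try the next one in the list.
            continue
    raise RuntimeError("none of the requested parsers is available")


soup, used_parser = make_soup("<p>hi</p>")
print(used_parser, soup.p.get_text())
```

    "html.parser" ships with Python, so it works as a last resort; lxml and html5lib have to be installed separately.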

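    During installation I ran into a problem similar to the one described in the linked tutorial, so I followed the author's fix.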

    First, run anaconda search -t conda beautifulsoup4 to list the available versions. My output looked like this:

    (wangli) D:\Anaconda3\envs\wangli> anaconda search -t conda beautifulsoup4
    Using Anaconda API: https://api.anaconda.org
    Packages:
         Name                      |  Version | Package Types   | Platforms       | Builds
         ------------------------- |   ------ | --------------- | --------------- | ----------
         IzODA/beautifulsoup4      |    4.6.0 | conda           | zos-z           | py37_0, py36_0
         NOAA-ORR-ERD/beautifulsoup4 |    4.3.2 | conda           | win-64, osx-64  | 0, py34_0, py33_0, py27_0
                                              : Screen-scraping library
         ODSP-TEST/beautifulsoup4  |    4.6.0 | conda           | zos-z           | py37_0, py36_0
         RahulJain/beautifulsoup4  |    4.4.1 | conda           | win-64          | py27_0
         Trentonoliphant/beautifulsoup4 |    4.3.2 | conda           | win-32, win-64  | py34_0, py33_0, py26_0, py27_0
                                              : http://www.crummy.com/software/BeautifulSoup/bs4/
         aarch64_gbox/beautifulsoup4 |    4.5.3 | conda           | linux-aarch64   | py36_0
         aetrial/beautifulsoup4    |          | conda           | linux-64, osx-64 | py35_0, py27_0
         akode/beautifulsoup4      |    4.3.2 | conda           | osx-64          | py27_0
                                              : Screen-scraping library
         alefnula/beautifulsoup4   |    4.1.3 | conda           | osx-64          | py34_0
                                              : UNKNOWN
         anaconda/beautifulsoup4   |    4.8.2 | conda           | linux-ppc64le, linux-64, win-32, osx-64, linux-32, win-64 | py37_0, py37_1, py36h6ea3382_0, py36_1, py36_0, py35hb75f182_1, py27h3f86ba9_1, py27_1, py27_0, py27hc287451_1, py27h9416283_1, py27hdc1f29e_0, py35h61fcdcc_1, py36h49b8c8c_1, py38_0, py36h4361f19_1, py27h8bb5803_1, py35h94b83b4_1, py35h50ea147_0, py34_0, py35_0, py36h72d3c9f_1, py36hd4cc5e8_1, py35h442a8c9_1, py35_1
                                              : Python library designed for screen-scraping
         anacondams/beautifulsoup4 |    4.5.1 | conda           | linux-64, win-64 | py35_0
                                              : Python library designed for screen-scraping
         archiarm/beautifulsoup4   |    4.7.0 | conda           | linux-aarch64   | py27_1000, py36_1000, py37_1000
                                              : Python library designed for screen-scraping
         asmeurer/beautifulsoup4   |    4.2.1 | conda           | osx-64          | py26_1, py33_0, py33_1, py27_1, py27_0
                                              : http://www.crummy.com/software/BeautifulSoup/bs4/
         auto/beautifulsoup4       |    4.3.2 | conda           | linux-64, linux-32, osx-64 | py27_0
                                              : Screen-scraping library
         c4aarch64/beautifulsoup4  |    4.6.3 | conda           | linux-aarch64   | py37_0
                                              : Python library designed for screen-scraping
         c4armv7l/beautifulsoup4   |    4.7.1 | conda           | linux-armv7l    | py37_1001
                                              : Python library designed for screen-scraping
         cdat-forge/beautifulsoup4 |    4.8.1 | conda           | linux-64, osx-64 | py27_0
                                              : Python library designed for screen-scraping
         conda-forge/beautifulsoup4 |    4.8.2 | conda           | linux-ppc64le, linux-64, win-32, linux-aarch64, osx-64, win-64 | py37_0, py36_1001, py36_1000, py37_1000, py37_1001, py27_0, py36_0, py38_0, py36h9f0ad1d_1, py37hc8dfbb8_1, py36hc560c46_1, py27_1001, py27_1000, py38h32f6830_1, py35_0
                                              : Python library designed for screen-scraping
         conner_org/beautifulsoup4 |    4.7.1 | conda           | linux-64        | py27_1
                                              : Python library designed for screen-scraping
         daf/beautifulsoup4        |    4.3.2 | conda           | linux-64        | py27_0
                                              : Screen-scraping library
         draikes/beautifulsoup4    |    4.4.1 | conda           | win-64          | py27_0
                                              : UNKNOWN
         ericmjl/beautifulsoup4    |    4.4.0 | conda           | linux-64, osx-64 | py34_0
                                              : Screen-scraping library
         free/beautifulsoup4       |    4.6.0 | conda           | linux-ppc64le, linux-64, win-32, osx-64, linux-32, win-64 | py36_0, py34_0, py35_0, py27_0
                                              : Python library designed for screen-scraping
         iilab/beautifulsoup4      |    4.3.2 | conda           | linux-64, osx-64 | py34_0
                                              : Screen-scraping library
         ijstokes/beautifulsoup4   |    4.3.2 | conda           | linux-64        | py27_0
         jetson-tx2/beautifulsoup4 |    4.6.0 | conda           | noarch          | py_0
                                              : Python library designed for screen-scraping
         jjhelmus/beautifulsoup4   |          | conda           | linux-aarch64   | py37_0
                                              : Python library designed for screen-scraping
         jmatsushita/beautifulsoup4 |    4.3.2 | conda           | linux-64        | py34_0
                                              : Screen-scraping library
         josh/beautifulsoup4       |    4.3.2 | conda           | win-64          | py34_0
                                              : Screen-scraping library
         main/beautifulsoup4       |    4.8.2 | conda           | linux-ppc64le, linux-64, win-32, osx-64, linux-32, win-64 | py37_0, py37_1, py36h6ea3382_0, py36_1, py27hdc1f29e_0, py35hb75f182_1, py27h3f86ba9_1, py27_1, py27_0, py27hc287451_1, py27h9416283_1, py36_0, py35h61fcdcc_1, py36h49b8c8c_1, py38_0, py36h4361f19_1, py27h8bb5803_1, py35h94b83b4_1, py35h50ea147_0, py35_0, py36h72d3c9f_1, py36hd4cc5e8_1, py35h442a8c9_1, py35_1
                                              : Python library designed for screen-scraping
         manmadescience/beautifulsoup4 |    4.3.2 | conda           | win-64          | py34_0
                                              : Screen-scraping library
         moghimis/beautifulsoup4   |    4.3.2 | conda           | linux-32        | py27_0
                                              : Beautiful Soup sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree
         moustik/beautifulsoup4    |    4.5.0 | conda           | linux-64        | py27_0
         ngould/beautifulsoup4     |    4.2.1 | conda           | osx-64          | py27_0
                                              : http://www.crummy.com/software/BeautifulSoup/bs4/
         pdrops/beautifulsoup4     |    4.3.2 | conda           | osx-64          | py27_0
                                              : Screen-scraping library
         prkrekel/beautifulsoup4   |    4.3.2 | conda           | win-64          | py27_0
                                              : Screen-scraping library
         prometeia/beautifulsoup4  |    4.8.2 | conda           | linux-ppc64le, linux-64, linux-aarch64, win-64, osx-64 | py37_0, py36_0, py38_0, py27_0
                                              : Python library designed for screen-scraping
         rmcgibbo/beautifulsoup4   |    4.3.2 | conda           | linux-64        | py27_0
                                              : Screen-scraping library
         rodgomesc/pip-beautifulsoup4 |    4.5.3 | conda           | noarch          | 0
                                              : Screen-scraping library Built for Android and iOS apps using enaml-native.
         rogerramos/beautifulsoup4 |    4.6.0 | conda           | linux-64        | py27_3
                                              : Screen-scraping library
         rpi/beautifulsoup4        |    4.6.3 | conda           | linux-armv6l, linux-armv7l, noarch | py27_1, py27_0, py36_1, py36_0, py_0, py35_0, py35_1
                                              : Python library designed for screen-scraping
         rpi64/beautifulsoup4      |    4.6.3 | conda           | linux-aarch64   | py36_0
                                              : Python library designed for screen-scraping
         rsmulktis/beautifulsoup4  |    4.5.3 | conda           | linux-armv7l    | py34_0
                                              : Screen-scraping library
         sayth/beautifulsoup4      |    4.4.0 | conda           | win-64          | py34_3
                                              : This is a simple meta-package
         sundarv/beautifulsoup4    |    4.5.3 | conda           | win-64          | py36_0
                                              : Python library designed for screen-scraping
         sunpy/beautifulsoup4      |    4.8.2 | conda           | linux-ppc64le, linux-64, win-32, linux-aarch64, osx-64, win-64 | py37_0, py36_1001, py36_1000, py37_1000, py37_1001, py27_0, py36_0, py38_0, py36h9f0ad1d_1, py37hc8dfbb8_1, py36hc560c46_1, py27_1001, py27_1000, py38h32f6830_1, py35_0
                                              : Python library designed for screen-scraping
         syllabs_admin/beautifulsoup4 |    4.6.0 | conda           | linux-64        | py27h490011d_0
         tbalaburkina/beautifulsoup4 |    4.6.0 | conda           | zos-z           | py37_0
         test_org_002/beautifulsoup4 |    4.5.3 | conda           | []              | py36_0, py27_0, py35_0, py34_0
         travis/beautifulsoup4     |    4.3.2 | conda           | linux-64, osx-64 | py27_0
                                              : Beautiful Soup sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree.
         ulmo/beautifulsoup4       |    4.3.2 | conda           | linux-64, win-32, osx-64, linux-32, win-64 | py27_0
                                              : Screen-scraping library
         wakari1/beautifulsoup4    |    4.3.2 | conda           | linux-64        | py27_0
         ziebel/beautifulsoup4     |    4.4.0 | conda           | linux-64        | py34_0
                                              : Screen-scraping library
         zoeith/beautifulsoup4     |    4.3.2 | conda           | osx-64          | py27_0
                                              : Screen-scraping library
    Found 54 packages
    
    Run 'anaconda show <USER/PACKAGE>' to get installation details

    I chose conda-forge/beautifulsoup4 and entered the following on the command line:

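    conda install -c https://conda.anaconda.org/conda-forge beautifulsoup4

    Note that conda-forge and beautifulsoup4 are separated by a space, not by "/": the first is the channel URL and the second is the package name.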
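
    After the install finishes, a quick check from Python can confirm that the package is importable. The helper installed_version below is a hypothetical convenience written for this note; the one thing to keep in mind is that the package is installed as beautifulsoup4 but imported as bs4.

```python
import importlib


def installed_version(module_name):
    """Hypothetical helper: return a module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Not every module exposes __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


bs4_version = installed_version("bs4")
if bs4_version is None:
    print("bs4 is not importable in this environment")
else:
    print("bs4 version:", bs4_version)
```

    Run it inside the same conda environment that was active during the install (wangli in the session above), otherwise the import may fail even though the install succeeded.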

  • Original article: https://www.cnblogs.com/liliwang/p/12637666.html