  • BeautifulSoup_python3

    1. Troubleshooting

    bsObj = BeautifulSoup(html.read())

    This call triggers the following warning:

     UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

    Fix: pass the parser name explicitly as the second argument:

    bsObj = BeautifulSoup(html.read(),"html.parser")
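    To try the fix in isolation, here is a minimal sketch that parses a small inline string instead of a live page, so it runs offline. SAMPLE_HTML and heading_text are illustrative names, not part of the original post.

```python
from bs4 import BeautifulSoup

# Assumption: a small inline document stands in for html.read().
SAMPLE_HTML = "<html><body><h1>Hello</h1><p>First paragraph</p></body></html>"

# Naming the parser explicitly avoids the warning and keeps behaviour the same
# across machines; "html.parser" ships with the Python standard library.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

heading_text = soup.h1.get_text()
print(heading_text)
```

    Other parsers such as "lxml" can be named the same way, but they have to be installed separately.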

    BeautifulSoup

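    Overview: BeautifulSoup locates HTML tags to format and organize complex web content, and exposes the document's XML-style structure as simple Python objects.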
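    As a rough illustration of "tags as Python objects", the sketch below looks up tags by name and class and reads their attributes. The markup, the class name "title", and the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

# Assumption: hypothetical markup, not taken from any real page.
SAMPLE_HTML = """
<div id="posts">
  <a class="title" href="/p/1.html">First post</a>
  <a class="title" href="/p/2.html">Second post</a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns the first matching Tag (or None); attributes read like dict keys.
first_link = soup.find("a", class_="title")
first_href = first_link["href"]

# find_all() returns every match in document order.
titles = [a.get_text() for a in soup.find_all("a", class_="title")]
print(first_href, titles)
```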

    On Python 3, install version 4 of the library, BeautifulSoup4 (BS4); the package name on PyPI is beautifulsoup4.
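    One quick way to confirm that version 4 is the one installed is to print the package version from Python. This is only a sanity check; installed_version is an illustrative name.

```python
import bs4

# __version__ holds the installed package version string; the 4.x series is imported as "bs4".
installed_version = bs4.__version__
print("BeautifulSoup version:", installed_version)
```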

    Example:

     1 #!/usr/bin/env python
     2 # encoding: utf-8
     3 """
     4 @author: 侠之大者kamil
     5 @file: beautifulsoup.py
     6 @time: 2016/4/19 16:36
     7 """
     8 from bs4 import BeautifulSoup
     9 from urllib.request import urlopen
    10 html = urlopen('http://www.cnblogs.com/kamil/')
    11 print(type(html))
    12 bsObj = BeautifulSoup(html.read(),"html.parser") # html.read() fetches the page content, which is then passed to the BeautifulSoup object.
    13 print(type(bsObj))
    14 print(bsObj.h1)

     Note line 12: the "html.parser" argument must be included.

    Output:

    ssh://kamil@xzdz.hk:22/usr/bin/python3 -u /home/kamil/windows_python3/python3/Day11/day12/beautifulsoup.py
    <class 'http.client.HTTPResponse'>
    <class 'bs4.BeautifulSoup'>
    <h1><a class="headermaintitle" href="http://www.cnblogs.com/kamil/" id="Header1_HeaderTitle">侠之大者kamil</a></h1>
    
    Process finished with exit code 0
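    The example above assumes the request succeeds and that the page contains an <h1>. A more defensive variant might look like the sketch below. The helper name fetch_h1 and the 10-second timeout are assumptions for illustration, not part of the original script.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_h1(url, timeout=10):
    """Return the first <h1> Tag found at url, or None if it cannot be retrieved.

    Hypothetical helper: the name and the default timeout are illustrative choices.
    """
    try:
        response = urlopen(url, timeout=timeout)
    except (HTTPError, URLError) as err:
        # HTTPError: the server answered with an error status.
        # URLError: the request could not be completed (unreachable host, unknown scheme, ...).
        print("request failed:", err)
        return None
    with response:
        soup = BeautifulSoup(response.read(), "html.parser")
    # soup.h1 is None when the page has no <h1>, so callers should check before using it.
    return soup.h1
```

    Calling fetch_h1('http://www.cnblogs.com/kamil/') would then return either the heading tag or None, and the caller can test for None before printing.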

     Official documentation

  • Original post: https://www.cnblogs.com/kamil/p/5408986.html