  • BeautifulSoup_python3

    1. Troubleshooting

    bsObj = BeautifulSoup(html.read())

    This raises a warning:

     UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

    Fix: pass the parser name explicitly as the second argument:

    bsObj = BeautifulSoup(html.read(),"html.parser")
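    The effect of naming the parser can be checked without a network connection. The following is a minimal sketch, assuming the beautifulsoup4 package is installed; the sample markup and the variable names are made up for illustration and do not come from the original post.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical sample markup, used here instead of a live page.
sample_html = "<html><body><h1>Hello</h1></body></html>"

# Record every warning raised while the document is being parsed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    soup = BeautifulSoup(sample_html, "html.parser")

# Keep only the "no parser specified" warning quoted in this section.
parser_warnings = [
    w for w in caught
    if "No parser was explicitly specified" in str(w.message)
]

print(len(parser_warnings))
print(soup.h1.get_text())
```

    With "html.parser" given, the list of parser warnings should stay empty, and the same parser is used on every system because it ships with Python's standard library.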

    BeautifulSoup

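    Overview: BeautifulSoup formats and organizes messy web content by locating HTML tags, and exposes the XML structure of a page as simple Python objects.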
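    To make "tags as Python objects" concrete, here is a small sketch that navigates a hand-written document. The markup and variable names are assumptions chosen for illustration; only BeautifulSoup, attribute-style tag access, find_all, and get_text are library features.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, written inline so the example needs no network access.
sample_html = """
<html><body>
<h1><a class="headermaintitle" href="http://www.cnblogs.com/kamil/">Blog title</a></h1>
<p>First paragraph</p>
<p>Second paragraph</p>
</body></html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Attribute access returns the first matching tag: the first <a> inside the first <h1>.
link = soup.h1.a
link_text = link.get_text()
link_href = link["href"]        # tag attributes are read like dictionary keys
link_classes = link["class"]    # "class" is multi-valued, so this is a list

# find_all returns every matching tag in document order.
paragraphs = [p.get_text() for p in soup.find_all("p")]

print(link_text, link_href, link_classes)
print(paragraphs)
```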

    On Python 3, install version 4 of the library: BeautifulSoup4 (BS4).
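    The package is published on PyPI as beautifulsoup4 and imported as bs4; it is commonly installed with pip install beautifulsoup4. The sketch below checks whether it is available in the current interpreter using only the standard library (Python 3.8 or later). The helper name installed_bs4_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_bs4_version():
    """Return the installed beautifulsoup4 version string, or None if it is absent."""
    try:
        return version("beautifulsoup4")
    except PackageNotFoundError:
        return None


bs4_version = installed_bs4_version()
if bs4_version is None:
    print("beautifulsoup4 is not installed")
else:
    print("beautifulsoup4", bs4_version)
```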

    Example:

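    #!/usr/bin/env python
    # encoding: utf-8
    """
    @author: 侠之大者kamil
    @file: beautifulsoup.py
    @time: 2016/4/19 16:36
    """
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    # Fetch the page; urlopen returns an http.client.HTTPResponse object.
    html = urlopen('http://www.cnblogs.com/kamil/')
    print(type(html))
    # html.read() returns the page content, which is passed to the BeautifulSoup constructor.
    bsObj = BeautifulSoup(html.read(), "html.parser")
    print(type(bsObj))
    # Print the first <h1> tag in the document.
    print(bsObj.h1)

    Note the BeautifulSoup(...) call: the "html.parser" argument must be included to avoid the warning described above.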

    Output:

    ssh://kamil@xzdz.hk:22/usr/bin/python3 -u /home/kamil/windows_python3/python3/Day11/day12/beautifulsoup.py
    <class 'http.client.HTTPResponse'>
    <class 'bs4.BeautifulSoup'>
    <h1><a class="headermaintitle" href="http://www.cnblogs.com/kamil/" id="Header1_HeaderTitle">侠之大者kamil</a></h1>
    
    Process finished with exit code 0
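    The example above assumes that the request succeeds and that the page contains an <h1>. A more defensive variant might look like the sketch below. The function names first_h1_text and fetch_first_h1 are hypothetical, and the 10-second timeout is an arbitrary choice, not something taken from the original post.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def first_h1_text(markup):
    """Parse markup and return the text of the first <h1>, or None if there is none."""
    soup = BeautifulSoup(markup, "html.parser")
    heading = soup.h1  # None when the document has no <h1> tag
    if heading is None:
        return None
    return heading.get_text(strip=True)


def fetch_first_h1(url):
    """Download url and return the text of its first <h1>, or None on failure."""
    try:
        with urlopen(url, timeout=10) as response:
            markup = response.read()
    except OSError as err:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        print("request failed:", err)
        return None
    return first_h1_text(markup)
```

    Calling fetch_first_h1('http://www.cnblogs.com/kamil/') would then return the heading text, or None when the download fails or the page has no <h1>. Keeping the parsing step in its own function lets it be tested on literal strings without touching the network.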

    Official documentation

  • Original article: https://www.cnblogs.com/kamil/p/5408986.html