  • BeautifulSoup_python3

    1. Troubleshooting

    bsObj = BeautifulSoup(html.read())

    This raises the following warning:

     UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

    Fix: name the parser explicitly.

    bsObj = BeautifulSoup(html.read(),"html.parser")
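    As a minimal offline sketch of the same fix, the parser name can be passed either as the second positional argument or through the features keyword. The sample markup below is a hypothetical stand-in for html.read(), so no network access is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
html = "<html><body><h1>Hello</h1></body></html>"

# Naming the parser means the same parser is used on every system.
soup_positional = BeautifulSoup(html, "html.parser")
soup_keyword = BeautifulSoup(html, features="html.parser")

print(soup_positional.h1)
print(soup_keyword.h1 == soup_positional.h1)
```

    "html.parser" ships with the Python standard library, so this choice adds no dependency beyond bs4 itself.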

    BeautifulSoup

    Overview: BeautifulSoup formats and organizes messy web content by locating HTML tags, and exposes the XML structure as simple Python objects.
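    To illustrate what "simple Python objects" means here, the sketch below walks a small document by tag name. The markup is a hypothetical sample loosely modelled on a blog header; only the bs4 calls themselves are real library API.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not fetched from any real page.
html = """
<html><body>
<h1><a class="headermaintitle" href="http://www.cnblogs.com/kamil/">kamil</a></h1>
<p class="post">first</p>
<p class="post">second</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

link = soup.h1.a              # attribute access walks to the first matching tag
print(link.name)              # the tag name
print(link["href"])           # attributes are read like dictionary keys
print(link.get_text())        # the text inside the tag

# find_all returns every matching tag, here filtered by CSS class.
paragraphs = [p.get_text() for p in soup.find_all("p", class_="post")]
print(paragraphs)
```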

    On Python 3, install version 4 of the library, BeautifulSoup4 (BS4), for example with pip install beautifulsoup4.

    Working example:

     1 #!/usr/bin/env python
     2 # encoding: utf-8
     3 """
     4 @author: 侠之大者kamil
     5 @file: beautifulsoup.py
     6 @time: 2016/4/19 16:36
     7 """
     8 from bs4 import BeautifulSoup
     9 from urllib.request import urlopen
    10 html = urlopen('http://www.cnblogs.com/kamil/')
    11 print(type(html))
    12 bsObj = BeautifulSoup(html.read(),"html.parser") # html.read() returns the page content, which is passed to the BeautifulSoup object.
    13 print(type(bsObj))
    14 print(bsObj.h1)

     Note line 12: pass "html.parser" explicitly so that the parser warning described earlier is not raised.

    Output:

    ssh://kamil@xzdz.hk:22/usr/bin/python3 -u /home/kamil/windows_python3/python3/Day11/day12/beautifulsoup.py
    <class 'http.client.HTTPResponse'>
    <class 'bs4.BeautifulSoup'>
    <h1><a class="headermaintitle" href="http://www.cnblogs.com/kamil/" id="Header1_HeaderTitle">侠之大者kamil</a></h1>
    
    Process finished with exit code 0
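    The script above assumes the page has an <h1>. As a hedged sketch, the helper below (first_h1_text is a hypothetical name, not part of bs4) shows two details worth knowing: BeautifulSoup accepts the raw bytes that html.read() returns, and accessing a tag that does not exist yields None rather than raising an error.

```python
from bs4 import BeautifulSoup


def first_h1_text(markup):
    """Return the stripped text of the first <h1>, or None if there is none."""
    soup = BeautifulSoup(markup, "html.parser")
    h1 = soup.h1  # None when the document has no <h1>
    return h1.get_text(strip=True) if h1 is not None else None


# Bytes stand in for what urlopen(...).read() would return.
page_bytes = "<html><body><h1> Title </h1></body></html>".encode("utf-8")

print(first_h1_text(page_bytes))
print(first_h1_text(b"<p>no heading</p>"))
```

    Checking for None before calling methods on the tag avoids an AttributeError on pages that lack the expected element.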

     Official documentation

  • Original article: https://www.cnblogs.com/kamil/p/5408986.html