Learning Web Scraping: Garbled Text with requests

A demonstration of the overall functionality
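    import requests

    response = requests.get("https://www.baidu.com")
    print(type(response))                    # the Response object
    print(response.status_code)              # HTTP status code
    print(type(response.text))               # text is a str
    print(response.text)                     # body decoded with the guessed encoding
    print(response.cookies)                  # cookies set by the server
    print(response.content)                  # raw body as bytes
    print(response.content.decode("utf-8"))  # body decoded explicitly as UTF-8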
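As we can see, the response object is very convenient to use. There is one issue to watch out for:
On many sites, reading response.text directly produces garbled text, so use response.content instead.
response.content returns the body as raw bytes. Calling decode("utf-8") on those bytes turns them into a string using UTF-8, which avoids the garbled output that response.text can produce.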
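To see why the garbling happens at all, here is a minimal sketch that needs no network access. It assumes the server sends UTF-8 bytes and that the text was decoded as ISO-8859-1, which is a common wrong guess when the response headers declare no charset. The sample string is an illustrative assumption, not real output from Baidu.

```python
# Simulated response body: UTF-8 bytes, as a server would send them.
# (The sample text is an assumption for illustration only.)
raw = "百度一下，你就知道".encode("utf-8")

# Decoding with the wrong codec does not fail; it silently yields mojibake,
# because ISO-8859-1 maps every single byte to one character.
garbled = raw.decode("iso-8859-1")

# Decoding with the codec the bytes were actually written in recovers the text.
correct = raw.decode("utf-8")

# The garbled string has one character per byte; the correct one is shorter.
print(len(raw), len(garbled), len(correct))
print(garbled == correct)
```

The bytes themselves are never damaged. Only the choice of codec differs, which is why decoding response.content yourself fixes the problem.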

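After a request is made, Requests makes an educated guess about the encoding of the response based on the HTTP headers. When you access response.text, Requests uses that guessed encoding. You can find out which encoding Requests is using, and change it, through the response.encoding property. For example: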
    import requests

    response = requests.get("http://www.baidu.com")
    response.encoding = "utf-8"   # override the guessed encoding before reading text
    print(response.text)
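The header-based guess can be approximated with the standard library alone. The sketch below is a rough illustration, not Requests' actual implementation: guess_encoding is a hypothetical helper, and the ISO-8859-1 fallback for text types with no declared charset is an assumption modeled on the traditional HTTP default.

```python
from email.message import Message


def guess_encoding(content_type):
    """Guess a text encoding from a Content-Type header value.

    Hypothetical helper for illustration; it only approximates
    a header-based guess and is not taken from Requests.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicit charset parameter wins, e.g. "text/html; charset=UTF-8".
    charset = msg.get_content_charset()
    if charset:
        return charset

    # Assumed fallback: text types with no charset are treated as ISO-8859-1.
    if msg.get_content_maintype() == "text":
        return "ISO-8859-1"

    # Nothing to go on for non-text types.
    return None


print(guess_encoding("text/html; charset=UTF-8"))
print(guess_encoding("text/html"))
print(guess_encoding("image/png"))
```

A page that is really UTF-8 but is served as plain "text/html" would fall into the second case, which is the situation where setting response.encoding by hand helps.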
Either approach, response.content.decode("utf-8") or response.encoding = "utf-8", avoids the garbled text, provided the page really is encoded in UTF-8.
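Not every page is UTF-8, so a slightly more defensive variant may be useful. The following is a minimal sketch under stated assumptions: decode_body is a hypothetical helper, and it works on any object exposing content (bytes) and encoding (str or None), which is how a Requests response looks. Stub objects stand in for real responses so the example runs offline.

```python
from types import SimpleNamespace


def decode_body(response, preferred="utf-8"):
    """Return the body as str, trying `preferred` before the guessed encoding.

    Hypothetical helper; `response` only needs `.content` and `.encoding`.
    """
    raw = response.content
    try:
        # Strict decode: succeeds only if the bytes are valid in `preferred`.
        return raw.decode(preferred)
    except UnicodeDecodeError:
        pass

    # Fall back to whatever encoding was guessed, replacing undecodable bytes.
    guessed = response.encoding or preferred
    try:
        return raw.decode(guessed, errors="replace")
    except LookupError:
        # The guessed name is not a codec Python knows about.
        return raw.decode(preferred, errors="replace")


# Stub: UTF-8 body whose guessed encoding is wrong.
utf8_page = SimpleNamespace(content="百度".encode("utf-8"), encoding="ISO-8859-1")
# Stub: GBK body with a correct guessed encoding.
gbk_page = SimpleNamespace(content="百度".encode("gbk"), encoding="gbk")

print(decode_body(utf8_page) == decode_body(gbk_page))
```

With a real response, the call would simply be decode_body(requests.get(url)). Trying UTF-8 strictly first is a design choice: valid UTF-8 is rarely produced by accident, so a successful strict decode is strong evidence that the guess is right.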

Original article: https://www.cnblogs.com/brady-wang/p/9699579.html
