Python Crawler (8): GET Requests with the requests Library

The requests library is more convenient than the urllib library and bundles many features.

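1. Before using requests, install it with pip. In PyCharm, open the Terminal panel:

Type the command pip install requests to download the package.

The requests library is introduced on GitHub at: https://github.com/requests/requests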

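2. GET requests
response = requests.get("https://www.baidu.com/")
Suppose we want to fetch Baidu's results for 中国 (China). This is equivalent to typing 中国 into the search box on the Baidu home page.

The crawler code for this experiment: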
import requests

# wd is the query parameter that appears at the end of the search URL
params = {
    'wd': '中国'
}
headers = {
    'User-Agent': "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"
}
# Note the /s path added to the URL: searches go to /s, not to the home page
response = requests.get("http://www.baidu.com/s", params=params, headers=headers)
with open('baidu.html', 'w', encoding='utf-8') as fp:
    fp.write(response.content.decode('utf-8'))
Opening the saved baidu.html in a browser shows the search results for 中国.
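
To see what the params dictionary turns into on the wire, the query string can be built by hand. This is only a sketch using the standard library's urllib.parse rather than requests itself; it assumes requests produces an equivalent percent-encoded query string for the same dictionary.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Same dictionary as in the crawler example.
params = {'wd': '中国'}

# Non-ASCII values are UTF-8 encoded and then percent-escaped.
query = urlencode(params)
url = "http://www.baidu.com/s" + "?" + query
print(url)

# Parsing the URL back recovers the original value.
recovered = parse_qs(urlsplit(url).query)
print(recovered)
```

Passing params= to requests.get saves you from doing this escaping manually, which is why the crawler code never writes the ?wd=... part itself.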

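3. The difference between response.text and response.content

response.text is the string that requests produces by decoding response.content. requests guesses the encoding on its own, and when the guess is wrong the result is garbled text.

response.content is the data exactly as fetched from the network, without any decoding. Its type is bytes.

So when the page is known to be UTF-8, the most common usage is: response.content.decode('utf-8')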
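
The effect of a wrong encoding guess can be reproduced without any network access. In this sketch, raw is a hypothetical stand-in for response.content from a UTF-8 page, and ISO-8859-1 is used as an example of a wrong guess.

```python
# Stand-in for response.content: the raw bytes a UTF-8 page would deliver.
raw = '中国'.encode('utf-8')

# Decoding with the right encoding recovers the text.
correct = raw.decode('utf-8')

# Decoding with the wrong encoding does not fail, it silently yields garbage.
garbled = raw.decode('iso-8859-1')

print(type(raw).__name__, len(raw))
print(correct)
print(repr(garbled))
```

An alternative to decoding the bytes yourself is to assign response.encoding = 'utf-8' before reading response.text, which tells requests which encoding to use instead of guessing.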

Original article: https://www.cnblogs.com/zhaoxinhui/p/12374316.html
