A Simple Guide to the Python requests Library

requests is an HTTP client library for Python. It is similar to urllib and urllib2, but simpler to use.

1. Installing requests
Run the pip install command in your terminal:
    pip install requests

To install from source:
    git clone https://github.com/kennethreitz/requests.git
    cd requests
    python setup.py install

2. Sending requests
Send a network request with Requests:
    import requests

    req0 = requests.get("https://www.baidu.com")
    print(req0)

    # Send an HTTP POST request
    req1 = requests.post("https://httpbin.org/post")

    # Send PUT, DELETE, HEAD, and OPTIONS requests
    requests.put("http://httpbin.org/put")
    requests.delete("http://httpbin.org/delete")
    requests.head("http://httpbin.org/get")
    requests.options("http://httpbin.org/get")

3. Passing URL parameters
Requests accepts these parameters as a dictionary through the params keyword argument:
    payload = {'key1': 'value1', 'key2': 'value2'}
    req2 = requests.get("https://httpbin.org/get", params=payload)
    print(req2.url)  # https://httpbin.org/get?key1=value1&key2=value2

Note: any dictionary key whose value is None is not added to the URL's query string.
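One way to check this rule without sending anything over the network is to build the request and inspect the prepared URL. The following is a minimal sketch; the payload contents are made up for illustration.

```python
import requests

# Illustrative parameters (not from the original article): 'key2' is None.
payload = {'key1': 'value1', 'key2': None}

# Request.prepare() builds the final URL locally; no request is sent.
prepared = requests.Request('GET', 'https://httpbin.org/get', params=payload).prepare()
print(prepared.url)  # https://httpbin.org/get?key1=value1
```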

You can also pass a list as a value:
    payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
    req3 = requests.get('http://httpbin.org/get', params=payload)
    print(req3.url)  # http://httpbin.org/get?key1=value1&key2=value2&key2=value3

4. Response content
Requests reads the content of the server's response:
    import requests

    req4 = requests.get("https://www.baidu.com")
    print(req4.text)

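Requests automatically decodes content from the server. Most unicode character sets are decoded seamlessly. Use the req4.encoding attribute to see which encoding Requests is using, and assign to it to change the encoding:
    >>> req4.encoding
    'ISO-8859-1'
    >>> req4.encoding = 'utf-8'
    >>> req4.encoding
    'utf-8'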

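5. Binary response content
The response body is also available as bytes through the content attribute. Requests automatically decodes gzip and deflate transfer-encoded response data for you:
    print(req4.content)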

6. JSON response content
Requests also has a built-in JSON decoder to help with JSON data:
    import requests

    req6 = requests.get("https://github.com/timeline.json")
    print(req6.json())

Note: if JSON decoding fails, r.json() raises an exception. For example, if the response is a 401 (Unauthorized) whose body is not valid JSON, calling r.json() raises ValueError: No JSON object could be decoded.
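A common way to deal with this is to wrap the call and catch ValueError. The sketch below assumes that returning None is an acceptable fallback; json_or_none is a hypothetical helper name, not part of Requests.

```python
import requests


def json_or_none(response):
    """Return the decoded JSON body, or None if the body is not valid JSON.

    `json_or_none` is a hypothetical helper name used only for this sketch.
    """
    try:
        return response.json()
    except ValueError:
        # Decoding errors raised by response.json() are ValueError subclasses.
        return None
```

A caller could then write data = json_or_none(requests.get(url)) and test for None. It is usually worth checking response.status_code as well, because a body that decodes cleanly does not mean the request itself succeeded.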

7. Raw response content
To get the raw socket response from the server, access r.raw. First make sure that stream=True is set on the initial request:
    req7 = requests.get('https://github.com/timeline.json', stream=True)
    print(req7.raw)
    print(req7.raw.read(10))

8. Custom headers
To add HTTP headers to a request, simply pass a dict to the headers parameter:
    url = 'https://api.github.com/some/endpoint'
    headers = {'user-agent': 'my-app/0.0.1'}
    req8 = requests.get(url, headers=headers)

Note: all header values must be a string, bytestring, or unicode.
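In practice this means numeric values should be converted with str() before they go into the headers dict. The sketch below uses a made-up header name, X-Retry-Count, and checks the behaviour locally by preparing a request instead of sending it. Recent Requests releases reject non-string values with InvalidHeader; older versions may behave differently.

```python
import requests
from requests.exceptions import InvalidHeader

url = 'https://api.github.com/some/endpoint'
retry_count = 3  # hypothetical numeric value we want to send

# Convert non-string values explicitly before building the headers dict.
headers = {'user-agent': 'my-app/0.0.1', 'X-Retry-Count': str(retry_count)}
prepared = requests.Request('GET', url, headers=headers).prepare()
print(prepared.headers['X-Retry-Count'])  # 3

# Assumption: a recent Requests release, where an int value is rejected
# as soon as the request is prepared.
try:
    requests.Request('GET', url, headers={'X-Retry-Count': retry_count}).prepare()
    rejected = False
except InvalidHeader:
    rejected = True
print(rejected)
```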

9. More complicated POST requests
To send form-encoded data, much like an HTML form, simply pass a dictionary to the data parameter. The dictionary is automatically form-encoded when the request is made:
    payload = {'key1': 'value1', 'key2': 'value2'}
    req9 = requests.post("http://httpbin.org/post", data=payload)
    print(req9.text)
    '''
    {
      "args": {},
      "data": "",
      "files": {},
      "form": {
        "key1": "value1",
        "key2": "value2"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "Content-Length": "23",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.6.0 CPython/2.7.5 Linux/3.10.0-327.36.1.el7.x86_64"
      },
      "json": null,
      "origin": "112.35.10.78",
      "url": "http://httpbin.org/post"
    }
    '''

If you pass a string instead of a dict, the data is posted directly:
    import json

    url = 'https://api.github.com/some/endpoint'
    payload = {'some': 'data'}
    req10 = requests.post(url, data=json.dumps(payload))
    print(req10.text)  # {"message":"Not Found","documentation_url":"https://developer.github.com/v3"}
10. POSTing a multipart-encoded file
Requests makes it simple to upload multipart-encoded files:
    url = 'http://httpbin.org/post'
    files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}
    r = requests.post(url, files=files)
    print(r.text)
    '''
    {
      ...
      "files": {
        "file": "some,data,to,send\nanother,row,to,send\n"
      },
      ...
    }
    '''

11. Response status codes
Check the response status code:
    req11 = requests.get('http://httpbin.org/get')
    print(req11.status_code)  # 200

Requests also comes with a built-in status code lookup object:
    print(req11.status_code == requests.codes.ok)  # True

12. Response headers
View the server's response headers as a Python dictionary:
    print(req11.headers)
    '''
    {'content-length': '311',
     'via': '1.1 vegur',
     'x-powered-by': 'Flask',
     'server': 'meinheld/0.6.1',
     'connection': 'keep-alive',
     'x-processed-time': '0.000663995742798',
     'access-control-allow-credentials': 'true',
     'date': 'Fri, 19 May 2017 12:38:25 GMT',
     'access-control-allow-origin': '*',
     'content-type': 'application/json'}
    '''

Note: this dictionary is special: it is made just for HTTP headers, and HTTP header names are case-insensitive.
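The case-insensitive lookup can be tried without a network call, because the response headers are stored in a requests.structures.CaseInsensitiveDict. The sketch below builds one by hand with sample values taken from the output above.

```python
from requests.structures import CaseInsensitiveDict

# Response.headers is a CaseInsensitiveDict; here one is built by hand
# with sample values so the lookup can be tried offline.
headers = CaseInsensitiveDict({
    'Content-Type': 'application/json',
    'Server': 'meinheld/0.6.1',
})

print(headers['content-type'])      # application/json
print(headers.get('CONTENT-TYPE'))  # application/json
print('server' in headers)          # True
```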

13. Cookies
If a response contains cookies, you can access them quickly:
    url = 'http://example.com/some/cookie/setting/url'
    req13 = requests.get(url)
    print(req13.cookies['example_cookie_name'])  # example_cookie_value

To send your own cookies to the server, use the cookies parameter:
    url = 'http://httpbin.org/cookies'
    cookies = dict(cookies_are='working')
    req14 = requests.get(url, cookies=cookies)
    print(req14.text)
    '''
    {
      "cookies": {
        "cookies_are": "working"
      }
    }
    '''

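14. Redirection and history
Use the history attribute of the response object to track redirection.
Response.history is a list of the Response objects that were created in order to complete the request. The list is sorted from the oldest to the most recent response:
    req15 = requests.get('http://github.com')
    print(req15.url)          # https://github.com/
    print(req15.status_code)  # 200
    print(req15.history)      # [<Response [301]>]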

If you are using GET, OPTIONS, POST, PUT, PATCH, or DELETE, you can disable redirection handling with the allow_redirects parameter:
    req16 = requests.get('http://github.com', allow_redirects=False)
    print(req16.status_code)  # 301
    print(req16.history)      # []

If you are using HEAD, you can enable redirection as well:
    req17 = requests.head('http://github.com', allow_redirects=True)
    print(req17.url)      # https://github.com/
    print(req17.history)  # [<Response [301]>]

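15. Timeouts
Requests stops waiting for a response after the number of seconds given by the timeout parameter.
Note: timeout is not a time limit on the entire response download. It applies to the wait for the server, not to downloading the response body. An exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds).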
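The original gives no code for this section, so here is a minimal sketch of how the timeout parameter might be used. The helper name get_or_none and the 3-second default are illustrative choices, not something Requests prescribes.

```python
import requests


def get_or_none(url, timeout=3.0):
    """Return the response, or None if the server did not answer in time.

    `get_or_none` and the 3-second default are hypothetical, chosen
    only for this example.
    """
    try:
        # timeout may also be a (connect, read) tuple, e.g. (3.05, 27).
        return requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        return None
```

For example, get_or_none('http://github.com', timeout=0.001) would most likely return None, since a server rarely answers within a millisecond.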

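16. Errors and exceptions
(1) In the event of a network problem (e.g. DNS failure, refused connection), Requests raises a ConnectionError exception.
(2) If an HTTP request returned an unsuccessful status code, Response.raise_for_status() raises an HTTPError exception.
(3) If a request times out, a Timeout exception is raised.
(4) If a request exceeds the configured maximum number of redirections, a TooManyRedirects exception is raised.
(5) All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.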
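The five cases above can be sketched as a single try/except chain. describe_fetch and the strings it returns are hypothetical, chosen only to show one workable ordering of the handlers.

```python
import requests


def describe_fetch(url, timeout=5.0):
    """Fetch url and return a short status string (hypothetical helper)."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # raises HTTPError for 4xx/5xx responses
    except requests.exceptions.Timeout:
        # Checked before ConnectionError: ConnectTimeout inherits from both.
        return 'timeout'
    except requests.exceptions.ConnectionError:
        return 'connection error'
    except requests.exceptions.HTTPError as exc:
        return 'http error %s' % exc.response.status_code
    except requests.exceptions.TooManyRedirects:
        return 'too many redirects'
    except requests.exceptions.RequestException:
        # Base class: anything else raised explicitly by Requests.
        return 'other requests error'
    return 'ok'
```

Putting RequestException last matters: it is the base class, so placing it earlier would swallow the more specific exceptions.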

Reference: http://docs.python-requests.org/zh_CN/latest/user/quickstart.html

Original article: https://www.cnblogs.com/xieshengsen/p/6886164.html