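1. Content-Type
Content-Type is the HTTP header that declares the media type of a message body and, optionally, its character encoding. It tells the recipient (a browser, for example) in what form and with what encoding the body should be read. The header applies to both requests and responses. If no Content-Type is sent, HTTP itself defines no default: the recipient may treat the body as application/octet-stream or guess the type from the content. Some server frameworks do fall back to text/html when a response's ContentType is left unset.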
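To make the two parts of the header concrete, here is a minimal sketch that splits a Content-Type value into its media type and charset. It uses only the standard library's email.message.Message; the helper name parse_content_type is made up for this illustration.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type header value into (media_type, charset)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_type() lowercases the type and drops the parameters;
    # get_content_charset() returns None when there is no charset parameter.
    return msg.get_content_type(), msg.get_content_charset()


print(parse_content_type("text/html; charset=UTF-8"))
print(parse_content_type("application/json"))
```

The first call yields the media type together with a lowercased charset; the second has no charset parameter, so the charset comes back as None.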
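2. Common encoding formats
application/x-www-form-urlencoded format --> a=1&b=2
application/json format --> {"a":"1","b":"2"}
multipart/form-data format --> parts separated by a boundary string, typically used for file uploads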
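As a rough illustration of the first two formats, the sketch below serializes the same dictionary both ways using only the standard library. The variable names are chosen for this example.

```python
import json
from urllib.parse import urlencode

payload = {"a": 1, "b": 2}

# application/x-www-form-urlencoded: key=value pairs joined by "&"
form_body = urlencode(payload)
print(form_body)  # a=1&b=2

# application/json: the same data serialized as a JSON document
json_body = json.dumps(payload)
print(json_body)  # {"a": 1, "b": 2}
```

Note that the form encoding turns every value into text, while JSON keeps the distinction between numbers and strings.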
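3. The Requests module
Requests is an HTTP library written in Python on top of urllib3 and released under the Apache 2.0 license. It is considerably more convenient than the standard library's urllib and saves a lot of boilerplate. In short, it is one of the easiest HTTP libraries to use in Python, and a good choice for web scraping. Requests is not part of the standard library, so after installing Python it has to be installed separately with pip.
4. Installation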
pip install requests
5. requests.get
requests.get(url, params=None, **kwargs)
url: the address to request
import requests
url = "https://www.douban.com/"
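# Send a GET request; the call returns a Response object
response = requests.get(url)
# response.text holds the response body decoded as text
print(response.text)
params: query-string parameters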
import requests
# Query string written directly in the URL
response1 = requests.get('https://www.douban.com/search?q=新喜剧之王')
print(response1.text)
# Same request, with the query passed as a dict through params
response2 = requests.get('https://www.douban.com/search', params={"q": "新喜剧之王"})
print(response2.text)
headers: request headers
import requests
# Custom request headers
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"}
response = requests.get("https://www.baidu.com",headers=headers)
print(response.text)
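A common reason for setting headers is that the default User-Agent identifies the client as python-requests, which some sites reject. The following sketch, which sends nothing over the network, shows the defaults and how a custom header overrides them. The shortened User-Agent string is a placeholder, not a recommendation.

```python
import requests

# Headers that Requests adds when none are supplied
defaults = requests.utils.default_headers()
print(defaults["User-Agent"])  # starts with "python-requests/"

# Placeholder browser-like User-Agent, shortened for readability
custom = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

# A Session merges its default headers with the per-request ones;
# prepare_request() performs the merge without sending anything.
session = requests.Session()
prepared = session.prepare_request(
    requests.Request("GET", "https://www.baidu.com", headers=custom)
)
print(prepared.headers["User-Agent"])
print(prepared.headers["Accept"])
```

Only the headers named in the dict are replaced; the remaining defaults, such as Accept, are still sent.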
cookies: sending cookies with a request
import uuid
import requests
cookies = dict(token=str(uuid.uuid4()))
url = 'http://httpbin.org/cookies'
# Request that carries cookies
response = requests.get(url=url,cookies=cookies)
print(response.text)
'''
{
"cookies": {
"token": "5c4478cd-b987-4103-b224-92fab6cf7b22"
}
}
'''
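The cookies argument ends up as an ordinary Cookie request header. A minimal offline sketch, assuming a fixed placeholder value instead of the random UUID used above:

```python
import requests

# "token" and "abc123" are placeholder values for illustration
cookies = {"token": "abc123"}

prepared = requests.Request(
    "GET", "http://httpbin.org/cookies", cookies=cookies
).prepare()

# The cookies dict is serialized into a single Cookie request header
print(prepared.headers["Cookie"])
```

httpbin.org/cookies simply echoes that header back, which is why the response shown above contains the same token.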
6. requests.post
requests.post(url, data=None, json=None, **kwargs)
data: sends the body as application/x-www-form-urlencoded
import requests
url = "http://httpbin.org/post"
# Send application/x-www-form-urlencoded data with data=
response = requests.post(url,data={"a":1,"b":2})
print(response.text)
'''
{
"args": {},
"data": "",
"files": {},
"form": {
"a": "1",
"b": "2"
},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Connection": "close",
"Content-Length": "7",
"Content-Type": "application/x-www-form-urlencoded",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.21.0"
},
"json": null,
"origin": "121.35.182.63",
"url": "http://httpbin.org/post"
}
'''
json: sends the body as application/json
import requests
url = "http://httpbin.org/post"
# Send application/json data with json=
response = requests.post(url,json={"a":1,"b":2})
print(response.text)
'''
{
"args": {},
"data": "{\"a\": 1, \"b\": 2}",
"files": {},
"form": {},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Connection": "close",
"Content-Length": "16",
"Content-Type": "application/json",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.21.0"
},
"json": {
"a": 1,
"b": 2
},
"origin": "121.35.182.63",
"url": "http://httpbin.org/post"
}
'''
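The difference between the two keyword arguments can also be checked without contacting httpbin, by preparing the requests and looking at the Content-Type header and body that would be sent. The sketch below adds a third case, files=, which is how Requests produces multipart/form-data; the file name hello.txt and its content are made up for the example.

```python
import requests

url = "http://httpbin.org/post"
payload = {"a": 1, "b": 2}

# data= with a dict produces a form-encoded body
form_req = requests.Request("POST", url, data=payload).prepare()
print(form_req.headers["Content-Type"], form_req.body)

# json= serializes the dict as JSON and sets the matching Content-Type
json_req = requests.Request("POST", url, json=payload).prepare()
print(json_req.headers["Content-Type"], json_req.body)

# files= produces a multipart/form-data body with a generated boundary;
# "hello.txt" and its content are placeholders
file_req = requests.Request(
    "POST", url, files={"file": ("hello.txt", b"hello")}
).prepare()
print(file_req.headers["Content-Type"])
```

In all three cases Requests sets Content-Type itself, so there is normally no need to add that header by hand.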