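Mobile Data Scraping
Install Fiddler.
Install the Fiddler certificate on the physical phone.
Change the phone's proxy settings: set the host to the computer's IP address and the port to Fiddler's listening port.
Once these settings are in place, Fiddler can capture the phone's traffic.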
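The proxy step needs the computer's LAN IP address. As a minimal sketch, the address can be looked up from Python instead of reading it from the network settings. The helper name `lan_ip` and the probe address `8.8.8.8` are assumptions made for illustration, and `8888` is the default port in Fiddler Classic, so adjust it if your installation listens elsewhere.

```python
import socket

FIDDLER_PORT = 8888  # Fiddler Classic default; change it if yours differs


def lan_ip(probe_host="8.8.8.8", probe_port=80):
    """Return the local address the OS would use for outbound traffic.

    Hypothetical helper: falls back to 127.0.0.1 when no route is available.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a route
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print(f"Phone proxy host: {lan_ip()}  port: {FIDDLER_PORT}")
```

The phone and the computer have to be on the same network for the printed address to be reachable from the phone.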
Nox Android emulator (夜神模拟器)
Mobile data collection: case study 1
Find the data endpoint
# -*- coding: utf-8 -*-
import requests

# Search endpoint found in the Fiddler capture
url = "https://api.douguo.net/recipe/v2/search/0/20"

# Headers copied from the captured request
headers = {
    "User-Agent": "Mozilla/5.0 (Linux; Android 5.1.1; LIO-AN00 Build/LIO-AN00; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/74.0.3729.136 Mobile Safari/537.36",
    "Cookie": "duid=65861350",
    "uuid": "feccc21d-d04b-466c-b276-98c6a7e1acef",
    "Host": "api.douguo.net",
    "language": "zh",
}

# Form fields copied from the captured request body
data = {
    "client": "4",
    "_session": "1599663959153866174309718910",
    "keyword": "下饭菜",  # search keyword
    "order": "0",
    "_vs": "400",
    "type": "0",
    "auto_play_mode": "2",
    "sign_ran": "9ce91f215449bf78a75a4a147d6bcc43",
}

# Send the form as a POST request and decode the JSON body into a dict
response = requests.post(url=url, headers=headers, data=data, timeout=10)
result = response.json()
print(result)
Finally, just pull the fields you need out of the resulting dictionary.
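As a rough sketch of that extraction step, a small helper can walk the nested dictionary without raising when a key is missing. The helper name `dig` and every field name in `sample` are hypothetical; the real structure has to be read from the response captured in Fiddler.

```python
def dig(obj, *keys, default=None):
    """Walk nested dicts/lists by key or index; return default on any miss."""
    for key in keys:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return default
    return obj


# Hypothetical payload: these field names are placeholders, not the real API schema
sample = {
    "state": "success",
    "result": {
        "list": [
            {"recipe": {"id": 1, "title": "example dish A"}},
            {"recipe": {"id": 2, "title": "example dish B"}},
            {"ad": {"id": 99}},  # an entry without a recipe is skipped below
        ]
    },
}

titles = [
    dig(item, "recipe", "title")
    for item in dig(sample, "result", "list", default=[])
    if dig(item, "recipe", "title") is not None
]
print(titles)
```

Returning a default instead of raising keeps one malformed entry from aborting the whole run, which is useful when list responses mix several item types.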
Source code with pagination
# -*- coding: utf-8 -*-
import requests

# Scrolling through the app shows that the offset grows by 20 per page:
# Page 1: https://api.douguo.net/recipe/v2/search/0/20
# Page 2: https://api.douguo.net/recipe/v2/search/20/20
# Page 3: https://api.douguo.net/recipe/v2/search/40/20
# Page 4: https://api.douguo.net/recipe/v2/search/60/20
# and so on.

# Generic URL template for pagination; %d is the offset
url = "https://api.douguo.net/recipe/v2/search/%d/20"

# Headers and form fields are the same for every page, so build them once
headers = {
    "User-Agent": "Mozilla/5.0 (Linux; Android 5.1.1; LIO-AN00 Build/LIO-AN00; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/74.0.3729.136 Mobile Safari/537.36",
    "Cookie": "duid=65861350",
    "uuid": "feccc21d-d04b-466c-b276-98c6a7e1acef",
    "Host": "api.douguo.net",
    "language": "zh",
}
data = {
    "client": "4",
    "_session": "1599663959153866174309718910",
    "keyword": "下饭菜",  # search keyword
    "order": "0",
    "_vs": "400",
    "type": "0",
    "auto_play_mode": "2",
    "sign_ran": "9ce91f215449bf78a75a4a147d6bcc43",
}

for pg in range(0, 100, 20):  # offsets 0, 20, 40, 60, 80 (the first five pages)
    new_url = url % pg
    response = requests.post(url=new_url, headers=headers, data=data, timeout=10)
    result = response.json()
    print(result)
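The offset pattern listed in the comments of the script above can also be isolated into a small generator, which makes the URL construction easy to check without sending any requests. This is only a sketch: the function name `page_urls` and its parameters are assumptions introduced here, while the URL pattern is the one from the script.

```python
BASE = "https://api.douguo.net/recipe/v2/search/{offset}/{size}"


def page_urls(pages, size=20, template=BASE):
    """Yield the search URL for each page; the offset grows by `size` per page."""
    for page in range(pages):
        yield template.format(offset=page * size, size=size)


if __name__ == "__main__":
    for link in page_urls(4):
        print(link)
```

Keeping the page size as a parameter avoids hard-coding 20 in two places, so the step and the path segment cannot drift apart.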
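General workflow for apps that require login:
1. Scout the app.
2. Analyze the app's login flow.
3. Account and password, or phone SMS login.
4. Image captcha.
5. SMS verification code.
6. Analyze the login endpoint.
7. Endpoint parameters and encryption algorithm.
8. Forge the login request.
9. Obtain the logged-in state, perform authorized operations, and extend from there.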
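Steps 7 and 8 of the workflow above usually come down to reproducing whatever signature the app attaches to its login form. The sketch below shows one common shape of such a scheme, an MD5 digest over the sorted parameters plus a secret. It is purely illustrative: the field names `phone`, `code`, and `sign`, the secret, and the signing rule itself are all assumptions, and the real algorithm has to be recovered by analyzing the target app.

```python
import hashlib


def sign_params(params, secret):
    """Hypothetical signing rule: MD5 over sorted 'k=v' pairs joined by '&', plus a secret."""
    canonical = "&".join(f"{key}={params[key]}" for key in sorted(params))
    return hashlib.md5((canonical + secret).encode("utf-8")).hexdigest()


def build_login_payload(phone, code, secret):
    """Assemble a login form and attach its signature (field names are placeholders)."""
    payload = {"phone": phone, "code": code}
    payload["sign"] = sign_params(payload, secret)
    return payload


if __name__ == "__main__":
    # Placeholder values only; nothing here is sent over the network
    print(build_login_payload("13800000000", "123456", "placeholder-secret"))
```

Sorting the keys before hashing makes the signature independent of the order in which the form fields were added, which is why many signing schemes canonicalize parameters this way.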