Handling 302 redirects in a crawler - 走看看


I. Problem description:

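  While crawling Weibo, the request runs into a 302 redirect.

  For example, the requested URL is: https://weibo.com/2113535642?refer_flag=1001030103_ (marked 1 in the screenshot)

  and it redirects to: https://weibo.com/sgccjsdl?refer_flag=1001030103_&is_hot=1 (marked 2 in the screenshot)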

    [Screenshot: the two requests, marked 1 and 2]

  Requesting the URL marked 1 returns no body, but the response headers carry the URL marked 2.

    [Screenshot: response headers of the request marked 1]
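To make the behaviour above concrete without touching Weibo, here is a minimal sketch that reproduces it locally. It assumes nothing about Weibo's servers: `RedirectHandler`, `fetch_without_following` and `demo` are hypothetical names, and the toy server simply answers every GET with a 302, an empty body and a `Location` header. The standard-library `http.client` never follows redirects on its own, so the raw 302 response is visible.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Toy server (illustration only): every GET gets a 302 with an empty body."""

    def do_GET(self):
        self.send_response(302)
        # The redirect target travels in the Location header, not in the body.
        self.send_header("Location", "/sgccjsdl?refer_flag=1001030103_&is_hot=1")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_without_following(host, port, path):
    """Issue one GET and return (status, Location header, body); no redirect is followed."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location"), resp.read()
    finally:
        conn.close()


def demo():
    """Start the toy server on a free local port, fetch once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_without_following(
            "127.0.0.1", server.server_port, "/2113535642?refer_flag=1001030103_"
        )
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The returned tuple mirrors what the post describes: status 302, the target path in `Location`, and no body.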

The code that handles this is as follows:

　　
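    import requests

    session = requests.Session()
    # user_url and headers are defined elsewhere in the crawler.
    # First request: do not follow the redirect, so the 302 response can be inspected.
    result1 = session.get(url=str(user_url), headers=headers, verify=False, allow_redirects=False)
    result = result1.text
    if '<h1 class="username">' not in result:
        # The page was not returned directly; the target path is in the Location header.
        new_requests_url = "https://weibo.com" + result1.headers['location']
        result = session.get(url=str(new_requests_url), headers=headers, verify=False, allow_redirects=False).text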
  The key step is reading the redirect target from the response headers: new_requests_url = result1.headers['location']
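The post builds the new URL by prepending "https://weibo.com" to the `Location` value, which works when the header holds a path. As a hedged alternative, the sketch below resolves the header against the URL that was requested, so both relative and absolute `Location` values are handled. `redirect_target` and `REDIRECT_CODES` are hypothetical names introduced here, not part of the original code or of `requests`.

```python
from urllib.parse import urljoin

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(request_url, status_code, headers):
    """Return the absolute URL a redirect response points to, or None if there is none."""
    if status_code not in REDIRECT_CODES:
        return None
    # Case-insensitive lookup, so a plain dict works as well as a header mapping.
    location = next(
        (value for key, value in headers.items() if key.lower() == "location"),
        None,
    )
    if not location:
        return None
    # urljoin keeps an absolute Location as-is and resolves a relative one
    # against the URL that was requested.
    return urljoin(request_url, location)


if __name__ == "__main__":
    print(
        redirect_target(
            "https://weibo.com/2113535642?refer_flag=1001030103_",
            302,
            {"Location": "/sgccjsdl?refer_flag=1001030103_&is_hot=1"},
        )
    )
```

With the code from the post, a call would look like `redirect_target(str(user_url), result1.status_code, result1.headers)`; a `None` result means the first response was not a redirect and can be used directly.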

　　　　　　　　

Original article: https://www.cnblogs.com/xuchunlin/p/9688000.html
