  • Python 3: using the requests library to read a locally saved cookie file for login-free access

    1.  Reading the local cookie file saved by selenium to access Zhihu

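    This section reads the local cookie file saved in http://www.cnblogs.com/strivepy/p/9233389.html and uses it to access Zhihu's account settings page. The JSON file saved with selenium has the following format: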

    [{"domain": "www.zhihu.com", "expiry": 1527855266.402958, "httpOnly": false, "name": "tgw_l7_route", "path": "/", "secure": false, "value": "200d77f3369d188920b797ddf09ec8d1"},
     {"domain": ".zhihu.com", "expiry": 1622462366.40309, "httpOnly": false, "name": "d_c0", "path": "/", "secure": false, "value": "\"AFAkkY_hrg2PTvLVtweW-Ok8mRLKop4IJZY=|1527854371\""},
     {"domain": ".zhihu.com", "httpOnly": false, "name": "_xsrf", "path": "/", "secure": false, "value": "7da6b4e4-c77d-47a4-81fa-68b1262235c8"} ... (remaining entries omitted)]

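    The saved file contains a lot of information that is not needed here, such as path and secure. When reading the cookies back, only the name and value attributes of each cookie are required. The code lives in a module named zhihu.py: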

    # -*- coding: utf-8 -*-

    import json
    import os

    import requests
    from requests.cookies import RequestsCookieJar


    def parse_index():
        url = 'https://www.zhihu.com/settings/account'
        headers = {
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36'
        }
        cookies = getcookies_decode_to_dict()
        # cookies = getcookies_decode_to_cookiejar()
        if cookies is None:
            # No cookie file was found, so there is nothing to log in with
            return
        # The cookies parameter of requests.get() only accepts a dict or a CookieJar object
        response = requests.get(url=url, headers=headers, cookies=cookies)
        print(response.url)
        print(response.text)


    def getcookies_decode_to_dict():
        """Load the saved cookies as a plain {name: value} dict."""
        path = os.path.join(os.getcwd(), 'cookies', 'cookies.txt')
        # Check the file itself, not just the directory that contains it
        if not os.path.exists(path):
            print('Cookie file not found; run cookiesload.py first')
            return None
        cookies_dict = {}
        with open(path, 'r', encoding='utf-8') as f:
            cookies = json.load(f)
        for cookie in cookies:
            cookies_dict[cookie['name']] = cookie['value']
        return cookies_dict


    def getcookies_decode_to_cookiejar():
        """Load the saved cookies into a RequestsCookieJar (name and value only)."""
        path = os.path.join(os.getcwd(), 'cookies', 'cookies.txt')
        if not os.path.exists(path):
            print('Cookie file not found; run cookiesload.py first')
            return None
        cookiejar = RequestsCookieJar()
        with open(path, 'r', encoding='utf-8') as f:
            cookies = json.load(f)
        for cookie in cookies:
            cookiejar.set(cookie['name'], cookie['value'])
        return cookiejar


    if __name__ == '__main__':
        parse_index()
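
    The loaders in zhihu.py keep only name and value. If the domain and path scoping of each cookie matters, for example when one session talks to several hosts, the jar can carry those fields too. The following is a minimal sketch, not part of the original module: the function name cookiejar_from_selenium and the sample values are made up for illustration, and it relies on RequestsCookieJar.set() accepting domain, path and secure as keyword arguments.

```python
import requests
from requests.cookies import RequestsCookieJar


def cookiejar_from_selenium(cookies):
    """Build a RequestsCookieJar that keeps each cookie's domain, path and secure flag.

    Hypothetical helper: `cookies` is a list of dicts in the shape selenium saved.
    """
    jar = RequestsCookieJar()
    for cookie in cookies:
        jar.set(
            cookie['name'],
            cookie['value'],
            domain=cookie.get('domain', ''),
            path=cookie.get('path', '/'),
            secure=cookie.get('secure', False),
        )
    return jar


# Sample entries in the same shape as the saved file (values are made up)
sample = [
    {"domain": ".zhihu.com", "name": "_xsrf", "path": "/", "secure": False, "value": "example-token"},
    {"domain": "www.zhihu.com", "name": "tgw_l7_route", "path": "/", "secure": False, "value": "example-route"},
]

# A Session sends the stored cookies on every request made through it
session = requests.Session()
session.cookies.update(cookiejar_from_selenium(sample))
```

    With a session prepared this way, session.get(url, headers=headers) would send only the cookies whose domain and path match the requested URL.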

    The source returned by running zhihu.py shows that the Zhihu account settings page was fetched successfully.
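
    If the response turns out not to be the settings page, one thing worth checking is whether the saved cookies have expired. The sketch below assumes that the expiry field written by selenium is a Unix timestamp in seconds and that a cookie without expiry is a session cookie worth keeping. The helper name drop_expired and the sample values are illustrative only.

```python
import time


def drop_expired(cookies, now=None):
    """Return only the cookies whose 'expiry' is absent or still in the future."""
    now = time.time() if now is None else now
    return [c for c in cookies if 'expiry' not in c or c['expiry'] > now]


# Entries shaped like the saved file; values are placeholders
sample = [
    {"name": "tgw_l7_route", "value": "example-route", "expiry": 1527855266.402958},
    {"name": "d_c0", "value": "example-id", "expiry": 1622462366.40309},
    {"name": "_xsrf", "value": "example-token"},  # no expiry: session cookie
]

# Evaluate against a fixed point in time so the result is reproducible
fresh = drop_expired(sample, now=1600000000)
print([c['name'] for c in fresh])  # prints ['d_c0', '_xsrf']
```

    If the list comes back empty for the real file, the cookies need to be saved again with selenium before retrying.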

  • Original article: https://www.cnblogs.com/strivepy/p/9233437.html