  • Captcha handling

    # Download the captcha image to a local file
    import requests
    from lxml import etree
    import urllib.request
    
    s = requests.session()
    
    url = 'https://so.gushiwen.org/user/login.aspx?from=http://so.gushiwen.org/user/collect.aspx'
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36'
    }
    
    page_content = s.get(url,headers=headers).text
    
    tree = etree.HTML(page_content)
    img_url = tree.xpath('//img[@id="imgCode"]/@src')[0]
    img_url = 'https://so.gushiwen.org' + img_url
    
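    # urllib.request.urlretrieve(img_url, './code.png') would not send the session's cookies,
    # so the image is fetched through the same session that will submit the login form.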
    img_data = s.get(url=img_url,headers=headers).content
    with open('./code.png','wb') as fp:
        fp.write(img_data) 
    
    # Parse the token fields from the login form (an anti-scraping measure)
    key_1 = tree.xpath('//*[@id="__VIEWSTATE"]/@value')[0]
    key_2 = tree.xpath('//*[@id="__VIEWSTATEGENERATOR"]/@value')[0]
    
    # Step 2: the login URL below was found by capturing the browser's login request
    code = input('Enter the captcha shown in ./code.png: ')
    post_url = 'https://so.gushiwen.org/user/login.aspx?from=http%3a%2f%2fso.gushiwen.org%2fuser%2fcollect.aspx'
    data = {
        "__VIEWSTATE":key_1,
        "__VIEWSTATEGENERATOR":key_2,
        "from":"http://so.gushiwen.org/user/collect.aspx",
        "email":"www.zhangbowudi@qq.com",
        "pwd":"123456",
        "code":code,
        "denglu":"登录"
    }
    page_content = s.post(url=post_url,headers=headers,data=data).text
    
    with open('./second.html','w',encoding='utf-8') as fp:
        fp.write(page_content)
    print('over')
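
The token and captcha parsing in the listing above can also be tried offline. The following is a minimal sketch, not the original author's method: it uses the standard library's `html.parser` in place of lxml so that it runs without third-party packages. The names `LoginFormParser` and `parse_login_page`, the sample HTML, and the `/RandCode.ashx` image path are all made up for illustration; only the field names `__VIEWSTATE` and `__VIEWSTATEGENERATOR` and the `imgCode` id come from the listing.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LoginFormParser(HTMLParser):
    """Collect hidden <input> values and the captcha <img> src from a page."""

    def __init__(self, captcha_id="imgCode"):
        super().__init__()
        self.captcha_id = captcha_id
        self.hidden_fields = {}
        self.captcha_src = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes attributes as (name, value) pairs with lowercased names.
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.hidden_fields[name] = attrs.get("value") or ""
        elif tag == "img" and attrs.get("id") == self.captcha_id:
            self.captcha_src = attrs.get("src")


def parse_login_page(html, base_url):
    """Return (hidden form fields, absolute captcha URL or None)."""
    parser = LoginFormParser()
    parser.feed(html)
    captcha_url = None
    if parser.captcha_src:
        # urljoin resolves relative and root-relative paths against the page URL.
        captcha_url = urljoin(base_url, parser.captcha_src)
    return parser.hidden_fields, captcha_url


# Hypothetical sample markup; the real login page may differ.
SAMPLE_HTML = """
<form>
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="C93BE1AE" />
  <img id="imgCode" src="/RandCode.ashx" />
</form>
"""

fields, captcha_url = parse_login_page(
    SAMPLE_HTML, "https://so.gushiwen.org/user/login.aspx"
)
print(fields)
print(captcha_url)
```

Collecting every hidden input, rather than naming two of them, means the POST data would still include any extra token the site might add later. The returned dictionary could be merged into the `data` payload before the user-supplied fields are set.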
  • Original article: https://www.cnblogs.com/xujinjin18/p/9716146.html