  • Python crawler study notes (3): simulating a login

    1. Log in to Chaoxing MOOC: capture the login request in Chrome, mimic its headers, and extract the hidden form fields to build params.

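      The main obstacle is the captcha image URL. The page's JavaScript builds that URL dynamically from the timestamp returned by new Date().getTime(); the Python counterpart is time.time(). Note that getTime() returns milliseconds while time.time() returns seconds; the script here uses the integer number of seconds. It generates the captcha image URL, downloads the image to a local file, and the code shown in the image is then typed in by hand. The code is as follows: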

      

    #coding=utf-8
    import requests
    import time
    from bs4 import BeautifulSoup
    header={
             'Referer':'http://aust.fanya.chaoxing.com/portal',
             'Upgrade-Insecure-Requests':'1',
             'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
    }
    name=input("input name:")
    password=input("input password:")
    num=int(time.time()) # timestamp, truncated to an integer
    code_url='http://passport2.chaoxing.com/num/code/?'+str(num) # captcha image url
    session=requests.Session()
    r=session.get(code_url) # fetch the captcha in the same session that will post the form
    image=r.content
    with open('/home/zhanyunwu/code.jpg','wb') as f:
        f.write(image)
    numcode=input("input code:") # open the saved image and type the code by hand
    # parameters of the POST request
    params={
        'refer_0x001':'http%3A%2F%2Fi.mooc.chaoxing.com%2Fspace%2Findex.shtml',
        'pid':'1',
        'pidName':'',
        'fid':'12007',
        'fidName':'安徽理工大学',
        'allowJoin':'0',
        'isCheckNumCode':'1',
        'f':'0',
        'uname':name,
        'password':password,
        'numcode':numcode
    }
    url='http://passport2.chaoxing.com/login' # url the form is submitted to
    req=session.post(url,data=params,headers=header)
    courses=session.get('http://mooc12.chaoxing.com/visit/courses',cookies=req.cookies,headers=header) # visit another page with the cookies from the successful login
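
    The hidden form fields mentioned in step 1 can also be collected by a program instead of being copied by hand from the captured request. The following is only a minimal sketch: it uses the standard-library html.parser rather than BeautifulSoup so that it runs without extra packages, and HiddenInputParser, hidden_fields and SAMPLE_FORM are hypothetical names and markup made up for illustration, not the real Chaoxing login page.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attributes without a value arrive as None, hence the "or" fallbacks.
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden input fields found in the given HTML."""
    parser = HiddenInputParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical form markup, for illustration only.
SAMPLE_FORM = """
<form action="/login" method="post">
  <input type="hidden" name="pid" value="1">
  <input type="hidden" name="fid" value="12007">
  <input type="hidden" name="pidName" value="">
  <input type="text" name="uname">
  <input type="password" name="password">
</form>
"""

params = hidden_fields(SAMPLE_FORM)
# The visible fields are filled in by the user and added on top.
params.update({"uname": "user", "password": "secret"})
```

    In a real run the HTML would come from the text of a session.get() on the login page, so that the hidden values and the session cookies belong to the same visit.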
    
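
    If the request should look exactly like the one the browser sends, the timestamp can be built in milliseconds, as new Date().getTime() does. This is a sketch under an assumption: the source does not say whether the server inspects the number at all, and captcha_url is a hypothetical helper name.

```python
import time

CODE_URL = "http://passport2.chaoxing.com/num/code/?"


def captcha_url(now=None):
    """Return the captcha URL with a millisecond timestamp appended.

    `now` is a time in seconds (as returned by time.time()); it is a
    parameter only so the function can be checked with a fixed value.
    """
    if now is None:
        now = time.time()
    # time.time() is in seconds; JavaScript's getTime() is in milliseconds.
    return CODE_URL + str(int(now * 1000))
```

    Calling captcha_url() with no argument uses the current time, and the result can be passed to session.get() in place of code_url.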

     2. The browser is already logged in; reuse its saved cookie to log in to Douban.

     

    #coding=utf-8
    import requests
    session=requests.Session()
    cookie={}
    # Cookie header copied from the logged-in browser
    allcookie='ll="118190"; bid=c3kC6ui9q28; _pk_id.100001.8cb4=4c5ed6a80ede35ed.1471684466.1.1471684546.1471684466.; _pk_ses.100001.8cb4=*; __utma=30149280.794301906.1471684473.1471684473.1471684473.1; __utmb=30149280.2.9.1471684473; __utmc=30149280; __utmz=30149280.1471684473.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt=1; dbcl2="140658732:f1Vx65Uloqc"; ck=FGYf; push_noty_num=0; push_doumail_num=0; _vwo_uuid_v2=0B4AF16F37C54670B861F7D7A7C5B679|5b7205084917bf0bf6bd9380a8224a9d'
    for c in allcookie.split(";"):
        # strip the space after each ";" so keys do not start with a blank
        key,value=c.strip().split("=",1)
        cookie[key]=value
    s=session.get('http://www.douban.com/people/140658732/',cookies=cookie)
    print(s.text)
    text=s.content
    with open("/home/zhanyunwu/test.html","wb") as f1:
        f1.write(text)
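
    The cookie-splitting step can be pulled out into a small function and checked on its own, without touching the network. This is a minimal sketch: parse_cookie_header is a hypothetical helper name, and the sample string is a shortened version of the cookie above.

```python
def parse_cookie_header(raw):
    """Turn 'k1=v1; k2=v2' into a dict, trimming whitespace around each pair."""
    cookies = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        # Skip empty pieces (e.g. after a trailing ";") and pieces without "=".
        if not pair or "=" not in pair:
            continue
        # Split on the first "=" only, because values may contain "=" themselves.
        key, value = pair.split("=", 1)
        cookies[key.strip()] = value.strip()
    return cookies


sample = 'll="118190"; bid=c3kC6ui9q28; __utmz=30149280.1.utmcsr=(direct)|utmccn=(direct); ck=FGYf'
cookies = parse_cookie_header(sample)
```

    The resulting dict can be passed as cookies=cookies to session.get(), as in the script above.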
    

      

      

  • Original post: https://www.cnblogs.com/yunwuzhan/p/5791200.html