  • Crawler simulated login: cracking a slider CAPTCHA with no original image

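    Login target: 博客园 (cnblogs)

    CAPTCHA type: slider CAPTCHA with no original image

    Tools and modules: Python, Selenium

    Browser: Chrome

    General idea: older slider CAPTCHAs usually exposed the original image, so you could capture two different images with the Image module and compare their pixels to get the sliding distance. A CAPTCHA without the original image works on the same principle, with one extra step: recovering the original image. This can be done by injecting JavaScript through driver.execute_script() to change the element's display style so that the original image is shown. From there, the flow is the same as for a slider CAPTCHA that has an original image.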

    Detailed steps:

    Step 1: enter the username and password, then click the login button

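    from selenium import webdriver
    from selenium.webdriver.common.by import By
    # A headed Chrome browser makes the demo easier to follow; switch to headless once it works
    driver = webdriver.Chrome()
    # Open the cnblogs login page
    driver.get('https://account.cnblogs.com/signin')
    # Implicit wait of 3 seconds
    driver.implicitly_wait(3)
    # Locate the username and password inputs by ID
    # (find_element(By.ID, ...) is used because newer Selenium releases removed find_element_by_id)
    input_username = driver.find_element(By.ID, 'LoginName')
    input_password = driver.find_element(By.ID, 'Password')
    # Type the username and password (placeholders)
    input_username.send_keys('11111111111')
    input_password.send_keys('xxxxxxxxxx')
    # Locate the submit button
    submitBtn = driver.find_element(By.ID, 'submitBtn')
    # Click submit
    submitBtn.click()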

    [Figure: the page after submitting the login form]

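    Step 2: the image with the gap pops up; capture it

    Locate the element with XPath (looking it up by class name may raise an error, for reasons unknown). This position holds both the gap image and the original image, so both are captured in the same way.

    First, write a screenshot function: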

    from PIL import Image
    from selenium.webdriver.common.by import By
    def get_snap(driver):
        # Save a screenshot of the whole page, then open it as an Image
        driver.save_screenshot('snap.png')
        snap_obj = Image.open('snap.png')
        return snap_obj
    def get_image(driver):
        # Locate the canvas element with XPath
        img_element = driver.find_element(
            By.XPATH,
            '//div[@class="geetest_panel_next"]//canvas[@class="geetest_canvas_slice geetest_absolute"]')
        # Get the size and position of the image
        size = img_element.size
        location = img_element.location
        left = location['x']
        top = location['y']
        right = left + size['width']
        bottom = top + size['height']
        snap_obj = get_snap(driver)
        # Note: crop() takes a single tuple
        img_obj = snap_obj.crop((left, top, right, bottom))
        return img_obj

    The screenshot is cropped using the left, top, right, and bottom values obtained above.
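
    As a minimal sketch of how the crop box is derived, the helper below builds the (left, top, right, bottom) tuple from dictionaries shaped like Selenium's element.location and element.size. The name crop_box and the scale parameter are assumptions added for illustration and are not part of the original code. scale covers the case where the screenshot is larger than the page's CSS coordinates (for example on a high-DPI display), which may or may not apply to your setup.

```python
def crop_box(location, size, scale=1.0):
    """Return (left, top, right, bottom) in screenshot pixels.

    location and size are dicts shaped like Selenium's element.location
    and element.size. scale is a hypothetical device-pixel-ratio factor.
    """
    left = location['x'] * scale
    top = location['y'] * scale
    right = left + size['width'] * scale
    bottom = top + size['height'] * scale
    # Image.crop expects one 4-tuple; round to whole pixels
    return tuple(int(round(v)) for v in (left, top, right, bottom))


box = crop_box({'x': 100, 'y': 50}, {'width': 260, 'height': 160})
print(box)  # (100, 50, 360, 210)
```

    If the cropped image looks shifted or too small, passing a scale other than 1.0 is one thing to try.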

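    Step 3: show the original image with JavaScript

    Find this element and change display in its style; when the value is block, the image without the gap is shown:

    Now change the element's value in code: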

    driver.execute_script("var x=document.getElementsByClassName('geetest_canvas_fullbg geetest_fade geetest_absolute')[0];"
                              "x.style.display='block';"
                              "x.style.opacity=1"
                              )

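    In testing, opacity sometimes defaults to 0 and has to be set to 1 before the original image shows.

    Once the original image is shown, its position is the same as in step 2, so the same function is used to capture it.

    Step 4: compare the two images to get the sliding distance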

    none_img = get_image(driver)  # image with the gap
    driver.execute_script("var x=document.getElementsByClassName('geetest_canvas_fullbg geetest_fade geetest_absolute')[0];"
                          "x.style.display='block';"
                          "x.style.opacity=1"
                          )
    block_img = get_image(driver)  # original image

    Compute the distance the slider has to travel:

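    def get_distance(img1, img2):
        start_x = 60    # starting x
        threshold = 60  # per-channel difference threshold
        # Load the pixel access objects once instead of on every iteration
        pix1 = img1.load()
        pix2 = img2.load()
        for x in range(start_x, img1.size[0]):
            for y in range(img1.size[1]):
                rgb1 = pix1[x, y]
                rgb2 = pix2[x, y]
                res1 = abs(rgb1[0] - rgb2[0])
                res2 = abs(rgb1[1] - rgb2[1])
                res3 = abs(rgb1[2] - rgb2[2])
                if not (res1 < threshold and res2 < threshold and res3 < threshold):
                    return x - 7  # in testing, subtracting 7 improved the success rate
        # No differing pixel was found
        return None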

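    About the starting value:

    In a slider CAPTCHA the gap is always some distance away from the slider piece, so the x range occupied by the piece can be skipped. The piece measures roughly 60 pixels wide (including margin), hence start_x=60.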
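
    To make the column scan concrete, here is a small sketch that applies the same comparison to plain nested lists instead of PIL images, so it runs without a browser. The function name first_diff_column and the synthetic 4x100 images are assumptions for illustration. The -7 correction from the original function is left out so that the raw column index is visible.

```python
def first_diff_column(img1, img2, start_x=60, threshold=60):
    """Return the first x >= start_x where any pixel differs noticeably.

    img1 and img2 are lists of rows; each row is a list of (r, g, b) tuples.
    Returns None when no channel differs by at least `threshold`.
    """
    height = len(img1)
    width = len(img1[0])
    for x in range(start_x, width):
        for y in range(height):
            p1, p2 = img1[y][x], img2[y][x]
            if any(abs(a - b) >= threshold for a, b in zip(p1, p2)):
                return x
    return None


# Two 4x100 grey images; the second has a dark "gap" in columns 80-89
full = [[(200, 200, 200)] * 100 for _ in range(4)]
gap = [row[:] for row in full]
for y in range(4):
    for x in range(80, 90):
        gap[y][x] = (40, 40, 40)

print(first_diff_column(full, gap))  # 80
```

    Scanning column by column (x in the outer loop) is what makes the first hit the left edge of the gap.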

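    Step 5: split the total displacement into small segments that mimic human behavior

    A person typically accelerates first, then decelerates, and may overshoot.

    To look more human, this version includes a step back.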

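    def get_tracks(distance):
        # distance is the total from the previous step; 20 is the number of pixels to move back later
        distance += 20
        # v0 is the initial speed (2 here), s is the distance covered so far, t is the time step
        v0 = 2
        s = 0
        t = 0.4
        # mid is the point after which to decelerate
        mid = distance * 3 / 5
        # Holds the size of each forward step
        forward_tracks = []
        while s < distance:
            if s < mid:
                a = 2
            else:
                a = -3
            # High-school physics: displacement under constant acceleration
            v = v0
            tance = v * t + 0.5 * a * (t ** 2)
            tance = round(tance)
            s += tance
            v0 = v + a * t
            forward_tracks.append(tance)
        # The step back is 20 pixels, so it can be written by hand as long as it sums to 20
        back_tracks = [-1, -1, -1, -2, -2, -2, -3, -3, -2, -2, -1]  # 20
        return {"forward_tracks": forward_tracks, 'back_tracks': back_tracks}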
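
    As a quick sanity check of the totals, the standalone copy below reproduces the same loop and sums the steps. The parameter name overshoot and the sample distance of 100 are illustrative assumptions. This is a sketch for checking the arithmetic, not a change to the function above.

```python
def get_tracks(distance, overshoot=20):
    """Standalone copy of the step generator, used to check its totals."""
    target = distance + overshoot
    v, s, t = 2, 0, 0.4      # initial speed, distance covered, time step
    mid = target * 3 / 5     # decelerate once this point is passed
    forward = []
    while s < target:
        a = 2 if s < mid else -3
        step = round(v * t + 0.5 * a * t ** 2)
        s += step
        v += a * t
        forward.append(step)
    back = [-1, -1, -1, -2, -2, -2, -3, -3, -2, -2, -1]
    return forward, back


forward, back = get_tracks(100)
# The forward total is at least distance + 20; the back total is exactly -20
print(sum(forward), sum(back))
```

    Because the loop stops on the first step that reaches the target, the forward total can slightly exceed distance + 20, so the net movement can end a pixel or two past the requested distance.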

    Step 6: move by the computed distances

    from selenium.webdriver import ActionChains
    from selenium.webdriver.common.by import By
    import time
    # Get the slider button element
    geetest_slider_button = driver.find_element(By.CLASS_NAME, 'geetest_slider_button')
    # Get the distance
    distance = get_distance(block_img, none_img)
    # Get the list of steps
    tracks_dic = get_tracks(distance)
    # Click and hold the slider
    ActionChains(driver).click_and_hold(geetest_slider_button).perform()
    forward_tracks = tracks_dic['forward_tracks']
    back_tracks = tracks_dic['back_tracks']
    for forward_track in forward_tracks:
        ActionChains(driver).move_by_offset(xoffset=forward_track, yoffset=0).perform()
    # Pause briefly, as a person would
    time.sleep(0.2)
    # Use a separate loop variable so the back_tracks list is not overwritten
    for back_track in back_tracks:
        ActionChains(driver).move_by_offset(xoffset=back_track, yoffset=0).perform()
    print(forward_tracks)
    ActionChains(driver).move_by_offset(xoffset=-3, yoffset=0).perform()
    ActionChains(driver).move_by_offset(xoffset=3, yoffset=0).perform()
    time.sleep(0.3)
    # Release the mouse button
    ActionChains(driver).release().perform()

    Full code:

    from selenium import webdriver
    from selenium.webdriver import ActionChains
    from selenium.webdriver.common.by import By
    from PIL import Image
    import time
    driver = webdriver.Chrome()

    def get_snap(driver):
        # Save a screenshot of the whole page, then open it as an Image
        driver.save_screenshot('snap.png')
        snap_obj = Image.open('snap.png')
        return snap_obj
    def get_image(driver):
        img_element = driver.find_element(
            By.XPATH,
            '//div[@class="geetest_panel_next"]//canvas[@class="geetest_canvas_slice geetest_absolute"]')
        size = img_element.size
        location = img_element.location
        left = location['x']
        top = location['y']
        right = left + size['width']
        bottom = top + size['height']
        snap_obj = get_snap(driver)
        img_obj = snap_obj.crop((left, top, right, bottom))
        return img_obj
    def get_distance(img1, img2):
        start_x = 60
        threshold = 60  # per-channel difference threshold
        pix1 = img1.load()
        pix2 = img2.load()
        for x in range(start_x, img1.size[0]):
            for y in range(img1.size[1]):
                rgb1 = pix1[x, y]
                rgb2 = pix2[x, y]
                res1 = abs(rgb1[0] - rgb2[0])
                res2 = abs(rgb1[1] - rgb2[1])
                res3 = abs(rgb1[2] - rgb2[2])
                if not (res1 < threshold and res2 < threshold and res3 < threshold):
                    return x - 7
        return None
    def get_tracks(distance):
        distance += 20
        v0 = 2
        s = 0
        t = 0.4
        mid = distance * 3 / 5
        forward_tracks = []
        while s < distance:
            if s < mid:
                a = 2
            else:
                a = -3
            v = v0
            tance = v * t + 0.5 * a * (t ** 2)
            tance = round(tance)
            s += tance
            v0 = v + a * t
            forward_tracks.append(tance)
        back_tracks = [-1, -1, -1, -2, -2, -2, -3, -3, -2, -2, -1]  # 20
        return {"forward_tracks": forward_tracks, 'back_tracks': back_tracks}

    try:
        driver.get('https://account.cnblogs.com/signin')
        driver.implicitly_wait(3)
        input_username = driver.find_element(By.ID, 'LoginName')
        input_password = driver.find_element(By.ID, 'Password')
        # Placeholders: replace with your own account details
        input_username.send_keys('11111111111')
        input_password.send_keys('xxxxxxxxxx')
        submitBtn = driver.find_element(By.ID, 'submitBtn')
        submitBtn.click()
        time.sleep(2)  # wait for the CAPTCHA to load
        none_img = get_image(driver)  # image with the gap
        driver.execute_script("var x=document.getElementsByClassName('geetest_canvas_fullbg geetest_fade geetest_absolute')[0];"
                              "x.style.display='block';"
                              "x.style.opacity=1"
                              )
        block_img = get_image(driver)  # original image
        geetest_slider_button = driver.find_element(By.CLASS_NAME, 'geetest_slider_button')

        distance = get_distance(block_img, none_img)
        tracks_dic = get_tracks(distance)
        ActionChains(driver).click_and_hold(geetest_slider_button).perform()
        forward_tracks = tracks_dic['forward_tracks']
        back_tracks = tracks_dic['back_tracks']
        for forward_track in forward_tracks:
            ActionChains(driver).move_by_offset(xoffset=forward_track, yoffset=0).perform()
        time.sleep(0.2)
        for back_track in back_tracks:
            ActionChains(driver).move_by_offset(xoffset=back_track, yoffset=0).perform()
        print(forward_tracks)
        ActionChains(driver).move_by_offset(xoffset=-3, yoffset=0).perform()
        ActionChains(driver).move_by_offset(xoffset=3, yoffset=0).perform()
        time.sleep(0.3)
        ActionChains(driver).release().perform()

        time.sleep(60)
    finally:
        # quit() ends the whole browser session, not just the current window
        driver.quit()
  • Original article: https://www.cnblogs.com/98WDJ/p/11050559.html