  • Batch-capturing Cacti traffic graphs with Selenium in Python 3

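      Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as a real user would operate it. Supported browsers include IE (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google Chrome, and Opera. Anyone learning the basics of web scraping with Python will run into the Selenium framework sooner or later.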

      1. First, install the Selenium module. This assumes you already have Python 3 and a Python editor such as PyCharm, IDLE, or Visual Studio Code. There are many other editors; this link gives an overview: https://baijiahao.baidu.com/s?id=1620388483830154843&wfr=spider&for=pc. I use PyCharm.

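      2. Because Selenium drives a real browser, you first need to download the driver that matches your browser. Most people use Chrome, Firefox, or IE; the corresponding drivers are listed here: https://www.cnblogs.com/momolei/p/10118526.html. Note that each browser version requires a matching version of the driver executable (xxx.exe). This is important. Put the downloaded xxx.exe in the Python 3 installation directory.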

      Once the steps above are done, install Selenium from a CMD window: pip install selenium.
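      Before moving on, a quick check that both pieces are in place can save some debugging. The sketch below is not part of the original setup: `check_environment` is a hypothetical helper name, and it assumes the driver executable is called `chromedriver` and can be found on PATH (which is only the case if the directory you put it in, such as the Python 3 installation directory, is on PATH).

```python
import importlib.util
import shutil


def check_environment(driver_name="chromedriver"):
    """Report whether the selenium package and a browser driver can be found."""
    return {
        # find_spec returns None when the package is not installed
        "selenium_installed": importlib.util.find_spec("selenium") is not None,
        # shutil.which returns the full path of the executable, or None if it is not on PATH
        "driver_path": shutil.which(driver_name),
    }


status = check_environment()
print(status)
```

      If `driver_path` is None, you can still pass the full path of the driver to Selenium explicitly, as the examples in this post do.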

      3. Next, check from your Python editor that Selenium can actually open the browser. For example:

    from selenium import webdriver
    
    # Point Selenium at chromedriver
    browser = webdriver.Chrome(r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe")
    # Set the page-load timeout (seconds)
    browser.set_page_load_timeout(10)
    # Open the Baidu home page
    browser.get("https://www.baidu.com")
    print(browser.page_source)

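      If Baidu opens and the page source is printed, you are halfway there. With the Selenium module installed and working, you can move on to capturing the Cacti traffic graphs.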

      4. Now for the main code.

      

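    from selenium import webdriver
    import datetime
    
    # This code uses the find_element_by_* helpers, which were removed in Selenium 4.3;
    # on newer versions use driver.find_element(By.NAME, ...) and similar instead.
    driver = webdriver.Chrome(r'D:\python3.7\chromedriver.exe')
    # Replace xxx with the address of your Cacti server
    driver.get('http://xxx/graph_view.php')
    name = driver.find_element_by_name("login_username")
    passwd = driver.find_element_by_name("login_password")
    name.send_keys('your_username')  # Cacti login account
    passwd.send_keys('your_password')  # Cacti login password
    # "登录" is the label of the login button on a Chinese-language Cacti page
    submit = driver.find_element_by_xpath('//td/input[@value="登录"]')
    submit.click()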
    for i in range(6,-1,-1):
        if i == 6:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days = i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list1= [format_otherStyleTime1,format_otherStyleTime2,format_otherStyleTime3,format_otherStyleTime4]
        elif i==5:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list2=[format_otherStyleTime1,format_otherStyleTime2,format_otherStyleTime3,format_otherStyleTime4]
        elif i==4:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list3 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
        elif i == 3:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list4 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
        elif i == 2:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list5 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
        elif i == 1:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list6 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
        elif i == 0:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i-1))
            otherStyleTime1 = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime5 = "%s 06:00:00" % otherStyleTime1.split()[0]
    
            list7 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4,format_otherStyleTime5]
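            # Avoid naming this "list", which would shadow the built-in type
            time_points = list1 + list2 + list3 + list4 + list5 + list6 + list7
        else:
            break
    # Each pair of adjacent time points is one capture window: 29 points give 28 graphs
    for i in range(len(time_points) - 1):
        driver.find_element_by_name("date1").clear()  # clear the previous start time
        driver.find_element_by_name("date2").clear()  # clear the previous end time
        driver.find_element_by_name("date1").send_keys(time_points[i])
        driver.find_element_by_name("date2").send_keys(time_points[i + 1])
        driver.find_element_by_name("button_refresh_x").click()
        # Click the graph image, then screenshot the page
        driver.find_element_by_xpath(".//tbody/tr[4]/td//table/tbody/tr/td[2]/a/img").click()
        # save_screenshot writes PNG data, so name the file .png
        driver.save_screenshot('%d.png' % i)
        # Click the link used to return before the next time window
        driver.find_element_by_xpath('.//tbody/tr/td//a[2]').click()
    # Close the browser only after every window has been captured
    driver.quit()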

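      That is the whole script. After it runs, the traffic graphs you wanted are in the same directory as the .py file. The time windows I capture follow the requirements of my own work; adjust them to whatever periods you need.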

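      I think the date for loop in the middle could be simpler, but I haven't yet figured out how to optimize it. I'll update this post if I do. If you have a better idea, feel free to message me.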
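      One possible simplification, as a minimal sketch: the seven branches differ only in the day offset, so a nested loop over days and clock hours can build the same 29 time points. The function name `build_time_points` and its parameters (`today`, `days_back`, `hours`) are assumptions introduced here for illustration, not part of the original script.

```python
import datetime


def build_time_points(today=None, days_back=6, hours=(6, 10, 18, 22)):
    """Return 'YYYY-MM-DD HH:00:00' strings from days_back days ago through today,
    plus the first hour of tomorrow to close the last window."""
    if today is None:
        today = datetime.date.today()
    points = []
    # Oldest day first, matching the order of list1 ... list7 in the script
    for offset in range(days_back, -1, -1):
        day = today - datetime.timedelta(days=offset)
        for hour in hours:
            points.append("%s %02d:00:00" % (day.strftime("%Y-%m-%d"), hour))
    # The final window runs from today's last hour to tomorrow's first hour
    tomorrow = today + datetime.timedelta(days=1)
    points.append("%s %02d:00:00" % (tomorrow.strftime("%Y-%m-%d"), hours[0]))
    return points


time_points = build_time_points()
# Adjacent pairs are the (start, end) values typed into date1 and date2
windows = list(zip(time_points, time_points[1:]))
```

      Changing the capture periods then only means passing a different `hours` tuple or `days_back` value, and looping over `windows` removes the need for a hard-coded range.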

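      If you repost this article, please include a link to the original. Thanks!

    If you have any questions, please leave a comment. Thanks!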
  • Original article: https://www.cnblogs.com/yunsi/p/11727842.html