  • Batch-capturing Cacti traffic graphs with Selenium in Python 3

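      Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as a real user would operate it. Supported browsers include IE (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google Chrome, and Opera. Anyone learning the basics of web scraping in Python will run into the Selenium framework sooner or later.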

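      Step 1. First, of course, get the Selenium module. This assumes you have already installed Python 3 and a Python editor such as PyCharm, IDLE, or Visual Studio Code. There are many other Python editors; see https://baijiahao.baidu.com/s?id=1620388483830154843&wfr=spider&for=pc for details. I use PyCharm.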

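      Step 2. Because Selenium drives a real browser, you first need to download the driver that matches your browser. Most people use Chrome, Firefox, or IE; the corresponding drivers are listed at https://www.cnblogs.com/momolei/p/10118526.html. Note that each browser version requires a matching version of its driver executable (xxx.exe), and this matters a lot. Put the downloaded xxx.exe in your Python 3 directory.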
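      The version-matching rule can be checked mechanically. The sketch below assumes the Chrome convention that the driver's major version (the number before the first dot) should equal the browser's. The function names and the sample version strings are made up for illustration, and other browsers may follow different rules.

```python
def major_version(version: str) -> int:
    """Return the leading numeric component of a dotted version string."""
    return int(version.strip().split(".")[0])


def majors_match(browser_version: str, driver_version: str) -> bool:
    """True when browser and driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Illustrative version strings, not taken from the original post.
print(majors_match("78.0.3904.70", "78.0.3904.105"))  # True
print(majors_match("78.0.3904.70", "77.0.3865.40"))   # False
```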

      Once the steps above are done, install Selenium from a CMD window: pip install selenium.
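      Before going further, it can help to confirm that both pieces are actually visible to Python. This is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper name, and it assumes the driver executable sits in a directory listed on PATH (which the Python 3 directory often is on Windows, but not always).

```python
import importlib.util
import shutil


def check_environment(driver_name: str = "chromedriver") -> dict:
    """Report whether the selenium package and a driver executable are visible."""
    return {
        # find_spec returns None when the package cannot be imported.
        "selenium_installed": importlib.util.find_spec("selenium") is not None,
        # shutil.which returns the full path, or None if nothing is found on PATH.
        "driver_path": shutil.which(driver_name),
    }


print(check_environment())
```

      If `driver_path` comes back as `None`, either add the driver's folder to PATH or pass the full path to the driver explicitly, as the examples in this post do.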

      Step 3. Next, check from your Python editor that Selenium can open the browser. For example:

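    from selenium import webdriver
    
    # Point Selenium at chromedriver (Selenium 3 style: driver path as the first argument)
    browser = webdriver.Chrome(r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe")
    # Set the page-load timeout, in seconds
    browser.set_page_load_timeout(10)
    # Open the Baidu home page
    browser.get("https://www.baidu.com")
    print(browser.page_source)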

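      If Baidu opens and the page content is printed, you are 50% of the way there. With the Selenium module installed and working, you can move on to capturing the Cacti traffic graphs.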
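      Newer Selenium releases (4.x) pass the driver path through a `Service` object rather than as the first argument. The sketch below shows the same smoke test in that style, assuming Selenium 4 is installed. `fetch_page_source` is a hypothetical helper name and the driver path in the example call is a placeholder.

```python
def fetch_page_source(url: str, driver_path: str, timeout: int = 10) -> str:
    """Open url in Chrome via Selenium 4 and return the page HTML."""
    # Imported here so that merely defining the function does not require selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    browser = webdriver.Chrome(service=Service(executable_path=driver_path))
    try:
        browser.set_page_load_timeout(timeout)
        browser.get(url)
        return browser.page_source
    finally:
        # Always shut the browser down, even if the page load times out.
        browser.quit()


# Example call (the driver path is a placeholder; use your own):
# print(fetch_page_source("https://www.baidu.com", r"C:\path\to\chromedriver.exe"))
```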

      Step 4. Now for the main code.

      

    from selenium import webdriver
    import datetime
    
    
    # Start Chrome through chromedriver (Selenium 3 style call)
    driver = webdriver.Chrome(r'D:\python3.7\chromedriver.exe')
    # Replace with the address of your Cacti server, e.g. http://xxx/graph_view.php
    driver.get('http://xxx/graph_view.php')
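    # Fill in the Cacti login form (find_element_by_* is the Selenium 3 API)
    name = driver.find_element_by_name("login_username")
    passwd = driver.find_element_by_name("login_password")
    name.send_keys('your_username')
    passwd.send_keys('your_password')
    # The button label "登录" ("Log in") comes from the Chinese-language Cacti page
    submit = driver.find_element_by_xpath('//td/input[@value="登录"]')
    submit.click()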
    # Walk from 6 days ago (i=6) to today (i=0), building four time points per day
    for i in range(6,-1,-1):
        if i == 6:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days = i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list1= [format_otherStyleTime1,format_otherStyleTime2,format_otherStyleTime3,format_otherStyleTime4]
        elif i==5:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list2=[format_otherStyleTime1,format_otherStyleTime2,format_otherStyleTime3,format_otherStyleTime4]
        elif i==4:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list3 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
        elif i == 3:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list4 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
        elif i == 2:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list5 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
        elif i == 1:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            list6 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4]
        elif i == 0:
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i))
            otherStyleTime = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime1 = "%s 06:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime2 = "%s 10:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime3 = "%s 18:00:00" % otherStyleTime.split()[0]
            format_otherStyleTime4 = "%s 22:00:00" % otherStyleTime.split()[0]
            # i-1 is -1 here, so this is tomorrow; 06:00 tomorrow closes the last window
            threeDayAgo = (datetime.datetime.now() - datetime.timedelta(days=i-1))
            otherStyleTime1 = threeDayAgo.strftime("%Y-%m-%d %H:%M:%S")
            format_otherStyleTime5 = "%s 06:00:00" % otherStyleTime1.split()[0]
    
            list7 = [format_otherStyleTime1, format_otherStyleTime2, format_otherStyleTime3, format_otherStyleTime4,format_otherStyleTime5]
            list = (list1+list2+list3+list4+list5+list6+list7)
        else:
            break
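    # Each pair of neighbouring time points is one window, so stop one short of the end
    for i in range(len(list) - 1):
        driver.find_element_by_name("date1").clear()  # clear() empties the input field
        driver.find_element_by_name("date2").clear()
        driver.find_element_by_name("date1").send_keys(list[i])
        driver.find_element_by_name("date2").send_keys(list[i+1])
        driver.find_element_by_name("button_refresh_x").click()
        # Click the graph image (this XPath depends on the page layout)
        driver.find_element_by_xpath(".//tbody/tr[4]/td//table/tbody/tr/td[2]/a/img").click()
        # save_screenshot writes PNG data, so use a .png file name
        picture_name = '%d.png' % i
        driver.save_screenshot(picture_name)
        # Click the second link before the next round (layout-dependent as well)
        driver.find_element_by_xpath('.//tbody/tr/td//a[2]').click()
    # Close the browser only after every window has been captured
    driver.close()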

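      That's it, all in one go: the folder containing this .py file now holds the traffic graphs you wanted. The time windows I capture follow the requirements of my job; adjust them to whatever time ranges you need.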
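      The script feeds each pair of neighbouring time points into the date1 and date2 fields, so N time points yield N-1 graphs. The sketch below shows only that pairing step, without a browser; `capture_windows` is a hypothetical helper name and the sample dates are illustrative.

```python
def capture_windows(time_points):
    """Pair each time point with the next one: (index, start, end)."""
    return [
        (i, start, end)
        for i, (start, end) in enumerate(zip(time_points, time_points[1:]))
    ]


# Illustrative time points for a single day.
sample = [
    "2019-10-23 06:00:00",
    "2019-10-23 10:00:00",
    "2019-10-23 18:00:00",
    "2019-10-23 22:00:00",
]
for index, start, end in capture_windows(sample):
    print("window %d: %s -> %s" % (index, start, end))
```

      Four time points give three windows here. Changing the capture schedule then comes down to changing the list of time points.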

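      I suspect the date for-loop in the middle could be simpler, but I haven't yet worked out how to tidy it up. I'll update this post if I do. If you have a better idea, feel free to message me.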
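      One possible simplification, offered as a sketch rather than the author's code: a nested loop over day offsets and hours produces the same 29 time points (four per day for seven days, plus 06:00 on the following day). The names `HOURS` and `build_time_points` are hypothetical, and the date in the example is fixed only so the output is reproducible.

```python
import datetime

HOURS = (6, 10, 18, 22)  # the same four daily slots the script uses


def build_time_points(today, days_back=6, hours=HOURS):
    """Return 'YYYY-MM-DD HH:00:00' strings from days_back days ago through
    today, plus the first slot of the following day as the closing boundary."""
    points = []
    for offset in range(days_back, -1, -1):
        day = today - datetime.timedelta(days=offset)
        for hour in hours:
            points.append("%s %02d:00:00" % (day.strftime("%Y-%m-%d"), hour))
    next_day = today + datetime.timedelta(days=1)
    points.append("%s %02d:00:00" % (next_day.strftime("%Y-%m-%d"), hours[0]))
    return points


points = build_time_points(datetime.date(2019, 10, 23))
print(len(points))   # 29
print(points[0])     # 2019-10-17 06:00:00
print(points[-1])    # 2019-10-24 06:00:00
```

      In a real run, `datetime.date.today()` would replace the fixed date, and the returned list could stand in for the one built by the long if/elif chain.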

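      If you repost this article, please link back to the original. Thanks!

    If you have any questions, please leave a comment. Thanks!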
  • Original article: https://www.cnblogs.com/yunsi/p/11727842.html