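With something like a mandatory daily report, every policy from above is bound to meet a workaround from below. Mine was a Python program that logs in and fills out the form automatically. For well-known reasons I am leaving out the specific institution and web page and sharing only the code.
First, install the selenium package. You also need to download a WebDriver binary; I used the one for Chrome (ChromeDriver).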
https://chromedriver.chromium.org/downloads
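Before running anything, it can help to confirm that both prerequisites are in place. The following is a minimal sketch of such a check using only the standard library; the function name `missing_prerequisites` and the default path `./chromedriver` (the path the script uses) are illustrative choices, not part of Selenium.

```python
import importlib.util
import os


def missing_prerequisites(driver_path="./chromedriver", module="selenium"):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    # find_spec returns None when a top-level package cannot be found.
    if importlib.util.find_spec(module) is None:
        problems.append(f"Python package '{module}' is not installed")
    # Only checks that a file exists at the path, not that it is a working driver.
    if not os.path.isfile(driver_path):
        problems.append(f"no driver binary found at {driver_path}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("Missing:", problem)
```

The check deliberately stops at "is there a file"; whether the ChromeDriver version matches the installed Chrome still has to be verified by hand.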
Straight to the code:
from selenium import webdriver
import time

from selenium.common.exceptions import NoSuchElementException
# Written for the Selenium 3 API; the find_element_by_* helpers were removed in Selenium 4.3.
driver = webdriver.Chrome('./chromedriver')  # path to the downloaded ChromeDriver binary
driver.get("http://XXXXXXX.XXXX.XXX.cn")

account = driver.find_element_by_id("username")  # login form: account field
account.send_keys("1XXXXXXX")

pwd = driver.find_element_by_id("password")
pwd.send_keys("0XXXXXXXXX")

login = driver.find_element_by_id("login-submit")
login.click()
time.sleep(3)  # give the post-login page time to load
try:  # is the new-message popup showing?
    messageNotification = driver.find_element_by_id("layui-layer1")
except NoSuchElementException:
    print("No new messages")
else:
    confirm = driver.find_element_by_class_name("layui-layer-btn0")
    confirm.click()
    try:
        while True:  # click the parent of each red-styled entry, then the return button
            driver.find_element_by_css_selector('[style="color:red;"]').find_element_by_xpath('..').click()
            rtn = driver.find_element_by_css_selector('[role="button"]')
            rtn.click()
    except NoSuchElementException:
        print("No unread message found!")


try:
    while True:  # one pass per report that is still unfilled
        x = driver.find_element_by_id("fineui_1")
        x.click()
        reportHis = driver.find_element_by_id("lnkReportHistory")
        reportHis.click()

        toWrite = driver.find_element_by_css_selector('[href^="/DayReport.aspx"]')
        toWrite.click()
        checkBox = driver.find_element_by_id("p1_ChengNuo-inputEl-icon")
        checkBox.click()
        temperature = driver.find_element_by_id("p1_TiWen-inputEl")
        temperature.send_keys("36")
        submit = driver.find_element_by_id("p1_ctl00_btnSubmit")
        submit.click()
        subConf = driver.find_element_by_id("fineui_68")
        subConf.click()
        time.sleep(3)  # wait for the second confirmation dialog
        subConfConf = driver.find_element_by_id("fineui_73")
        subConfConf.click()
except NoSuchElementException:  # no unfilled report link left
    print("All reported. Exiting...")

driver.quit()  # close the browser and shut down the chromedriver process
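In broad strokes: lines 9-17 log in, lines 18-31 check the message center for unread messages and open all of them automatically, and lines 34-55 enter the daily report and fill in every form that has not been submitted yet.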
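The script pauses with fixed time.sleep(3) calls after logging in and after submitting. A fixed delay is either longer than necessary or, on a slow day, too short. One possible alternative is to poll until the element actually shows up. The sketch below is library-agnostic: `wait_for` and its parameters are hypothetical names, not part of Selenium (Selenium ships its own WebDriverWait class in selenium.webdriver.support.ui that serves the same purpose).

```python
import time


def wait_for(locate, timeout=10.0, poll=0.5, ignored=(Exception,)):
    """Call locate() repeatedly until it returns a truthy value.

    Exceptions listed in `ignored` are treated as "not ready yet".
    Raises TimeoutError if nothing truthy is returned within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = locate()
            if result:
                return result
        except ignored:
            pass  # element not there yet; try again after a short pause
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Possible use in the script, in place of time.sleep(3) after login.click():
#   x = wait_for(lambda: driver.find_element_by_id("fineui_1"),
#                ignored=(NoSuchElementException,))
```

Whether "fineui_1" is the right element to wait for right after login is an assumption about the page; any element that only exists once the next page has loaded would do.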
A few technical points worth noting:
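- Sometimes the code logic is fine, yet an error like stale element reference: element is not attached to the page document appears. It means the element WebDriver located earlier is no longer part of the current page, most likely because the script acted on the page before it had finished loading or re-rendering. Pausing with time.sleep() to let the page settle usually fixes it.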
- find_element_by_* raises NoSuchElementException when no matching element is found. find_elements_by_* does not raise; it returns an empty list instead.
- Locating an element's parent node: find_element_by_xpath('..')
- Locating an element by attribute with a CSS selector: driver.find_element_by_css_selector('[style="color:red;"]')
- Fuzzy-matching an attribute value with regex-like substring operators: find_element_by_css_selector('[href^="/DayReport.aspx"]')
Reference: https://www.w3.org/TR/selectors/#attribute-substrings
6.2. Substring matching attribute selectors
Three additional attribute selectors are provided for matching substrings in the value of an attribute:
- [att^=val]: Represents an element with the att attribute whose value begins with the prefix "val". If "val" is the empty string then the selector does not represent anything.
- [att$=val]: Represents an element with the att attribute whose value ends with the suffix "val". If "val" is the empty string then the selector does not represent anything.
- [att*=val]: Represents an element with the att attribute whose value contains at least one instance of the substring "val". If "val" is the empty string then the selector does not represent anything.
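To make the three operators concrete, here is a small browser-free sketch that mimics the matching rules quoted above on plain strings. `attr_matches` is a hypothetical helper written for illustration, and the href values are made up; a real browser applies these rules itself when it evaluates the selector.

```python
def attr_matches(value, op, val):
    """Mimic CSS substring attribute selectors on a single attribute value.

    op is one of "^=", "$=", "*=". As in the spec text quoted above,
    an empty val never matches anything.
    """
    if val == "":
        return False
    if op == "^=":
        return value.startswith(val)
    if op == "$=":
        return value.endswith(val)
    if op == "*=":
        return val in value
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical href values, made up for illustration.
hrefs = [
    "/DayReport.aspx?day=2020-03-01",
    "/ReportHistory.aspx",
    "/help/DayReport.aspx",
]

starts = [h for h in hrefs if attr_matches(h, "^=", "/DayReport.aspx")]
contains = [h for h in hrefs if attr_matches(h, "*=", "DayReport")]
ends = [h for h in hrefs if attr_matches(h, "$=", ".aspx")]
print(starts, contains, ends, sep="\n")
```

The prefix form is the one the script relies on: it picks up the report link regardless of whatever query string follows "/DayReport.aspx".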