  • chromedriver: full-screen mode and pagination error

    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException, StaleElementReferenceException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from pyquery import PyQuery as pq
    browser=webdriver.Chrome()
    
    def search():
        try:
            browser.get('https://www.jd.com/')
            input=WebDriverWait(browser,10).until(
                EC.presence_of_element_located((By.CSS_SELECTOR,'#key'))
            )
            submit=WebDriverWait(browser,10).until(
                EC.element_to_be_clickable((By.CSS_SELECTOR,'#search > div > div.form > button > i'))
            )
            input.send_keys("佩奇")
            submit.click()
            total_pages=WebDriverWait(browser,10).until(
                EC.presence_of_element_located((By.CSS_SELECTOR,'#J_bottomPage > span.p-skip > em:nth-child(1) > b'))
            )
            get_product_media()
            pages=int(total_pages.text)
            return pages
        except TimeoutException:
            # Retry on timeout and return the result, so main() receives the page count.
            return search()
    
    def search_page(number):
        try:
            input = WebDriverWait(browser, 20).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, '#J_bottomPage > span.p-skip > input'))
            )
            submit = WebDriverWait(browser, 20).until(
                EC.element_to_be_clickable((By.CSS_SELECTOR, '#J_bottomPage > span.p-skip > a'))
            )
            input.clear()
            input.send_keys(number)
            submit.click()
            get_product_media()
            # WebDriverWait(browser, 10).until(
            #     EC.text_to_be_present_in_element((By.CSS_SELECTOR,'#J_bottomPage > span.p-num > a.curr'),str(number))
            # )
        except StaleElementReferenceException:
            search_page(number)
    
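    def get_product_media():
        # Wait until at least one product image container is present.
        WebDriverWait(browser, 10).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, '#J_goodsList .gl-item .p-img'))
        )
        # Renaming the xmlns attribute (see the note after the code) lets pyquery's selectors match.
        html = browser.page_source.replace('xmlns', 'another_attr')
        doc = pq(html)
        items = doc('#J_goodsList .gl-i-wrap').items()
        for item in items:
            product = {
                # .p-img is a wrapper; the src attribute is assumed to sit on the img inside it.
                'image': item.find('.p-img img').attr('src'),
                # 'price': item.find('.p-price').text()
                # 'image': item.find('.p-img a img').attr('data-lazy-img')
            }
            print(product)
            # print(item)

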
    def main():
        pages = search()
        print(type(pages))
        # Page 1 is scraped inside search(), so continue from page 2.
        for i in range(2, pages + 1):
            search_page(i)


    if __name__ == '__main__':
        main()

    When the script runs, if the Chrome window that opens is not in full-screen mode, the pagination step fails to run.
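One possible workaround, assuming the failure comes from the page-skip controls being laid out differently or off-screen in a small window, is to enlarge the window before searching. The sketch below uses Selenium's `maximize_window()` and `set_window_size()`; the helper name `prepare_window` and the fallback size are hypothetical.

```python
def prepare_window(driver, width=1920, height=1080):
    """Enlarge the browser window before interacting with the page.

    `driver` is expected to be a Selenium WebDriver instance,
    e.g. the object returned by webdriver.Chrome().
    """
    try:
        driver.maximize_window()
    except Exception:
        # Assumption: some environments (such as headless runs) cannot maximize,
        # so fall back to an explicit, reasonably large window size.
        driver.set_window_size(width, height)
    return driver


# Usage with the script above (requires selenium and chromedriver):
#     browser = prepare_window(webdriver.Chrome())
```

Whether this fully removes the pagination failure on the JD page has not been verified; it only makes the window size deterministic.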

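    Also: I could not get the correct src parsed until I read https://www.cnblogs.com/airnew/p/10101698.html and changed the line to html = browser.page_source.replace('xmlns', 'another_attr'); after that, the src parsed correctly. What does replace('xmlns', 'another_attr') actually mean? The original author says xmlns has to be replaced, and I found that replacing it with 'an' works as well.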
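A likely explanation, offered as an assumption rather than something stated in either post: when the page source is well-formed XHTML, pyquery may parse it as XML, and the xmlns declaration then puts every tag into a namespace that plain selectors do not match. Renaming the attribute (to 'another_attr', 'an', or anything else) turns it into an ordinary attribute, so the tags are plain again. The sketch below reproduces the effect with the standard library's `xml.etree.ElementTree` instead of pyquery; the sample markup is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal XHTML fragment standing in for browser.page_source.
xhtml = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div class="p-img"><img src="a.jpg"/></div>'
    '</body></html>'
)

# With the default namespace declared, every tag name carries the namespace,
# so a search for the bare tag name "img" finds nothing.
root = ET.fromstring(xhtml)
namespaced_hits = root.findall('.//img')

# After renaming xmlns, the declaration is just a normal attribute
# and the bare tag name matches again.
plain_root = ET.fromstring(xhtml.replace('xmlns', 'another_attr'))
plain_hits = plain_root.findall('.//img')

print(root.tag)        # tag name includes the namespace URI
print(plain_root.tag)  # plain tag name
print(len(namespaced_hits), len(plain_hits))
```

If this is indeed the cause, passing `parser='html'` to `PyQuery` or calling `doc.remove_namespaces()` may be cleaner than string replacement; neither has been verified here against the JD page.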

    CHOOSING A SPECIFIC MATCH

    CSS selectors in Selenium allow us to navigate lists with more finesse than the above methods. If we have a ul and we want to select its fourth li element without regard to any other elements, we should use nth-child or nth-of-type.

    <ul id="recordlist">
      <li>Cat</li>
      <li>Dog</li>
      <li>Car</li>
      <li>Goat</li>
    </ul>
    

    If we want to select the fourth li element (Goat) in this list, we can use the nth-of-type, which will find the fourth li in the list.

    CSS: #recordlist li:nth-of-type(4)
    

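    On the other hand, if we want to get the fourth child only if it is a li element, we can use a filtered nth-child. Because every child of this list is a li, it also selects (Goat) in this case; the two selectors differ only when other element types are mixed in among the children.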

    CSS: #recordlist li:nth-child(4)
    

    Note that if you don't specify a child type for nth-child, it selects the fourth child without regard to type. This may be useful when testing CSS layout in Selenium.

    CSS: #recordlist *:nth-child(4)
    Source: https://saucelabs.com/resources/articles/selenium-tips-css-selectors
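The difference between the two selectors only shows up when a parent has children of mixed types. The following is a small model of the selector semantics written with the standard library, not Selenium's or the browser's implementation; the helper names and the `mixed` list with an extra p element are hypothetical.

```python
import xml.etree.ElementTree as ET


def nth_of_type(parent, tag, n):
    """Model of `tag:nth-of-type(n)`: the n-th child among children with that tag."""
    same_type = [child for child in parent if child.tag == tag]
    return same_type[n - 1] if 0 < n <= len(same_type) else None


def nth_child(parent, tag, n):
    """Model of `tag:nth-child(n)`: the n-th child overall, only if it has that tag."""
    children = list(parent)
    if 0 < n <= len(children) and children[n - 1].tag == tag:
        return children[n - 1]
    return None


# The list from the excerpt: every child is a li.
original = ET.fromstring(
    '<ul><li>Cat</li><li>Dog</li><li>Car</li><li>Goat</li></ul>'
)
# Hypothetical variant with a non-li child in second position.
mixed = ET.fromstring(
    '<ul><li>Cat</li><p>note</p><li>Dog</li><li>Car</li><li>Goat</li></ul>'
)

for name, ul in (('original', original), ('mixed', mixed)):
    by_type = nth_of_type(ul, 'li', 4)
    by_child = nth_child(ul, 'li', 4)
    print(name, by_type.text, by_child.text)
```

In the all-li list both models pick Goat. In the mixed list, nth-of-type still picks Goat while nth-child picks Car, because the p element counts as a child when positions are numbered.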
  • Original article: https://www.cnblogs.com/bamboozone/p/10334093.html