  • Scraping data with the Chrome browser

    Install the required libraries yourself (selenium and beautifulsoup4); the code follows.

    from selenium import webdriver
    from bs4 import BeautifulSoup
    import time
    
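    # Set the chromedriver path manually (raw string so the backslashes are kept)
    path = r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe"
    # Note: executable_path works on Selenium 3; newer Selenium 4 releases expect
    # webdriver.Chrome(service=Service(path)) from selenium.webdriver.chrome.service
    driver = webdriver.Chrome(executable_path=path)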
    
    url = "https://www.huomao.com/channel/lol"
    
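    # The driver sets off: load the page
    driver.get(url)
    # Give the page's JavaScript a few seconds to render the room list
    time.sleep(5)
    # Optionally scroll to the bottom and click "load more" six times in a row;
    # driver.page_source is updated automatically after each click
    # for i in range(6):
    #     driver.find_element_by_id("获取更多").click()
    #     time.sleep(1)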
    
    # Start parsing
    soup = BeautifulSoup(driver.page_source, "html.parser")
    
    
    page_all = soup.find("div", attrs={"id": "channellist"})
    
    pages = page_all.find_all("div", attrs={"class": "list-smallbox no-logo"})
    
    
    
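    for page in pages:
        link = page.find('a')
        # print(link)
        # print(link.attrs['title'])
        # The first <em> holds the room title
        room = page.find('em').string.strip()
        print("Room: " + room)
        streamer = page.find('span', attrs={"class": "nickname"}).string.strip()
        print("Streamer: " + streamer)
    
        # A second <em> holds the popularity count; it is missing when the streamer is offline
        ems = page.find_all('em')
        if len(ems) == 2:
            popularity = ems[1].find('span').string.strip()
            print('Popularity: ' + popularity)
        else:
            print('Popularity: streamer is offline')
        # print(len(ems))
        # for em in ems:
        #     print(em)
    
    # Close the browser when done
    driver.quit()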
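
    The parsing step can also be tried without launching a browser. Below is a minimal sketch, not the original author's code: parse_rooms and SAMPLE_HTML are hypothetical names, and the sample markup is only an assumption inferred from the selectors used above, so the real page may differ. It returns the rows as dictionaries rather than printing them, and it tolerates missing elements.

```python
from bs4 import BeautifulSoup


def parse_rooms(html):
    """Extract room title, streamer name and popularity from the channel list HTML."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", attrs={"id": "channellist"})
    if container is None:
        # The list has not rendered yet, or the page layout has changed
        return []

    rooms = []
    for box in container.find_all("div", attrs={"class": "list-smallbox no-logo"}):
        ems = box.find_all("em")
        nickname = box.find("span", attrs={"class": "nickname"})

        # Same rule as the loop above: a second <em> means the streamer is live
        popularity = None
        if len(ems) == 2:
            span = ems[1].find("span")
            if span is not None:
                popularity = span.get_text(strip=True)

        rooms.append({
            "room": ems[0].get_text(strip=True) if ems else "",
            "streamer": nickname.get_text(strip=True) if nickname else "",
            "popularity": popularity,  # None when the streamer is offline
        })
    return rooms


# Hypothetical markup mirroring the selectors above (an assumption, not the real page)
SAMPLE_HTML = """
<div id="channellist">
  <div class="list-smallbox no-logo">
    <a href="/1" title="Room A">
      <em>Room A</em>
      <span class="nickname">alice</span>
      <em><span>1234</span></em>
    </a>
  </div>
  <div class="list-smallbox no-logo">
    <a href="/2" title="Room B">
      <em>Room B</em>
      <span class="nickname">bob</span>
    </a>
  </div>
</div>
"""

if __name__ == "__main__":
    for row in parse_rooms(SAMPLE_HTML):
        print(row)
```

    With a live session, the same function would be called as parse_rooms(driver.page_source). Keeping the parsing separate from the browser makes it easy to check the selectors against a saved copy of the page.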
  • Original article: https://www.cnblogs.com/szyicol/p/9415860.html