  • [Repost] Batch-downloading free HD images with Python!

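    import random
    import time
    from urllib.parse import urljoin

    import requests
    from bs4 import BeautifulSoup
    from fake_useragent import UserAgent

    BASE_URL = 'https://colorhub.me'
    UA = UserAgent()  # build the User-Agent pool once and reuse it

    for page in range(1, 11):
        # Fetch one page of search results for the tag "data".
        fst_url = '{}/search?tag=data&page={}'.format(BASE_URL, page)
        fst_response = requests.get(fst_url, headers={'User-Agent': UA.random}, timeout=10)
        fst_soup = BeautifulSoup(fst_response.text, 'lxml')

        # Each result card links to a detail page; the link title is used as the image name.
        for card in fst_soup.find_all('div', attrs={'class': 'card'}):
            link = card.find('a')
            if link is None or not link.get('href') or not link.get('title'):
                continue  # skip cards without a usable link or title
            sec_url = urljoin(BASE_URL, link['href'])  # also handles relative hrefs
            pic_name = link['title']

            # Use one random User-Agent for both the detail page and the image request.
            ua = UA.random
            sec_response = requests.get(sec_url, headers={'User-Agent': ua}, timeout=10)
            sec_soup = BeautifulSoup(sec_response.text, 'lxml')
            img = sec_soup.find('img', {'class': 'card-img-top'})
            if img is None or not img.get('src'):
                continue  # no image found on the detail page

            # The src is protocol-relative (//host/path); urljoin supplies the https scheme.
            pic_url = urljoin(sec_url, img['src'])
            pic_response = requests.get(pic_url, headers={'User-Agent': ua}, timeout=10)

            # The with-block closes the file automatically; no explicit close() is needed.
            with open(pic_name + '.jpg', mode='wb') as fn:
                fn.write(pic_response.content)
            print(pic_name)

            # Pause 1 to 3 seconds between downloads to avoid overloading the site.
            time.sleep(random.uniform(1, 3))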
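
The script above uses each image's page title directly as the file name, which can fail when a title contains characters such as `/` or `:`. A minimal sketch of one way to guard against that follows. The helper name `safe_filename` and its `default` and `max_length` parameters are hypothetical and not part of the original post; only the standard library is used.

```python
import re


def safe_filename(title, default="untitled", max_length=100):
    """Turn an arbitrary image title into a string that is safe to use as a file name.

    Hypothetical helper for illustration; not part of the original script.
    """
    # Replace path separators, characters Windows rejects, and control characters.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", title)
    # Collapse runs of whitespace and trim spaces and dots from both ends.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    # Fall back to a default name if nothing usable is left.
    return cleaned[:max_length] or default


if __name__ == "__main__":
    print(safe_filename("sales/report: 2019") + ".jpg")
```

In the download loop, the call would wrap the title before it is passed to `open()`, for example `open(safe_filename(pic_name) + '.jpg', mode='wb')`. Two different titles can still map to the same name, so a real run might also want to append a counter or the image ID.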

    Corrections and suggestions from more experienced readers are welcome!

  • Original article: https://www.cnblogs.com/zhzhang/p/11239645.html