  • Machine Learning: A Beginner Learns Linux (Part 2), Scraping and Saving Images

    Code reference: https://www.cnblogs.com/chenyuan404/p/10192758.html

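    First activate the environment and cd into the target folder. Run [vi food_pic.py] to create the file food_pic.py, switch to insert mode (press i), and type in the code. Then run [python food_pic.py] to execute it.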

    Analyze the website
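The search URL used in the script below carries the keyword in `word` and a result offset in `pn`, which the script advances by 30 for each page. A minimal sketch of how those page URLs could be built, assuming 30 results per page as the script does. `build_page_urls` is a hypothetical helper name, and `urllib.parse.urlencode` is swapped in for the script's `str.format` call so that the keyword is percent-encoded explicitly.

```python
from urllib.parse import urlencode

BASE_URL = 'https://image.baidu.com/search/flip'
PAGE_SIZE = 30  # assumption taken from the script: pn advances by 30 per page


def build_page_urls(keyword, max_page):
    """Return one search URL per results page (hypothetical helper)."""
    urls = []
    for page in range(max_page):
        params = {
            'tn': 'baiduimage',
            'ie': 'utf-8',
            'word': keyword,          # search keyword, percent-encoded by urlencode
            'pn': page * PAGE_SIZE,   # offset of the first result on this page
        }
        urls.append(BASE_URL + '?' + urlencode(params))
    return urls


if __name__ == '__main__':
    for url in build_page_urls('美食照片', 2):
        print(url)
```

Printing the URLs first and opening one in a browser is a quick way to check that the offset arithmetic matches what the site actually serves.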

    View the page source
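The script below searches the page source for `"thumbURL"` entries. The extraction step can be tried offline first, as in this minimal sketch. The `sample_html` fragment and its example.com URLs are made up for illustration and are not real Baidu output, and `extract_thumb_urls` is a hypothetical name. It de-duplicates with `dict.fromkeys` instead of the script's `set`, so the links keep their original order.

```python
import re

# Same pattern as in the script, precompiled; the brace is escaped for clarity
THUMB_PATTERN = re.compile(r'\{"thumbURL":"(.*?)",')


def extract_thumb_urls(html):
    """Return the thumbURL values found in html, without duplicates, in order."""
    return list(dict.fromkeys(THUMB_PATTERN.findall(html)))


# Made-up fragment that imitates the structure the regex expects
sample_html = (
    '{"thumbURL":"https://example.com/a.jpg","width":100},'
    '{"thumbURL":"https://example.com/b.jpg","width":200},'
    '{"thumbURL":"https://example.com/a.jpg","width":100}'
)

print(extract_thumb_urls(sample_html))
# ['https://example.com/a.jpg', 'https://example.com/b.jpg']
```

If the real page source stops matching this pattern, the list comes back empty, which is worth checking before starting any downloads.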

     

    Extract the image links with a regular expression (the re module)

        import os
        import re
        from urllib import request

        import requests


        # Fetch the search result pages and collect the thumbnail links
        def Get_PIC_list(keyword, max_page):
            all_picture_list = []
            for page in range(max_page):
                page = page * 30  # the pn offset advances by 30 per page
                url = 'https://image.baidu.com/search/flip?tn=baiduimage&ie=utf-8&word={}&pn={}'.format(keyword, page)
                html = requests.get(url).content.decode('utf-8')
                picture_list = re.findall(r'\{"thumbURL":"(.*?)",', html)
                all_picture_list.extend(picture_list)

            # Drop duplicate links before downloading
            all_picture_list = set(all_picture_list)
            download_picture(all_picture_list)


        # Download the images
        def download_picture(all_picture_list):
            # urlretrieve does not create directories, so make sure picture/ exists
            os.makedirs('picture', exist_ok=True)
            for i, pic_url in enumerate(all_picture_list):
                print(i)
                string = 'picture/{}.jpg'.format(str(i + 1))
                request.urlretrieve(pic_url, string)


        # Entry point
        def start():
            keyword = '美食照片'  # search keyword ("food photos")
            max_page = 2
            Get_PIC_list(keyword, max_page)


        if __name__ == '__main__':
            start()
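In `download_picture`, a single failed request stops the whole run. A more forgiving download step might look like the sketch below. This is not the original author's method: `fetch_bytes` and `save_pictures` are hypothetical names, the `User-Agent` value and the 10-second timeout are assumptions, and `urllib.request.urlopen` is used in place of `urlretrieve` so that a header can be sent.

```python
import os
from urllib import request


def fetch_bytes(url, timeout=10):
    """Download one URL, sending a browser-like User-Agent (an assumption)."""
    req = request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    with request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def save_pictures(urls, out_dir='picture', fetch=fetch_bytes):
    """Save each URL as out_dir/<n>.jpg and return the URLs that failed."""
    os.makedirs(out_dir, exist_ok=True)  # create the folder if it is missing
    failed = []
    for i, url in enumerate(urls, start=1):
        path = os.path.join(out_dir, '{}.jpg'.format(i))
        try:
            data = fetch(url)
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            print('skipping {}: {}'.format(url, exc))
            failed.append(url)
            continue
        with open(path, 'wb') as f:
            f.write(data)
    return failed
```

Passing the download function in as `fetch` keeps the file-handling logic testable without network access. In the script above, `save_pictures(all_picture_list)` could stand in for `download_picture(all_picture_list)`.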
  • Original article: https://www.cnblogs.com/cfancy/p/12800714.html