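    Python: Downloading Beauty Pictures from Huaban


    Author: vpoet
    mail:vpoet_sir@163.com
    Note: feel free to copy the code; there is no need to tell me.

    Main features:
        1. Search for images on Huaban (http://huaban.com/)
        2. Choose how many images to download
        3. Show the download progress
        4. Create a directory on the desktop and download into it

    The code has few comments, so bear with it.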
    # coding: utf-8
    # Python 2 script (uses urllib2, raw_input and print statements).
    import os
    import random
    import re
    import urllib
    import urllib2


    def Schedule(a, b, c):
        """Progress callback for urllib.urlretrieve.

        a: number of data blocks downloaded so far
        b: size of one data block
        c: size of the remote file
        """
        if c <= 0:
            # The server did not report a size, so no percentage can be shown.
            return
        per = min(100.0 * a * b / c, 100)
        print '%.2f%%' % per


    def SearchAndDownLoadImg(SearchStr, NumPerPage, filepath):
        # Quote the search term so spaces and non-ASCII bytes are URL-safe.
        url = 'http://huaban.com/search/?q=%s&per_page=%s' % (
            urllib.quote(SearchStr), NumPerPage)
        Respon = urllib2.urlopen(url)
        Htm = Respon.read()
        print url + "\n\n\n"
        print "----------------Search Over,And Begin DownLoad----------------" + "\n\n"

        # Each image is described by a JSON fragment embedded in the page;
        # capture its storage key and its image type (the part after "image/").
        Patt = re.compile('"file":{"farm":"farm1",.+?"bucket":"hbimg",.+?"key":"(.*?)",.+?"type":"image/(.*?)",.+?"width":')
        group = re.findall(Patt, Htm)
        # print "find total imgurl" + str(len(group)) + "\n"

        x = 1
        for item in group:
            imgurl = "http://img.hb.aicdn.com/" + item[0] + "_fw658"
            # Save as pic<N>.<type> inside the target directory.
            target = os.path.join(filepath, 'pic%s.%s' % (x, item[1]))
            urllib.urlretrieve(imgurl, target, Schedule)
            print imgurl + "------>down over" + "\tpic" + str(x)
            x = x + 1


    if __name__ == "__main__":
        print "Please input the picture you want to download:"
        SearchStr = raw_input()
        print "\n\n"
        print "Please input the PageNumber you want to download:"
        NumPerPage = raw_input()
        print "\n\n"
        print "-----------------------Begin Search---------------------------" + "\n"

        # Pick a random folder name (PictureFile20 to PictureFile50) on the desktop.
        filenum = random.randint(20, 50)
        filename = 'PictureFile' + str(filenum)
        filepath = os.path.join(r'C:\Users\Administrator\Desktop', filename)
        if not os.path.exists(filepath):
            os.mkdir(filepath)
        # print filepath

        SearchAndDownLoadImg(SearchStr, NumPerPage, filepath)
        # Example image URL:
        # http://img.hb.aicdn.com/23a58517fb73f86bca85937f069724486b3e00a44caa-GMc99I_sq75sf

        print "\n\n"
        print "---------------------All Down Over-----------------------"
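
    To see what the regular expression in SearchAndDownLoadImg captures, here is a minimal offline sketch written for Python 3. The page fragment below, including its key value and the extra fields, is made up for illustration and is not real Huaban output; the helper name extract_images is hypothetical as well. The pattern is the one from the script, with the brace escaped for clarity.

```python
import re

# Pattern from the script above (the "{" is escaped here for clarity).
PATTERN = re.compile(
    r'"file":\{"farm":"farm1",.+?"bucket":"hbimg",.+?"key":"(.*?)",'
    r'.+?"type":"image/(.*?)",.+?"width":'
)

# Hypothetical page fragment, invented for this example; real pages may differ.
SAMPLE = (
    '{"pin_id":1,"file":{"farm":"farm1","id":10,"bucket":"hbimg",'
    '"frames":1,"key":"abc123-XyZ","height":600,"type":"image/jpeg",'
    '"theme":"fff","width":400}}'
)


def extract_images(html):
    """Return (image_url, extension) pairs for every match in html."""
    return [
        ("http://img.hb.aicdn.com/" + key + "_fw658", ext)
        for key, ext in PATTERN.findall(html)
    ]


if __name__ == "__main__":
    for url, ext in extract_images(SAMPLE):
        print(url, ext)
```

    The first capture group is the storage key that goes into the image URL, and the second is the image type that the script uses as the file extension.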

    Run screenshot:
    Searching for "Beatuiful" and downloading 20 images
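
    The search term and the image count from this run end up in the q and per_page query parameters of the search URL. A minimal Python 3 sketch of building that URL with proper escaping follows; build_search_url is a hypothetical helper, and whether the site still honours these parameters has not been checked here.

```python
from urllib.parse import urlencode


def build_search_url(query, per_page):
    """Build the search URL, percent-encoding the query string."""
    params = urlencode({"q": query, "per_page": per_page})
    return "http://huaban.com/search/?" + params


if __name__ == "__main__":
    print(build_search_url("Beatuiful", 20))
```

    urlencode takes care of spaces and non-ASCII characters, which plain string formatting would pass through unescaped.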

    Huaban search results:


    Run screenshot showing the download progress:
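
    The progress lines come from the callback passed to urlretrieve, which is called with the block count, the block size and the total size. Below is a small Python 3 sketch of the percentage calculation, kept separate from the printing so it can be checked without a network connection. The names percent_done and report are hypothetical; in Python 3 the download function lives at urllib.request.urlretrieve.

```python
def percent_done(block_count, block_size, total_size):
    """Return the download percentage, or None if the size is unknown."""
    if total_size <= 0:
        # urlretrieve may pass -1 when the server reports no file size.
        return None
    # The last block is usually partial, so clamp the value at 100.
    return min(100.0 * block_count * block_size / total_size, 100.0)


def report(block_count, block_size, total_size):
    """Callback with the signature urlretrieve expects for its reporthook."""
    pct = percent_done(block_count, block_size, total_size)
    if pct is not None:
        print("%.2f%%" % pct)


if __name__ == "__main__":
    report(1, 8192, 16384)
```

    It would be passed as the third argument, for example urllib.request.urlretrieve(url, path, report).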

    Image crawl results:
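
    The downloaded files are meant to land as pic1, pic2 and so on inside a PictureFile<N> folder. A minimal Python 3 sketch of that naming scheme follows, assuming os.path.join for the separators; make_target_dir and target_path are hypothetical helpers, and the base directory is a parameter rather than the hard-coded desktop path.

```python
import os


def make_target_dir(base_dir, number):
    """Create base_dir/PictureFile<number> if needed and return its path."""
    path = os.path.join(base_dir, "PictureFile%d" % number)
    os.makedirs(path, exist_ok=True)
    return path


def target_path(directory, index, extension):
    """Return the file path for the index-th image, e.g. .../pic1.jpeg."""
    return os.path.join(directory, "pic%d.%s" % (index, extension))
```

    Joining the directory and the file name explicitly avoids files ending up next to the folder instead of inside it.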
  • Original article: https://www.cnblogs.com/vpoet/p/4659597.html