  • Python crawler: an image-scraping exercise

    The page scraped in this exercise is: http://image.baidu.com/search/index?tn=baiduimage&ps=1&ct=201326592&lm=-1&cl=2&nc=1&ie=utf-8&word=%E6%9D%A8%E5%B9%82
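
    The `word=%E6%9D%A8%E5%B9%82` part at the end of that address is the search keyword, percent-encoded as UTF-8. A minimal sketch of how the same address could be rebuilt with the standard library; the helper name `build_search_url` is made up for illustration, and the other parameters are simply copied from the address above without claiming to know what each one means to Baidu.

```python
from urllib.parse import urlencode, unquote

BASE = 'http://image.baidu.com/search/index'

def build_search_url(keyword):
    """Hypothetical helper: rebuild the search address for a keyword."""
    # Parameters copied verbatim from the address in the post;
    # only `word` changes from one search to the next.
    params = {
        'tn': 'baiduimage', 'ps': '1', 'ct': '201326592', 'lm': '-1',
        'cl': '2', 'nc': '1', 'ie': 'utf-8',
        'word': keyword,
    }
    # urlencode percent-encodes non-ASCII text as UTF-8 by default.
    return BASE + '?' + urlencode(params)

# Decode the keyword from the post, then encode it again.
keyword = unquote('%E6%9D%A8%E5%B9%82')
search_url = build_search_url(keyword)
print(search_url)
```

    Passing a different keyword to the helper yields the matching search address, so the percent-encoding never has to be done by hand.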

    Once the page is open, press F12 to bring up the developer tools; I am using Chrome.

    Right-click the picture you want and choose Inspect, then find the photo's URL in the panel that opens.
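
    What the inspector shows is an `<img>` element whose `src` attribute holds the picture's URL. That lookup can be rehearsed offline on a saved snippet before touching the live site. The sketch below uses the standard library's `html.parser` rather than lxml, purely so that it runs without extra packages; the sample markup, the example.com addresses and the class name `ImgSrcFinder` are invented for illustration, and only the `currentImg` id is taken from the script in this post.

```python
from html.parser import HTMLParser

class ImgSrcFinder(HTMLParser):
    """Hypothetical helper: remember the src of the <img> with a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.src = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        attrs = dict(attrs)
        if tag == 'img' and attrs.get('id') == self.element_id and self.src is None:
            self.src = attrs.get('src')

# Invented markup standing in for what the inspector shows.
sample_html = '''
<div class="viewer">
  <img id="thumb" src="http://example.com/thumb.jpg">
  <img id="currentImg" src="http://example.com/full.jpg">
</div>
'''

finder = ImgSrcFinder('currentImg')
finder.feed(sample_html)
print(finder.src)
```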

    Then write the code, testing it one step at a time.

    import requests as r
    from lxml import etree
    
    # Name of the file the image is saved to
    folder = 'ym_img.jpg'
    # Address of the page to scrape
    url = 'http://image.baidu.com/search/detail?ct=503316480&z=0&ipn=false&word=%E6%9D%A8%E5%B9%82&step_word=&hs=0&pn=0&spn=0&di=155870&pi=0&rn=1&tn=baiduimagedetail&is=0%2C0&istype=0&ie=utf-8&oe=utf-8&in=&cl=2&lm=-1&st=undefined&cs=1587868577%2C2503777591&os=3415000385%2C72811486&simid=3344418590%2C75758866&adpicid=0&lpn=0&ln=3884&fr=&fmq=1591432458743_R&fm=&ic=undefined&s=undefined&hd=undefined&latest=undefined&copyright=undefined&se=&sme=&tab=0&width=&height=&face=undefined&ist=&jit=&cg=star&bdtype=0&oriquery=&objurl=http%3A%2F%2Fimage14.m1905.cn%2Fuploadfile%2F2016%2F0830%2F20160830093355110835.jpg&fromurl=ippr_z2C%24qAzdH3FAzdH3Fooo_z%26e3B8lac_z%26e3Bv54AzdH3FgjofAzdH3Fda8mabnaAzdH3F8a089mm_z%26e3Bfip4s%3Fu6%3Dooofpw6_gjof_x2xo_9_da898a80&gsm=1&rpstart=0&rpnum=0&islist=&querylist='
    
    # Pretend to be a browser
    header = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'}
    
    # Fetch the page source and read the image URL from it
    response = r.get(url,headers=header)
    html = etree.HTML(response.text)
    srctext = html.xpath('//*[@id="currentImg"]/@src')[0]
    print(srctext)
    
    # Stream the image itself (srctext, not the page url) into the file
    with r.get(srctext, headers=header, stream=True) as resp:
        with open(folder, 'wb') as fd:
            for chunk in resp.iter_content(chunk_size=8192):
                fd.write(chunk)
    print(resp.request.headers)
    print(resp.status_code)
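
    The detail-page address used in the script also carries a percent-encoded `objurl` query parameter that looks like the picture's original location. A sketch of reading it with the standard library, as a possible fallback when the XPath lookup returns nothing; the function name `extract_objurl` is made up for illustration, the address below is a shortened copy of the one in the script (most parameters dropped), and whether the target host serves the file to a script is not something this post verifies.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the detail-page address from the script above;
# the objurl value itself is unchanged.
detail_url = (
    'http://image.baidu.com/search/detail?ct=503316480&z=0'
    '&word=%E6%9D%A8%E5%B9%82&tn=baiduimagedetail'
    '&objurl=http%3A%2F%2Fimage14.m1905.cn%2Fuploadfile%2F2016%2F0830%2F20160830093355110835.jpg'
    '&gsm=1'
)

def extract_objurl(url):
    """Hypothetical helper: return the decoded objurl parameter, or None."""
    # parse_qs decodes the percent-escapes and maps each name to a list.
    query = parse_qs(urlsplit(url).query)
    values = query.get('objurl')
    return values[0] if values else None

obj_url = extract_objurl(detail_url)
print(obj_url)
```

    The returned address could then be handed to the same streaming download loop in place of `srctext`.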

      

  • Original article: https://www.cnblogs.com/TTTAO/p/13060361.html