  • Requests Library: Web Scraping in Practice

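    Example 1: Scraping a JD.com product page

    import requests

    url = "https://item.jd.com/100004770237.html"
    try:
        r = requests.get(url, timeout=10)  # timeout so a stalled connection cannot hang forever
        r.raise_for_status()  # raise HTTPError for 4xx/5xx responses
        r.encoding = r.apparent_encoding  # guess the encoding from the response body
        print(r.text[:1000])
    except requests.RequestException:
        print("Scrape failed")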

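    Example 2: Scraping an Amazon product page

    import requests

    url = "https://www.amazon.cn/dp/B071HXVPXG/ref=lp_659039051_1_2?s=books&ie=UTF8&qid=1580353560&sr=1-2"
    # Send a browser-like User-Agent; some sites reject the default python-requests one
    kv = {"user-agent": "Mozilla/5.0"}
    try:
        r = requests.get(url, headers=kv, timeout=10)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        print(r.text[1000:2000])
    except requests.RequestException:
        print("Scrape failed")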
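
The header override in the example above can be checked without touching the network. The sketch below prepares a request through a `requests.Session` with default settings and compares the default User-Agent with the one that would actually be sent. `https://example.com/` is a placeholder URL chosen for illustration, not part of the original example.

```python
import requests

session = requests.Session()

# Default User-Agent that requests announces, of the form "python-requests/<version>"
default_ua = session.headers["User-Agent"]

# Prepare (but do not send) a request that overrides the header.
# The URL is a placeholder; nothing is transmitted here.
req = requests.Request(
    "GET",
    "https://example.com/",
    headers={"user-agent": "Mozilla/5.0"},
)
prepared = session.prepare_request(req)

# Header lookup is case-insensitive, so "user-agent" replaces "User-Agent"
sent_ua = prepared.headers["User-Agent"]

print("default:", default_ua)
print("sent:   ", sent_ua)
```

Preparing the request first is a convenient way to inspect headers and the final URL while debugging, before any traffic reaches the target site.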

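    Example 3: Submitting search keywords to Baidu and 360

    import requests

    keyword = "python"
    kv = {"q": keyword}  # 360 Search takes the keyword in the "q" parameter
    try:
        r = requests.get("http://www.so.com/s", params=kv, timeout=10)
        print(r.request.url)  # final URL, including the encoded query string
        r.raise_for_status()
        print(len(r.text))
    except requests.RequestException:
        print("Scrape failed")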

    Note: search engine keyword submission interfaces

    Baidu keyword interface: http://www.baidu.com/s?wd=keyword

    360 keyword interface: http://www.so.com/s?q=keyword
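
The two interfaces differ only in the base URL and the name of the keyword parameter. As a rough sketch of what passing `params` to `requests.get` amounts to, the snippet below builds the same kind of URL with the standard library. The `SEARCH_ENGINES` table and the `build_search_url` helper are hypothetical names introduced here for illustration.

```python
from urllib.parse import urlencode

# Base URL and keyword parameter name for each engine, taken from the notes above
SEARCH_ENGINES = {
    "baidu": ("http://www.baidu.com/s", "wd"),
    "360": ("http://www.so.com/s", "q"),
}


def build_search_url(engine, keyword):
    """Return the search URL for `keyword` on the given engine."""
    base, param = SEARCH_ENGINES[engine]
    # urlencode percent-encodes the keyword and turns spaces into "+"
    return base + "?" + urlencode({param: keyword})


print(build_search_url("baidu", "python"))
print(build_search_url("360", "web scraping"))
```

Keeping the parameter name in a table means switching engines needs no change to the calling code.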

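    Example 4: Scraping and saving an image from the web

    import os
    import requests

    url = "http://img1.3lian.com/2015/w7/97/d/25.jpg"
    # Where to store the image; the file name can be the original one or a custom one
    root = "E:/python/"
    path = os.path.join(root, url.split("/")[-1])
    try:
        os.makedirs(root, exist_ok=True)  # create the folder if it is missing
        if not os.path.exists(path):
            r = requests.get(url, timeout=10)
            r.raise_for_status()  # do not save an error page as an image
            with open(path, "wb") as f:  # the with block closes the file
                f.write(r.content)
            print("File saved")
        else:
            print("File already exists")
    except (requests.RequestException, OSError):
        print("Scrape failed")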
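
Taking the last `/`-separated piece of the URL works for the address above, but it would keep a query string in the file name if the URL had one. A minimal sketch of a more defensive naming step follows. `local_path_for`, the `images` folder, the `?size=large` query, and `cover.jpg` are all hypothetical names used only for illustration.

```python
import os
import posixpath
from urllib.parse import urlsplit


def local_path_for(url, root, custom_name=None):
    """Return the path under `root` where the file at `url` would be saved."""
    # Use only the path part of the URL so a query string never ends up in the name
    original_name = posixpath.basename(urlsplit(url).path)
    name = custom_name or original_name
    if not name:
        raise ValueError("URL has no file name; pass custom_name")
    return os.path.join(root, name)


image_url = "http://img1.3lian.com/2015/w7/97/d/25.jpg"

# Keep the original file name
print(local_path_for(image_url, "images"))
# A hypothetical query string is ignored when deriving the name
print(local_path_for(image_url + "?size=large", "images"))
# Or choose a custom name
print(local_path_for(image_url, "images", custom_name="cover.jpg"))
```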
            
    

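    Example 5: Automatically looking up the location of an IP address

    import requests

    url = "http://m.ip138.com/ip.asp?ip="
    try:
        r = requests.get(url + "202.204.80.112", timeout=10)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        print(r.text[-500:])  # the result appears near the end of the page
    except requests.RequestException:
        print("Scrape failed")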
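
Since the address is appended directly to the query URL, it can be worth validating it before sending the request. The sketch below does so with the standard-library `ipaddress` module. `LOOKUP_URL` and `build_lookup_url` are hypothetical names, and the sketch assumes the service expects a plain IPv4 address, as in the example above.

```python
import ipaddress

LOOKUP_URL = "http://m.ip138.com/ip.asp?ip="  # endpoint used in Example 5


def build_lookup_url(ip_text):
    """Validate `ip_text` as an IPv4 address and return the lookup URL."""
    # IPv4Address raises AddressValueError (a subclass of ValueError) on bad input
    address = ipaddress.IPv4Address(ip_text.strip())
    return LOOKUP_URL + str(address)


print(build_lookup_url("202.204.80.112"))

try:
    build_lookup_url("202.204.80.999")
except ValueError as err:
    print("Invalid IP address:", err)
```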
    

      

  • Original article: https://www.cnblogs.com/py2019/p/12242318.html