  • Python: a pitfall I hit with Scrapy

    Running a Scrapy spider, the site refused the crawl and the log showed this error:

    2018-01-08 18:37:14 [scrapy.middleware] INFO: Enabled item pipelines:
    []
    2018-01-08 18:37:14 [scrapy.core.engine] INFO: Spider opened
    2018-01-08 18:37:14 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
    2018-01-08 18:37:14 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
    2018-01-08 18:37:23 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://accounts.douban.com/login> (referer: None)
    2018-01-08 18:37:23 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://accounts.douban.com/login>: HTTP status code is not handled or not allowed
    2018-01-08 18:37:23 [scrapy.core.engine] INFO: Closing spider (finished)
    2018-01-08 18:37:23 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
    {'downloader/request_bytes': 222,

    Solution:

    Set a browser-like user agent in settings.py:

    USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.54 Safari/536.5' 
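A likely reason this works: unless USER_AGENT is set, Scrapy identifies itself as "Scrapy/VERSION (+https://scrapy.org)", and some sites answer such requests with 403. Below is a minimal sketch of how the relevant part of settings.py might look. The DEFAULT_REQUEST_HEADERS entry is an optional addition of mine and not part of the original fix; the header values are illustrative assumptions.

```python
# settings.py (fragment)

# Browser-like User-Agent, same value as in the fix above.
# Scrapy sends this on every request instead of its default
# "Scrapy/VERSION (+https://scrapy.org)" identifier.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) "
    "AppleWebKit/536.5 (KHTML, like Gecko) "
    "Chrome/19.0.1084.54 Safari/536.5"
)

# Optional (assumption, not in the original post): other headers a
# browser would normally send. Setting this replaces Scrapy's default
# header dict, so "Accept" is listed explicitly as well.
DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
}
```

If the site still returns 403 after this change, it is probably checking something other than the User-Agent, and this setting alone will not be enough.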

    Done.

  • Original article: https://www.cnblogs.com/pyyu/p/8244213.html