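  • scrapy-redis 0.6.8 configuration

    Many blog posts give db parameter settings that do not work, so this note records a configuration that does work with this version.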

    # Use the Redis-backed scheduler to store the request queue
    SCHEDULER = "scrapy_redis.scheduler.Scheduler"
    # Make all spiders deduplicate requests through Redis
    DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"
    # Schedule requests with a priority queue (the default)
    SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.PriorityQueue'
    # Redis connection parameters
    REDIS_PARAMS = {
        'host': '39.107.253.135',
        'port': 63790,
        'password': '7890',
        'db': 0
    }
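
    When connection values are read from environment variables or a config file they usually arrive as strings. The sketch below shows one way to coerce 'port' and 'db' to integers before assigning REDIS_PARAMS. The helper name normalize_redis_params and the sample values are hypothetical and are not part of scrapy-redis.

```python
def normalize_redis_params(params):
    """Return a copy of params with 'port' and 'db' coerced to int.

    Hypothetical helper for illustration; scrapy-redis does not provide it.
    """
    normalized = dict(params)  # copy so the caller's dict is left untouched
    for key in ("port", "db"):
        if key in normalized:
            normalized[key] = int(normalized[key])
    return normalized


# Placeholder values, e.g. as they might come from environment variables.
raw_params = {"host": "127.0.0.1", "port": "6379", "password": "secret", "db": "0"}
REDIS_PARAMS = normalize_redis_params(raw_params)
print(REDIS_PARAMS)
```

    The resulting dict can be assigned to REDIS_PARAMS in the Scrapy settings in place of a hand-written literal.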
    

    Push the URLs into Redis

    from redis import Redis
    
    # db must match the 'db' value in REDIS_PARAMS, or the spider will never see these URLs
    red = Redis(host='39.107.253.135', port=63790, password='7890', db=0)
    
    for page in range(1, 23, 1):
        p = (page-1)*12
        url = 'https://maoyan.com/cinemas?offset=' + str(p)
        red.lpush('maoyan:start_urls', url)
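
    The loop above can also be split into two small functions, which makes the URL generation easy to check without a running Redis server. This is a minimal sketch: build_start_urls and push_start_urls are hypothetical names, and client is assumed to be any object with a redis-py style lpush(name, *values) method.

```python
def build_start_urls(pages=22, page_size=12,
                     base="https://maoyan.com/cinemas?offset="):
    """Return one listing URL per page, with offsets 0, 12, 24, ..."""
    return [base + str((page - 1) * page_size) for page in range(1, pages + 1)]


def push_start_urls(client, key, urls):
    """Push all URLs onto the Redis list `key` with a single LPUSH call.

    `client` is assumed to expose lpush(name, *values), as redis-py does.
    Returns the number of URLs handed to the client.
    """
    if urls:
        client.lpush(key, *urls)
    return len(urls)


urls = build_start_urls()
print(len(urls), urls[0], urls[-1])
```

    With a real connection this would be called as push_start_urls(red, 'maoyan:start_urls', urls), where red is the Redis client created above.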
    

    In the spider file

    from scrapy_redis.spiders import RedisSpider
    
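    class MySpider(RedisSpider):
        name = 'my'
        # Redis list that the spider pops its start URLs from
        redis_key = 'maoyan:start_urls'

        def parse(self, response):
            # Placeholder callback (not in the original post): only logs the URL
            self.logger.info('Fetched %s', response.url)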
  • Original article: https://www.cnblogs.com/vinic-xxm/p/11753441.html