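At the same level as the spiders folder, create a commands folder and add a .py file to it; I named mine crawlall.py. The file name becomes the name of the command.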
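from scrapy.commands import ScrapyCommand


class Command(ScrapyCommand):
    # The command only makes sense inside a Scrapy project.
    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        # spider_loader replaces the older crawler_process.spiders attribute.
        spider_list = self.crawler_process.spider_loader.list()
        for name in spider_list:
            # Schedule every spider; the parsed options are passed on as spider arguments.
            self.crawler_process.crawl(name, **opts.__dict__)
        # Start the reactor once, after all spiders have been scheduled.
        self.crawler_process.start()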
In the settings file, point COMMANDS_MODULE at the commands package you just created:
COMMANDS_MODULE = "ProxyPool.commands"
Finally, run scrapy crawlall from the command line (cmd).
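The folder layout described above can be sketched as a small script. This is only an illustration: scaffold_commands_package is a hypothetical helper, not part of Scrapy, and it assumes the project package is called ProxyPool, as in the COMMANDS_MODULE value. It also adds an empty __init__.py, which is the usual way to make the commands folder importable as ProxyPool.commands.

```python
from pathlib import Path
import tempfile
from typing import List


def scaffold_commands_package(project_pkg: Path) -> List[str]:
    """Create commands/__init__.py and commands/crawlall.py under project_pkg.

    Hypothetical helper for illustration; returns the created paths
    relative to project_pkg, sorted.
    """
    commands_dir = project_pkg / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)
    # An __init__.py makes the folder an importable package.
    (commands_dir / "__init__.py").touch()
    # Placeholder only: the real file holds the Command class.
    (commands_dir / "crawlall.py").touch()
    return sorted(
        p.relative_to(project_pkg).as_posix() for p in commands_dir.iterdir()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "ProxyPool" is assumed to be the project package (sibling of spiders/).
        print(scaffold_commands_package(Path(tmp) / "ProxyPool"))
```

In a real project the commands folder sits next to spiders and settings.py inside the project package, so the dotted path in COMMANDS_MODULE resolves to it.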
***********************************************************************
To run a single spider with specific settings instead, use:
scrapy crawl onespider -s LOG_FILE='123.log'
The -s option overrides a setting for this run, given as NAME=VALUE.
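To make the NAME=VALUE form concrete, here is a minimal sketch of how such pairs can be turned into a dictionary of setting overrides. parse_setting_overrides is a hypothetical helper written for illustration, not Scrapy's own parser; it assumes each -s argument arrives as one string and splits on the first "=" only, so values may themselves contain "=".

```python
def parse_setting_overrides(pairs):
    """Turn ["NAME=VALUE", ...] into a dict, splitting on the first "=" only."""
    overrides = {}
    for pair in pairs:
        name, sep, value = pair.partition("=")
        if not sep or not name:
            raise ValueError(f"expected NAME=VALUE, got {pair!r}")
        overrides[name] = value
    return overrides


if __name__ == "__main__":
    # A POSIX shell strips the quotes around 123.log before the program sees them.
    print(parse_setting_overrides(["LOG_FILE=123.log", "LOG_LEVEL=INFO"]))
```

The -s option can be repeated on one command line, one NAME=VALUE pair per flag, which is why the sketch accepts a list.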
Source: https://blog.csdn.net/u014248032/article/details/83351291