Running a single spider
from scrapy.cmdline import execute

if __name__ == '__main__':
    # Equivalent to typing `scrapy crawl chouti --nolog` in a terminal
    execute(["scrapy", "crawl", "chouti", "--nolog"])
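Save this as a .py file inside the project, then right-click the file in your IDE and run it to start the spider named 'chouti'.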
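If the launcher should work for more than one spider name, the call above can be wrapped in a small helper. This is only a sketch: `build_crawl_argv` and `run_spider` are hypothetical names introduced here for illustration, and only `scrapy.cmdline.execute` comes from Scrapy itself.

```python
def build_crawl_argv(spider_name, nolog=True):
    """Build the argument list that `scrapy crawl <spider_name>` would receive."""
    argv = ["scrapy", "crawl", spider_name]
    if nolog:
        argv.append("--nolog")  # suppress log output, as in the original call
    return argv


def run_spider(spider_name, nolog=True):
    """Run one spider through Scrapy's command-line entry point."""
    # Imported lazily so build_crawl_argv can be used without Scrapy installed.
    from scrapy.cmdline import execute
    execute(build_crawl_argv(spider_name, nolog))
```

Calling `run_spider("chouti")` under an `if __name__ == '__main__':` guard should behave like the script above. Note that `execute` exits the process when the command finishes, so code placed after the call does not run.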
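Running multiple spiders at once
The steps are as follows:
- Create a directory with any name at the same level as spiders, for example: commands (add an empty __init__.py so it can be imported as a package)
- Create a crawlall.py file inside it (the file name becomes the name of the custom command)
- In settings.py, add the setting COMMANDS_MODULE = '<project name>.<directory name>'
- Run the command from the project directory: scrapy crawlall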
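As a concrete illustration of these steps, the fragment below shows what the setting might look like. The project name `myproject` and the layout in the comments are assumptions made for this example; substitute your own project and directory names.

```python
# settings.py (fragment). "myproject" is a hypothetical project name.
#
# Assumed layout:
#   myproject/
#       commands/
#           __init__.py
#           crawlall.py
#       spiders/
#       settings.py

# Dotted path of the package that Scrapy searches for custom commands
COMMANDS_MODULE = "myproject.commands"
```

With this in place, `scrapy crawlall` should appear in the command list printed by running `scrapy` with no arguments from inside the project.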
The code is as follows:
# crawlall.py
from scrapy.commands import ScrapyCommand


class Command(ScrapyCommand):

    # The command only works inside a Scrapy project
    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        # spider_loader replaces the older `spiders` attribute, which newer Scrapy releases no longer provide
        spider_list = self.crawler_process.spider_loader.list()
        for name in spider_list:
            # Schedule each spider; the parsed options are passed on as spider arguments
            self.crawler_process.crawl(name, **opts.__dict__)
        # Start once; this blocks until every scheduled spider has finished
        self.crawler_process.start()
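The same loop can also be driven from a standalone script instead of a custom command. The following is a minimal sketch, not the author's method: `schedule_all` and `main` are hypothetical names, and the code assumes a Scrapy release in which `CrawlerProcess` exposes `spider_loader`. The scheduling logic is kept separate from the Scrapy imports so it can be exercised with a stand-in object.

```python
def schedule_all(process, **spider_kwargs):
    """Schedule every spider known to `process` and return their names.

    `process` is expected to look like scrapy.crawler.CrawlerProcess:
    it needs `spider_loader.list()` and `crawl(name, **kwargs)`.
    """
    names = list(process.spider_loader.list())
    for name in names:
        process.crawl(name, **spider_kwargs)
    return names


def main():
    # Imported lazily so schedule_all can be tested without Scrapy installed.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    process = CrawlerProcess(get_project_settings())
    schedule_all(process)
    process.start()  # blocks until all scheduled spiders have finished
```

Calling `main()` from a script located inside the project directory should run every spider in one process, much like `scrapy crawlall`, without needing the COMMANDS_MODULE setting.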