  • Installing the Scrapy crawler on Ubuntu 14.04

    Scrapy is the crawling framework I am about to start learning. The steps to install it on Ubuntu 14.04 are as follows:

    Ubuntu 14.04 ships with Python 2.7 and setuptools by default. If either is missing, install it with the commands below:

    sudo apt-get install python
    sudo apt-get install python-setuptools

    Next, install Twisted:

    sudo apt-get install python-twisted

    Then install Scrapy:

    sudo apt-get install python-scrapy

    Once installation finishes, running scrapy directly fails with an error similar to the following:

    File "/usr/local/bin/scrapy", line 5, in <module>
        from pkg_resources import load_entry_point
    
    ......
    
    pkg_resources.ContextualVersionConflict: (pyasn1 0.1.7 (/usr/lib/python2.7/dist-packages), Requirement.parse('pyasn1>=0.1.8'), set(['pyasn1-modules']))

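    As the message indicates, this is a version dependency conflict: the installed pyasn1 is 0.1.7, while pyasn1-modules requires pyasn1>=0.1.8.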
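
    To see why the installed package is rejected, the comparison behind a requirement such as pyasn1>=0.1.8 can be sketched in a few lines of Python. This is a simplified illustration, not the code pkg_resources actually runs: the helper names parse_version and satisfies_minimum are hypothetical, and the sketch only handles plain dotted numeric versions.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '0.1.7' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (a '>=' requirement)."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Versions taken from the error message above.
    print(satisfies_minimum("0.1.7", "0.1.8"))  # False
    print(satisfies_minimum("0.1.8", "0.1.8"))  # True
```

    Comparing tuples of integers rather than raw strings matters here: as strings, "0.1.10" would sort before "0.1.8", which is the wrong answer for version numbers.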

    To fix it, first install pip:

    sudo apt-get install python-pip

    Upgrade pip:

    sudo pip install --upgrade pip

    Then upgrade pyasn1, the package named in the error:

    sudo pip install --upgrade pyasn1
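
    Whether the upgrade took effect can be checked by asking Python which version it now sees. A minimal sketch, assuming a Python 3.8+ interpreter where importlib.metadata is available (the Python 2.7 used in this post does not have that module); installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the error message earlier in this post.
    for name in ("pyasn1", "pyasn1-modules"):
        print(name, installed_version(name))
```

    A result of None means the package is not visible to that interpreter at all, which is a different problem from having a version that is too old.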

    Running scrapy now succeeds:

    Scrapy 1.0.3 - no active project
    
    Usage:
      scrapy <command> [options] [args]
    
    Available commands:
      bench         Run quick benchmark test
      commands      
      fetch         Fetch a URL using the Scrapy downloader
      runspider     Run a self-contained spider (without creating a project)
      settings      Get settings values
      shell         Interactive scraping console
      startproject  Create new project
      version       Print Scrapy version
      view          Open URL in browser, as seen by Scrapy
    
      [ more ]      More commands available when run from project directory
    
    Use "scrapy <command> -h" to see more info about a command
  • Original post: https://www.cnblogs.com/caiminfeng/p/4836664.html