  • Problems encountered when installing Scrapy

    I recently tried using Scrapy for data scraping and installed it on 64-bit Windows 7. The problems I ran into and how I solved them are summarized below:
     
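    Scrapy's official site: http://scrapy.org/
    1、Following the official instructions, I first installed Scrapy directly with pip. It failed with the error below, which reports that Microsoft Visual C++ 9.0 is missing and says where to get it.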
    >>pip install scrapy
    error: Microsoft Visual C++ 9.0 is required (Unable to find vcvarsall.bat). Get it from http://aka.ms/vcpython27
    Solution: download and install VCForPython27.msi from the link in the error message, or go directly to https://www.microsoft.com/en-us/download/details.aspx?id=44266
     
    2、After installing VCForPython27.msi, I ran pip install scrapy again. It failed again, this time because the libxml2 library could not be found.
    >>pip install scrapy
    c:\users\zjn3645\appdata\local\temp\xmlXPathInit7hkp2z.c(1) : fatal error C1083: Cannot open include file: 'libxml/xpath.h': No such file or directory
    *********************************************************************************
    Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?
    *********************************************************************************
    error: command 'C:\Users\zjn3645\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\cl.exe' failed with exit status 2
     
    Solution:
    Install lxml with easy_install, then run pip install scrapy again. This time it succeeds.
    >>easy_install lxml
     
    3、With Scrapy installed, running the example from the official home page fails because pywin32 is missing.
    >>scrapy runspider myspider.py
    exceptions.ImportError: No module named win32api
    2016-03-09 10:17:49 [twisted] CRITICAL:
     
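    Solution: install pywin32 as described in the official documentation.
    pywin32 comes in 32-bit and 64-bit builds.
    On 64-bit Windows 7, if the Python environment variables are set correctly but the pywin32 installer still reports that it cannot find Python, try the 32-bit build of pywin32. The most likely cause is that the installed Python is itself 32-bit: the pywin32 build has to match the Python interpreter, not the operating system.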
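
    A quick way to see which of the modules named in the errors above are still missing is to try importing them. The following is only a minimal sketch: the helper name check_modules and the list of modules are illustrative assumptions, not part of Scrapy.

```python
import importlib


def check_modules(names):
    """Return a dict mapping each module name to True if it imports cleanly."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Modules mentioned in the error messages above; adjust for your setup.
    wanted = ["lxml", "win32api", "twisted", "scrapy"]
    for name, ok in sorted(check_modules(wanted).items()):
        print("%-10s %s" % (name, "OK" if ok else "MISSING"))
```

    Run it with the same Python interpreter that pip installs into; a module reported as MISSING points to the step above that still needs to be done.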
     
    Takeaway: before installing and using a new tool, read the official documentation and make sure you understand its prerequisites.
    4、Disabling the proxy
    A proxy is used by default, and some pages cannot be reached through the local proxy:
    2016-03-09 15:18:21 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
    2016-03-09 15:18:21 [scrapy] DEBUG: Crawled (403) <GET http://xxx.com.cn/xxx.html>
    (referer: None)
    2016-03-09 15:18:21 [scrapy] DEBUG: Ignoring response <403 http://xxx.com.cn/xxx.html>: HTTP status code is not handled or not allowed
    Disable the proxy
    Edit settings.py as follows:
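    DOWNLOADER_MIDDLEWARES = {
        # As in the original fix: turn off the default User-Agent middleware.
        'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
        # HttpProxyMiddleware is what applies proxies; disable it too.
        'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': None,
    }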
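
    Scrapy's HttpProxyMiddleware takes its proxy settings from environment variables such as http_proxy and https_proxy, so it can help to check whether any are set before changing settings.py. The sketch below is an assumption-laden illustration: find_proxy_vars is a hypothetical helper, and on Windows proxy settings may also come from the system configuration, which it does not inspect.

```python
import os


def find_proxy_vars(environ):
    """Return the non-empty *_proxy variables from an environment mapping."""
    return {
        key: value
        for key, value in environ.items()
        if key.lower().endswith("_proxy") and value
    }


if __name__ == "__main__":
    found = find_proxy_vars(os.environ)
    if found:
        for key in sorted(found):
            print("%s=%s" % (key, found[key]))
    else:
        print("No proxy variables set in this environment")
```

    If this prints a local proxy address, unsetting that variable in the shell before running scrapy is another way to bypass the proxy.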
     
     
     
     

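    Another solution to the libxml2 error

    Install the lxml package manually.

    From http://pypi.python.org/simple/lxml/, download the Win7 x64 installer (make sure to pick the one that matches your system):

    https://pypi.python.org/packages/2.7/l/lxml/lxml-2.2.8.win-amd64-py2.7.exe#md5=cfcf7f07a5016a5934271cddde4bbcbe

    Then open a new cmd window and run pip install Scrapy again; the libxml2 error no longer occurs.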

     
     
     Problems installing on Mac:
    When installing Scrapy on macOS, you may get an error like the following:
    
    '''
    sted>=10.0.0->Scrapy)
    Installing collected packages: six, w3lib, parsel, PyDispatcher, Twisted, Scrapy
      Found existing installation: six 1.4.1
        DEPRECATION: Uninstalling a distutils installed project (six) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
        Uninstalling six-1.4.1:
    Exception:
    Traceback (most recent call last):
      File "/Library/Python/2.7/site-packages/pip-8.1.1-py2.7.egg/pip/basecommand.py", line 209, in main
        status = self.run(options, args)
      File "/Library/Python/2.7/site-packages/pip-8.1.1-py2.7.egg/pip/commands/install.py", line 317, in run
        prefix=options.prefix_path,
      File "/Library/Python/2.7/site-packages/pip-8.1.1-py2.7.egg/pip/req/req_set.py", line 726, in install
        requirement.uninstall(auto_confirm=True)
      File "/Library/Python/2.7/site-packages/pip-8.1.1-py2.7.egg/pip/req/req_install.py", line 746, in uninstall
        paths_to_remove.remove(auto_confirm)
      File "/Library/Python/2.7/site-packages/pip-8.1.1-py2.7.egg/pip/req/req_uninstall.py", line 115, in remove
        renames(path, new_path)
      File "/Library/Python/2.7/site-packages/pip-8.1.1-py2.7.egg/pip/utils/__init__.py", line 267, in renames
        shutil.move(old, new)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 302, in move
        copy2(src, real_dst)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 131, in copy2
        copystat(src, dst)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 103, in copystat
        os.chflags(dst, st.st_flags)
    OSError: [Errno 1] Operation not permitted: '/tmp/pip-ZVi5QO-uninstall/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/six-1.4.1-py2.7.egg-info'
    You are using pip version 8.1.1, however version 8.1.2 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
    LuoTimdeMacBook-Pro-2:~ luotim$ sudo pip install Scrapy --ignore-installed six
    
    '''
    
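    Six is a Python 2 and 3 compatibility library. As the traceback shows, pip fails while uninstalling the system copy (six 1.4.1), which lives under /System/Library and cannot be modified.
    
    To resolve the problem, install a newer six manually: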
    1、Download the six-1.10.0.tar.gz package
    wget https://pypi.python.org/packages/b3/b2/238e2590826bfdd113244a40d9d3eb26918bd798fc187e2360a8367068db/six-1.10.0.tar.gz#md5=34eed507548117b2ab523ab14b2f8b55
    
    2、Unpack the archive
    tar -zxvf six-1.10.0.tar.gz
    
    3、Install it with these commands:
    cd six-1.10.0
    sudo python setup.py install
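
    After the install, it is worth confirming that Python now loads the new six rather than the system 1.4.1. The following is a minimal sketch; version_tuple and is_at_least are hypothetical helpers and assume purely numeric dotted versions such as 1.10.0.

```python
def version_tuple(version):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_at_least(version, minimum):
    """Return True if version is the same as or newer than minimum."""
    return version_tuple(version) >= version_tuple(minimum)


if __name__ == "__main__":
    try:
        import six
    except ImportError:
        print("six is not importable from this interpreter")
    else:
        print("six %s loaded from %s" % (six.__version__, six.__file__))
        print("new enough: %s" % is_at_least(six.__version__, "1.10.0"))
```

    The versions are compared as integer tuples because a plain string comparison would rank "1.10.0" below "1.4.1". If the printed path still points into /System/Library, the old copy is being picked up first.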
    
    Reference: http://stackoverflow.com/questions/29485741/unable-to-upgrade-python-six-package-in-mac-osx-10-10-2
     
  • Original article: https://www.cnblogs.com/fengzaoye/p/5907669.html