  • Installing the Python crawler framework Scrapy on Windows 7: a complete walkthrough

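    1. Install Python (32-bit recommended)

    Python 2.7.x is recommended; 3.x does not appear to be supported yet.
    After installing, remember to configure the environment: add the Python directory and its Scripts subdirectory to the Path system environment variable.
    Type python in cmd; if version information appears, the configuration is complete.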
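
    The Path requirement can also be checked from Python itself. The following is a minimal sketch: `missing_path_entries` is a hypothetical helper name, and C:\Python27 in the usage note is only an example install directory, not something the steps above prescribe.

```python
import os
import sys


def missing_path_entries(path_value, python_dir):
    """Return the required directories that are absent from path_value.

    The required directories are python_dir and its Scripts subdirectory.
    """
    def norm(p):
        # Windows paths are case-insensitive and may end with a separator.
        return p.strip().rstrip("\\/").lower()

    present = set(norm(p) for p in path_value.split(";") if p.strip())
    wanted = [python_dir, python_dir.rstrip("\\/") + "\\Scripts"]
    return [w for w in wanted if norm(w) not in present]


if __name__ == "__main__":
    # sys.prefix is the directory of the interpreter running this script.
    missing = missing_path_entries(os.environ.get("PATH", ""), sys.prefix)
    for entry in missing:
        print("Not on Path: %s" % entry)
```

    With Python installed in C:\Python27, for example, an empty result means both C:\Python27 and C:\Python27\Scripts are already on Path.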

    2. Install setuptools

    setuptools is used to install egg files. Download the setuptools build that matches Python 2.7.

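    3. Install lxml

    lxml is a Python library, built on the C libraries libxml2 and libxslt, that processes XML quickly and flexibly. Download the installer that matches your Python version, or run the following on the command line:

    easy_install lxml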
    

    4. Install zope.interface

    The egg can be installed with the setuptools installed in step 2:

    easy_install zope.interface
    

    An exe installer is now also available for download.

    5. Install Twisted

    Twisted is an event-driven networking engine implemented in Python. Download the installer, or install it from the command line:

    easy_install Twisted
    

    6. Install pyOpenSSL

    pyOpenSSL is a Python interface to OpenSSL. Download the installer, or run:

    easy_install pyOpenSSL==0.13
    

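    7. Install pywin32

    pywin32 provides win32api; download the installer that matches your Python version.
    The installer may fail with a "python version 2.7 required" error. In that case, save the following code as a file named register.py: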

    #
    # script to register Python 2.0 or later for use with win32all
    # and other extensions that require Python registry settings
    #
    # written by Joakim Loew for Secret Labs AB / PythonWare
    #
    # source:
    # http://www.pythonware.com/products/works/articles/regpy20.htm
    #
    # modified by Valentine Gogichashvili as described in http://www.mail-archive.com/distutils-sig@python.org/msg10512.html
     
    import sys
     
    from _winreg import *
     
    # tweak as necessary
    version = sys.version[:3]
    installpath = sys.prefix
     
    # backslashes are doubled so that none is read as an escape character
    regpath = "SOFTWARE\\Python\\Pythoncore\\%s\\" % (version)
    installkey = "InstallPath"
    pythonkey = "PythonPath"
    pythonpath = "%s;%s\\Lib\\;%s\\DLLs\\" % (
        installpath, installpath, installpath
    )
     
    def RegisterPy():
        try:
            # the key already exists if a Python is registered for this user
            reg = OpenKey(HKEY_CURRENT_USER, regpath)
        except EnvironmentError as e:
            try:
                # no key yet: create it and record the install and module paths
                reg = CreateKey(HKEY_CURRENT_USER, regpath)
                SetValue(reg, installkey, REG_SZ, installpath)
                SetValue(reg, pythonkey, REG_SZ, pythonpath)
                CloseKey(reg)
            except Exception:
                print "*** Unable to register!"
                return
            print "--- Python", version, "is now registered!"
            return
        # the key exists: check that it points at this installation
        if (QueryValue(reg, installkey) == installpath and
            QueryValue(reg, pythonkey) == pythonpath):
            CloseKey(reg)
            print "=== Python", version, "is already registered!"
            return
        CloseKey(reg)
        print "*** Unable to register!"
        print "*** You probably have another Python installation!"
     
    if __name__ == "__main__":
        RegisterPy()
    

    Then run it from the command line:

    python register.py
    

    This registers Python 2.7.
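
    Before moving on to Scrapy itself, it can help to confirm that the packages from steps 3 to 7 actually import. The following is a minimal sketch: `check_imports` and `SCRAPY_DEPENDENCIES` are hypothetical names, and the list holds the import names of those packages (pyOpenSSL is imported as OpenSSL).

```python
import importlib


def check_imports(module_names):
    """Map each module name to True if it imports cleanly, else False."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except Exception:
            # Any failure at import time means the package is not usable.
            results[name] = False
    return results


# Import names for the packages installed in steps 3 to 7.
SCRAPY_DEPENDENCIES = ["lxml", "zope.interface", "twisted", "OpenSSL", "win32api"]

if __name__ == "__main__":
    for name, ok in sorted(check_imports(SCRAPY_DEPENDENCIES).items()):
        print("%-15s %s" % (name, "ok" if ok else "MISSING"))
```

    Any package reported as MISSING can be reinstalled with the command from its step before continuing.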

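    8. Install Scrapy

    Finally, the exciting part! With all those small components installed, it is time for the main act.
    Type easy_install scrapy in cmd and press Enter.
    The following error may appear: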

    error: Setup script exited with error: Microsoft Visual C++ 9.0 is required (Unable to find vcvarsall.bat). Get it from
    http://aka.ms/vcpython27
    

    The workaround below requires that some version of Visual Studio is already installed on the system.

    For Windows installations:

    While running setup.py for package installations, Python 2.7 searches for an installed Visual Studio 2008. You can trick Python into using a newer Visual Studio by setting the VS90COMNTOOLS environment variable to the correct path before calling setup.py.

    If you have Visual Studio 2010 installed, execute

    SET VS90COMNTOOLS=%VS100COMNTOOLS%
    

    or with Visual Studio 2012 installed (Visual Studio Version 11)

    SET VS90COMNTOOLS=%VS110COMNTOOLS%
    

    or with Visual Studio 2013 installed (Visual Studio Version 12)

    SET VS90COMNTOOLS=%VS120COMNTOOLS%
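
    The substitution above can also be scripted. The following is a minimal sketch, not part of the original instructions: `pick_vs_tools` is a hypothetical helper that leaves an existing VS90COMNTOOLS alone and otherwise looks for the three variables named above, newest first.

```python
import os

# Environment variables named in the text above, newest Visual Studio first.
CANDIDATES = ["VS120COMNTOOLS", "VS110COMNTOOLS", "VS100COMNTOOLS"]


def pick_vs_tools(environ):
    """Return (variable name, path) for the tools directory to use, or None."""
    # A real Visual Studio 2008 entry needs no substitution.
    if environ.get("VS90COMNTOOLS"):
        return ("VS90COMNTOOLS", environ["VS90COMNTOOLS"])
    for name in CANDIDATES:
        path = environ.get(name)
        if path:
            return (name, path)
    return None


if __name__ == "__main__":
    found = pick_vs_tools(os.environ)
    if found is None:
        print("No Visual Studio tools variable found")
    else:
        # Only this process and its children see the new value.
        os.environ["VS90COMNTOOLS"] = found[1]
        print("VS90COMNTOOLS set from %s" % found[0])
```

    Because a change to os.environ is visible only to the current process and the processes it starts, the SET commands above remain the simpler route when installing from an interactive cmd window.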
    

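    9. Verify the installation

    Open a cmd window and run the scrapy command from any directory. Output like the following means the environment is configured correctly.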

    Scrapy 0.24.4 - no active project
    
    Usage:
      scrapy <command> [options] [args]
    
    Available commands:
      bench         Run quick benchmark test
      fetch         Fetch a URL using the Scrapy downloader
      runspider     Run a self-contained spider (without creating a project)
      settings      Get settings values
      shell         Interactive scraping console
      startproject  Create new project
      version       Print Scrapy version
      view          Open URL in browser, as seen by Scrapy
    
      [ more ]      More commands available when run from project directory
    
    Use "scrapy <command> -h" to see more info about a command
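
    If this check needs to run unattended, the first line of that output is enough to tell success from failure. The following is a minimal sketch: `parse_scrapy_banner` is a hypothetical helper, and how the output text is captured is left out.

```python
import re


def parse_scrapy_banner(output):
    """Return the version as a tuple of ints, or None if no banner is found.

    Expects text whose first line looks like
    'Scrapy 0.24.4 - no active project'.
    """
    text = output.strip()
    if not text:
        return None
    first_line = text.splitlines()[0]
    match = re.match(r"Scrapy (\d+(?:\.\d+)*)", first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))
```

    A result of None suggests that the scrapy command was not found or did not start correctly.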
    
  • Original post: https://www.cnblogs.com/pang1567/p/4168768.html