  • Installing Scrapy from start to finish (with all the required software)

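    This follows the original article. I have backed up the software needed during installation, in case those download sites go offline later.

    My network drive holds two versions each of Twisted and zope.interface. I first installed the latest version of both and ended up with an error that appeared to be a missing win32api module. I then repeated the same steps, the only difference being that I used the older version of each, and everything worked. The newer Scrapy version is fine.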

    Twisted-12.1.0.win32-py2.7.msi  

    Twisted-13.0.0.win32-py2.7.msi

    zope.interface-4.0.1-py2.7-win32.egg

    zope.interface-4.0.5-py2.7-win32.egg
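
When the symptom described above shows up, it can help to confirm which versions actually ended up installed and whether win32api can be imported at all. The following is a minimal sketch, not part of the original procedure: the helper names `installed_version` and `has_module` are made up for illustration, and it assumes the installers registered their package metadata (an MSI or egg install may not always do so).

```python
def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        # Available in the standard library on Python 3.8+
        from importlib import metadata
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    # Fallback for Python 2.7, where setuptools provides pkg_resources
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


def has_module(name):
    """Return True if the named module can be imported."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    for dist in ("Twisted", "zope.interface"):
        print("%s: %s" % (dist, installed_version(dist)))
    # win32api ships with the pywin32 package
    print("win32api importable: %s" % has_module("win32api"))
```

A `None` version only means no metadata was found under that name; the package may still be importable.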

    I. Introduction to Scrapy

    Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

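    Official homepage: http://www.scrapy.org/

    II. Installing Python 2.7

    Official homepage: http://www.python.org/

    Download: http://www.python.org/ftp/python/2.7.3/python-2.7.3.msi

    1) Install Python

    Install directory: D:\Python27

    2) Add the environment variables

    Details omitted. In short: System Properties -> Advanced -> Environment Variables -> System Variables -> Path -> Edit, then append D:\Python27 and D:\Python27\Scripts.

    3) Verify the environment variables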

    T:\>set Path
    Path=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;D:\Rational\common;D:\Rational\ClearCase\bin;D:\Python27;D:\Python27\Scripts
    PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH
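
The same check can be scripted instead of reading the `set Path` output by eye. This is a small sketch under stated assumptions: the function name `missing_path_entries` is hypothetical, the two directories are the ones used in this article, and entries are compared case-insensitively with any trailing slash ignored.

```python
import os


def missing_path_entries(required, path_value=None, sep=";"):
    """Return the directories from `required` that are not listed in PATH.

    `sep` defaults to ";" because this article targets Windows.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(p):
        # Windows paths are case-insensitive; a trailing slash is harmless
        return p.strip().rstrip("\\/").lower()

    present = set(norm(p) for p in path_value.split(sep) if p.strip())
    return [r for r in required if norm(r) not in present]


if __name__ == "__main__":
    wanted = [r"D:\Python27", r"D:\Python27\Scripts"]
    print("Missing from PATH: %s" % missing_path_entries(wanted))
```

An empty list means both directories are present. A freshly edited Path only shows up in command windows opened after the change.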

    4) Verify Python

    T:\>python
    Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> exit()
    
    T:\>
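
Every installer used later in this article is a win32 build for Python 2.7, so it may be worth confirming that the interpreter on the Path really is a 32-bit 2.7. A minimal sketch, with hypothetical helper names:

```python
import struct
import sys


def interpreter_summary():
    """Return (major, minor, pointer size in bits) for the running Python."""
    bits = struct.calcsize("P") * 8  # 32 on a 32-bit build, 64 on a 64-bit one
    return sys.version_info[0], sys.version_info[1], bits


def matches_win32_py27(summary):
    """True if the summary describes a 32-bit Python 2.7 interpreter."""
    return summary == (2, 7, 32)


if __name__ == "__main__":
    summary = interpreter_summary()
    print("Python %d.%d, %d-bit" % summary)
    print("Matches the win32-py2.7 installers: %s" % matches_win32_py27(summary))
```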

    III. Installing Twisted

    Twisted is an event-driven networking engine written in Python and licensed under the open source MIT license.

    1) Install setuptools

    Download, build, install, upgrade, and uninstall Python packages -- easily!

    Official homepage: http://pypi.python.org/pypi/setuptools

    Download: http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11.win32-py2.7.exe

    Installation: omitted

    2) Install zope.interface

    Official homepage: http://pypi.python.org/pypi/zope.interface/

    Download: http://pypi.python.org/packages/2.7/z/zope.interface/zope.interface-4.0.1-py2.7-win32.egg

    Installation:

    T:\>d:
    D:\>cd D:\Python27\Scripts
    D:\Python27\Scripts>easy_install.exe zope.interface-4.0.1-py2.7-win32.egg
    Processing zope.interface-4.0.1-py2.7-win32.egg
    creating d:\python27\lib\site-packages\zope.interface-4.0.1-py2.7-win32.egg
    Extracting zope.interface-4.0.1-py2.7-win32.egg to d:\python27\lib\site-packages
    Adding zope.interface 4.0.1 to easy-install.pth file
    
    Installed d:\python27\lib\site-packages\zope.interface-4.0.1-py2.7-win32.egg
    Processing dependencies for zope.interface==4.0.1
    Finished processing dependencies for zope.interface==4.0.1
    
    D:\Python27\Scripts>

    Verify the installation:

    D:\Python27\Scripts>python
    Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import zope.interface
    >>>

    3) Install Twisted

    Official homepage: http://twistedmatrix.com/trac/wiki/TwistedProject

    Download: http://pypi.python.org/packages/2.7/T/Twisted/Twisted-12.1.0.win32-py2.7.msi

    Installation: omitted

    IV. Installing w3lib

    Official homepage: http://pypi.python.org/pypi/w3lib

    Download: http://pypi.python.org/packages/source/w/w3lib/w3lib-1.2.tar.gz

    Extraction: omitted
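
The extraction step is skipped in the text. Windows of that era had no built-in tool for .tar.gz files, so one option is to unpack the archive with Python's own `tarfile` module. This is only a sketch: `extract_tarball` is a name invented here, the paths in the usage lines are assumptions based on the prompts shown in this article, and it should only be used on archives from a source you trust.

```python
import tarfile


def extract_tarball(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir; return its top-level names."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    # Report the top-level entries so the caller knows which folder to enter
    return sorted(set(n.replace("\\", "/").split("/")[0] for n in names))


if __name__ == "__main__":
    import os
    archive = r"T:\w3lib-1.2.tar.gz"  # assumed download location
    if os.path.exists(archive):
        print(extract_tarball(archive, "T:\\"))
```

The same helper would work for the Scrapy source archive later on.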

    Installation:

    T:\w3lib-1.2>python setup.py install
    running install
    running build
    running build_py
    creating build
    creating build\lib
    creating build\lib\w3lib
    copying w3lib\encoding.py -> build\lib\w3lib
    copying w3lib\form.py -> build\lib\w3lib
    copying w3lib\html.py -> build\lib\w3lib
    copying w3lib\http.py -> build\lib\w3lib
    copying w3lib\url.py -> build\lib\w3lib
    copying w3lib\util.py -> build\lib\w3lib
    copying w3lib\__init__.py -> build\lib\w3lib
    running install_lib
    creating D:\Python27\Lib\site-packages\w3lib
    copying build\lib\w3lib\encoding.py -> D:\Python27\Lib\site-packages\w3lib
    copying build\lib\w3lib\form.py -> D:\Python27\Lib\site-packages\w3lib
    copying build\lib\w3lib\html.py -> D:\Python27\Lib\site-packages\w3lib
    copying build\lib\w3lib\http.py -> D:\Python27\Lib\site-packages\w3lib
    copying build\lib\w3lib\url.py -> D:\Python27\Lib\site-packages\w3lib
    copying build\lib\w3lib\util.py -> D:\Python27\Lib\site-packages\w3lib
    copying build\lib\w3lib\__init__.py -> D:\Python27\Lib\site-packages\w3lib
    byte-compiling D:\Python27\Lib\site-packages\w3lib\encoding.py to encoding.pyc
    byte-compiling D:\Python27\Lib\site-packages\w3lib\form.py to form.pyc
    byte-compiling D:\Python27\Lib\site-packages\w3lib\html.py to html.pyc
    byte-compiling D:\Python27\Lib\site-packages\w3lib\http.py to http.pyc
    byte-compiling D:\Python27\Lib\site-packages\w3lib\url.py to url.pyc
    byte-compiling D:\Python27\Lib\site-packages\w3lib\util.py to util.pyc
    byte-compiling D:\Python27\Lib\site-packages\w3lib\__init__.py to __init__.pyc
    running install_egg_info
    Writing D:\Python27\Lib\site-packages\w3lib-1.2-py2.7.egg-info
    
    T:\w3lib-1.2>

    Verify the installation:

    T:\>python
    Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import w3lib
    >>> 

    V. Installing libxml2

    Official homepage: http://users.skynet.be/sbi/libxml-python/

    Download: http://users.skynet.be/sbi/libxml-python/binaries/libxml2-python-2.7.7.win32-py2.7.exe

    Installation: omitted

    Verify the installation:

    T:\>python
    Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import libxml2
    >>> 

    VI. Installing pyOpenSSL

    Official homepage: http://pypi.python.org/pypi/pyOpenSSL

    Download: http://pypi.python.org/packages/2.7/p/pyOpenSSL/pyOpenSSL-0.13.winxp32-py2.7.msi

    Installation: omitted

    Verify the installation:

    T:\>python
    Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import OpenSSL
    >>>
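
The individual import checks so far can be rolled into one script and run before moving on to Scrapy itself. A minimal sketch: the module list repeats the imports verified in this article plus `twisted` (the import name of the Twisted package), and `check_imports` is a hypothetical helper name.

```python
import importlib

# Modules verified one by one in the steps above, plus Twisted
MODULES = ["zope.interface", "twisted", "w3lib", "libxml2", "OpenSSL"]


def check_imports(names):
    """Try to import each module; map name -> None on success, else the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, error in sorted(check_imports(MODULES).items()):
        print("%-16s %s" % (name, "OK" if error is None else "MISSING: " + error))
```

Any line marked MISSING points back to the section whose installer needs to be re-run.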

    VII. Installing Scrapy

    Official homepage: http://scrapy.org/

    Download: http://pypi.python.org/packages/source/S/Scrapy/Scrapy-0.14.4.tar.gz

    Extraction: omitted

    Installation:

    T:\Scrapy-0.14.4>python setup.py install
    
    ……
    Installing easy_install-2.7-script.py script to D:\Python27\Scripts
    Installing easy_install-2.7.exe script to D:\Python27\Scripts
    Installing easy_install-2.7.exe.manifest script to D:\Python27\Scripts
    
    Using d:\python27\lib\site-packages
    Finished processing dependencies for Scrapy==0.14.4
    
    T:\Scrapy-0.14.4>

    The following error may appear during installation:

    Downloading https://pypi.python.org/packages/source/l/lxml/lxml-3.3.0beta2.tar.g
    z#md5=bd00423b358e4956ee1ce8bf4a308fa3
    Processing lxml-3.3.0beta2.tar.gz
    Running lxml-3.3.0beta2\setup.py -q bdist_egg --dist-dir c:\users\jgy\appdata\lo
    cal\temp\easy_install-wnkd_s\lxml-3.3.0beta2\egg-dist-tmp-n6djqa
    Building lxml version 3.3.0.beta2.
    Building without Cython.
    ERROR: 'xslt-config' is not recognized as an internal or external command,
    operable program or batch file.
    

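    This happens because the Scrapy setup tries to install lxml on its own via easy_install lxml, which fails with the "'xslt-config' is not recognized as an internal or external command" error. Running easy_install lxml by hand fails the same way, so lxml has to be installed by a different route.

    You can install it from an .egg file instead.
    Download the matching lxml .egg file from http://pypi.python.org/pypi/lxml/2.3/
    and install it as follows (quote the path, because it contains a space):
    easy_install "D:\Program Files\Python2.7\lxml-2.3.py2.7.win32.egg"
    Then run the Scrapy installation again.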
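
Before re-running the Scrapy setup, it may be worth confirming that the egg install really made lxml importable, since otherwise the setup would try to build lxml from source again. A minimal sketch, with `lxml_versions` as a made-up helper name; the three version attributes are ones `lxml.etree` exposes.

```python
def lxml_versions():
    """Return lxml, libxml2 and libxslt version tuples, or None if lxml is missing."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2": etree.LIBXML_VERSION,
        "libxslt": etree.LIBXSLT_VERSION,
    }


if __name__ == "__main__":
    versions = lxml_versions()
    if versions is None:
        print("lxml is not importable yet")
    else:
        for name in sorted(versions):
            print("%s %s" % (name, ".".join(str(part) for part in versions[name])))
```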

    Verify the installation:

    T:\>scrapy
    Scrapy 0.14.4 - no active project
    
    Usage:
      scrapy <command> [options] [args]
    
    Available commands:
      fetch         Fetch a URL using the Scrapy downloader
      runspider     Run a self-contained spider (without creating a project)
      settings      Get settings values
      shell         Interactive scraping console
      startproject  Create new project
      version       Print Scrapy version
      view          Open URL in browser, as seen by Scrapy
    
    Use "scrapy <command> -h" to see more info about a command
    
    T:\>

    I have shared all of these files on my network drive, so they can be downloaded together as one bundle.

  • Original article: https://www.cnblogs.com/leonbond/p/3054902.html