  • Installing Python, Scrapy, and PhantomJS on CentOS

    Install the build dependencies:

    • yum install libxslt-devel libffi libffi-devel python-devel gcc openssl openssl-devel sqlite-devel

    Install Python 2.7 or later (if several Python versions coexist on the machine, you must pass --prefix):

    • wget http://python.org/ftp/python/2.7.3/Python-2.7.3.tgz
    • tar xvf Python-2.7.3.tgz
    • cd Python-2.7.3
    • ./configure --prefix=/usr/local/python27 
    • make && make install
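Because a prefixed build sits next to the system Python, it can be worth confirming which interpreter is actually running before going further. The following is only a minimal sketch: the function name `describe_interpreter` is made up for this example, and the script is assumed to be run with the newly built binary (for instance /usr/local/python27/bin/python).

```python
import sys


def describe_interpreter(minimum=(2, 7)):
    """Return (ok, version, prefix) for the interpreter running this script.

    `ok` is True when the running version is at least `minimum`.
    """
    version = tuple(sys.version_info[:3])
    ok = version >= tuple(minimum)
    return ok, version, sys.prefix


if __name__ == "__main__":
    ok, version, prefix = describe_interpreter()
    print("Python %d.%d.%d installed under %s" % (version + (prefix,)))
    if not ok:
        print("This interpreter is older than 2.7; check PATH or the symlink.")
```

If the reported prefix is not the directory passed to --prefix, the shell is still resolving `python` to another installation.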

    Install setuptools and pip (you may need to add the new Python to PATH or create a symlink):

    • wget -q http://peak.telecommunity.com/dist/ez_setup.py
    • python ez_setup.py
    • easy_install pip

    Or:

    • wget -q https://bootstrap.pypa.io/get-pip.py
    • python get-pip.py

    Or, installing both from source tarballs:

    • wget https://files.pythonhosted.org/packages/66/6d/dad0d39ce1cfa98ef3634463926e7324e342c956aecb066968e2e3696300/setuptools-30.0.0.tar.gz
    • tar -xvf setuptools-30.0.0.tar.gz
    • cd setuptools-30.0.0
    • python setup.py install
    • cd ..
    • wget https://files.pythonhosted.org/packages/5e/53/eaef47e5e2f75677c9de0737acc84b659b78a71c4086f424f55346a341b5/pip-9.0.0.tar.gz
    • tar -xvf pip-9.0.0.tar.gz
    • cd pip-9.0.0
    • python setup.py install

    Install Twisted (you may need to update PATH or create a symlink):

    • easy_install Twisted
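    • If the installed Twisted version is too new or too old, a later step may fail with an error; in that case pin a specific version with pip and try a few until one works, for example: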
    • pip install twisted==12.5.0

    Install w3lib:

    • easy_install -U w3lib

    Install lxml:

    • easy_install lxml

    Install pyOpenSSL:

    • easy_install pyOpenSSL
    • If that fails, download and install it manually:
    • wget http://launchpadlibrarian.net/58498441/pyOpenSSL-0.11.tar.gz
    • tar zxvf pyOpenSSL-0.11.tar.gz
    • cd pyOpenSSL-0.11
    • python2.7 setup.py install

    Install Scrapy (you may need to update PATH or create a symlink):

    • easy_install -U Scrapy
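Since several of the steps above can fail quietly or install into the wrong interpreter, a quick import check may save time. This is a sketch under assumptions: `check_modules` and `REQUIRED` are names invented here, the list holds import names (pyOpenSSL is imported as `OpenSSL`), and not every package is guaranteed to expose `__version__`, so the code falls back to "unknown".

```python
import importlib

# Import names of the packages installed so far (pyOpenSSL imports as "OpenSSL").
REQUIRED = ["setuptools", "twisted", "w3lib", "lxml", "OpenSSL", "scrapy"]


def check_modules(names):
    """Try to import each module; map its name to (ok, detail).

    On success, detail is the version string (or "unknown").
    On failure, detail is the repr of the exception that was raised.
    """
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # broken installs can raise more than ImportError
            report[name] = (False, repr(exc))
        else:
            report[name] = (True, str(getattr(module, "__version__", "unknown")))
    return report


if __name__ == "__main__":
    for name, (ok, detail) in sorted(check_modules(REQUIRED).items()):
        print("%-12s %-7s %s" % (name, "ok" if ok else "FAILED", detail))
```

Run it with the same interpreter that will run Scrapy; a FAILED line for twisted or OpenSSL is usually the point at which pinning a different version is worth trying.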

    Install Selenium (only needed if you have to parse dynamic pages):

    • pip install selenium

    Install PhantomJS (only needed if you have to parse dynamic pages):

    • yum -y install wget fontconfig
    • wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
    • bzip2 -d phantomjs-2.1.1-linux-x86_64.tar.bz2
    • tar xvf phantomjs-2.1.1-linux-x86_64.tar -C /usr/local/
    • mv /usr/local/phantomjs-2.1.1-linux-x86_64/ /usr/local/phantomjs
    • ln -s /usr/local/phantomjs/bin/phantomjs /usr/bin/
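Selenium locates PhantomJS through PATH, so it may help to verify that the symlink is visible before writing any driver code. A minimal sketch follows; `find_executable` is a helper written for this example (not a Selenium or PhantomJS API) and uses only the standard library so that it also runs on Python 2.7.

```python
import os


def find_executable(name, path=None):
    """Return the full path of `name` if it is an executable on PATH, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    location = find_executable("phantomjs")
    if location:
        print("phantomjs found at %s" % location)
    else:
        print("phantomjs is not on PATH; check the symlink in /usr/bin")
```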

    Test Scrapy:

    • scrapy shell www.baidu.com

    Test Selenium and PhantomJS:

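    from selenium import webdriver
    # Start a headless PhantomJS browser (phantomjs must be on PATH)
    driver = webdriver.PhantomJS()
    driver.get("http://hotel.qunar.com/")
    # Print the page title to confirm the page was loaded and rendered
    data = driver.title
    print(data)
    # Shut down the phantomjs process
    driver.quit()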
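The same check can be wrapped in a function that always shuts the driver down, even when loading the page fails. This is a sketch, not the author's code: `fetch_title` and its `driver_factory` parameter are names introduced here, and the default relies on `webdriver.PhantomJS`, which exists in the Selenium releases contemporary with this post but, as far as I know, was removed in Selenium 4.

```python
def fetch_title(url, driver_factory=None):
    """Open `url` with a browser driver and return the page title.

    `driver_factory` is any callable returning an object with get(), title
    and quit(); when omitted, Selenium's PhantomJS driver is used.
    """
    if driver_factory is None:
        # Imported lazily so the function can be exercised without Selenium.
        from selenium import webdriver
        driver_factory = webdriver.PhantomJS
    driver = driver_factory()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always stop the browser process


# Usage (needs Selenium, phantomjs on PATH, and network access):
#     print(fetch_title("http://hotel.qunar.com/"))
```

Passing the factory in as a parameter also makes it easy to switch to another driver later without touching the rest of the code.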


  • Original article: https://www.cnblogs.com/jhc888007/p/7463714.html