  • Installing IPython from source and integrating Spark into IPython

    1. Install IPython

    Download IPython: https://pypi.python.org/packages/source/i/ipython/ipython-2.2.0.tar.gz#md5=b91d3724f655a8e16d022772f696cfd5
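The download URL above carries an `#md5=...` fragment, so the tarball can be checked before it is unpacked. The following is a minimal sketch using only the standard library; the helper names (`md5_from_url`, `file_md5`, `matches_url_md5`) are made up for this illustration and are not part of IPython or pip.

```python
import hashlib


def md5_from_url(url):
    """Return the hex digest from a '#md5=<hex>' URL fragment, or None."""
    marker = "#md5="
    if marker not in url:
        return None
    return url.split(marker, 1)[1].strip().lower()


def file_md5(path, chunk_size=65536):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_url_md5(path, url):
    """True if the file's md5 equals the digest embedded in the URL."""
    expected = md5_from_url(url)
    return expected is not None and file_md5(path) == expected
```

For example, `matches_url_md5("ipython-2.2.0.tar.gz", url)` should return True for an intact download, where `url` is the full link including the fragment. The same check applies to the other PyPI links in this post that include an md5.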

    cd /app/softwares/ipython
    tar -zxvf ipython-2.2.0.tar.gz
    cd ipython-2.2.0
    python2.7 setup.py install
    ln -s /usr/local/python2.7/bin/ipython /usr/bin/ipython
    

    2. Configure the IPython notebook

    ipython profile create nbserver
    cd ~/.ipython/profile_nbserver/
    
    openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mycert.pem -out mycert.pem
    

    Fill in the information at the prompts:

    Country Name (2 letter code) [XX]:CN
    State or Province Name (full name) []:Guangdong
    Locality Name (eg, city) [Default City]:Shenzhen
    Organization Name (eg, company) [Default Company Ltd]:*
    Organizational Unit Name (eg, section) []:ShuJuPingTaiBu
    Common Name (eg, your name or your server's hostname) []:*
    Email Address []:*
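Because the openssl command passes the same file to both `-keyout` and `-out`, mycert.pem should end up holding a private key block and a certificate block. A rough way to confirm that, sketched with the standard library only (the function names are made up for this example, and the check looks at the PEM headers, not at whether the key and certificate are valid):

```python
import re

# Matches headers such as "-----BEGIN PRIVATE KEY-----".
PEM_BEGIN = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_block_types(pem_text):
    """List the block labels found in PEM text, in file order."""
    return PEM_BEGIN.findall(pem_text)


def has_key_and_cert(pem_text):
    """True if the text has a private key block and a certificate block."""
    types = pem_block_types(pem_text)
    has_key = any(label.endswith("PRIVATE KEY") for label in types)
    has_cert = "CERTIFICATE" in types
    return has_key and has_cert
```

Usage would be along the lines of `has_key_and_cert(open("mycert.pem").read())`, run from the profile directory.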
    

    Generate the hashed password:

    python2.7 -c "import IPython;print IPython.lib.passwd()"
    
    Enter password:
    Verify password:
    sha1:5ba5d1a5aa4f:6edaa277f374497b1d026b799b473b3ef7f8c636
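The string printed above has the form algorithm:salt:hexdigest. As far as I understand the IPython 2.x implementation, the digest is computed over the passphrase followed by the salt; treat that as an assumption to confirm against IPython.lib.security. Under that assumption, the format can be reproduced and checked with hashlib alone. `hash_passphrase` and `check_passphrase` are names invented for this sketch.

```python
import hashlib


def hash_passphrase(passphrase, salt, algorithm="sha1"):
    """Build an 'algorithm:salt:hexdigest' string (assumed IPython format)."""
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, h.hexdigest()))


def check_passphrase(stored, passphrase):
    """Return True if passphrase reproduces the stored hashed string."""
    try:
        algorithm, salt, _digest = stored.split(":", 2)
    except ValueError:
        # Not in the three-field format.
        return False
    try:
        candidate = hash_passphrase(passphrase, salt, algorithm)
    except ValueError:
        # hashlib does not know the named algorithm.
        return False
    return candidate == stored
```

If the assumption holds, `check_passphrase(value_from_config, "your password")` gives a quick way to confirm that the string pasted into the config file matches the password you intend to type in the browser.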

    Edit the notebook configuration file in the profile directory:

    vi ipython_notebook_config.py

    # This starts plotting support always with matplotlib
    c.IPKernelApp.pylab = 'inline'
    
    # You must give the path to the certificate file.
    
    # If using a Linux VM:
    c.NotebookApp.certfile = u'/root/.ipython/profile_nbserver/mycert.pem'
    
    # Create your own password as indicated above
    c.NotebookApp.password = u'sha1:5ba5d1a5aa4f:6edaa277f374497b1d026b799b473b3ef7f8c636'
    
    # Network and browser details. We use a fixed port (9999) so it matches
    # our Windows Azure setup, where we've allowed traffic on that port
    
    c.NotebookApp.ip = '*'
    c.NotebookApp.port = 9999
    c.NotebookApp.open_browser = False
    

    Start the IPython notebook server:

     ipython notebook --profile=nbserver
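Once the server is running, a quick reachability check from another shell can confirm that something is listening on the configured port (9999 in the config above). This is only a TCP-level sketch with the standard library; it does not verify the certificate or the password, and `port_is_open` is a name made up here.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True
```

For instance, `port_is_open("127.0.0.1", 9999)` on the server itself, or the server's address from a client machine to also exercise any firewall in between.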
    

    Install pyzmq, which requires zeromq

    Download zeromq: http://download.zeromq.org/zeromq-4.0.4.tar.gz

    ./configure
    make && make install
    

    Download pyzmq: https://pypi.python.org/packages/source/p/pyzmq/pyzmq-14.3.1.tar.gz#md5=7196b4a6fbf98022f17ffa924be3d68d

    ln -s /usr/local/lib/libzmq.so.3 /usr/local/include/
    python2.7 setup.py install --zmq=/usr/local/

    Install Jinja2, which requires distribute

    Download Jinja2: https://pypi.python.org/packages/source/J/Jinja2/Jinja2-2.7.3.tar.gz

    python2.7 setup.py install
    

    Download distribute: https://pypi.python.org/packages/source/d/distribute/distribute-0.7.3.zip#md5=c6c59594a7b180af57af8a0cc0cf5b4a

    python2.7 setup.py install
    

    Install MarkupSafe: https://pypi.python.org/packages/source/M/MarkupSafe/MarkupSafe-0.23.tar.gz

    python2.7 setup.py install
    

    Install tornado, which requires backports.ssl_match_hostname and certifi

    https://pypi.python.org/packages/source/t/tornado/tornado-4.0.2.tar.gz
    https://pypi.python.org/packages/source/b/backports.ssl_match_hostname/backports.ssl_match_hostname-3.4.0.2.tar.gz
    https://pypi.python.org/packages/source/c/certifi/certifi-14.05.14.tar.gz
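After installing the packages above one by one, it can help to confirm that each is importable by the same interpreter that runs the notebook. A small sketch, assuming the usual import names for these distributions (pyzmq imports as `zmq`, Jinja2 as `jinja2`, MarkupSafe as `markupsafe`); `missing_modules` is a helper name invented here.

```python
import importlib

# Import names for the packages installed so far in this walkthrough.
NOTEBOOK_DEPS = [
    "zmq",
    "jinja2",
    "markupsafe",
    "tornado",
    "backports.ssl_match_hostname",
    "certifi",
]


def missing_modules(names):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Running `python2.7 -c "..."` with a call to `missing_modules(NOTEBOOK_DEPS)` should print an empty list once everything is in place; any name that is printed still needs to be installed.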


    Install sqlite3

    http://blog.csdn.net/gl1987807/article/details/7253021
    Install sqlite-devel.x86_64:

    yum install sqlite-devel.x86_64

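    After installing sqlite-devel, Python still reports that the sqlite3 module does not exist. To fix this, see http://stackoverflow.com/questions/1210664/no-module-named-sqlite3
    Recompile Python 2.7.5, then copy the newly built _sqlite3.so into the installed tree: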

    cp /app/softwares/python/Python-2.7.5/build/lib.linux-x86_64-2.7/_sqlite3.so /usr/local/python2.7/lib/python2.7/sqlite3/
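To confirm that the copied _sqlite3.so is picked up, a short smoke test against an in-memory database is enough. This is a sketch using only the standard library; `sqlite3_works` is a made-up name.

```python
def sqlite3_works():
    """Return True if sqlite3 imports and can run a trivial query."""
    try:
        import sqlite3
    except ImportError:
        return False
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute("SELECT 1 + 1").fetchone()
    finally:
        conn.close()
    return row == (2,)
```

A False result from python2.7 would suggest the extension module was not built or not copied to the directory the interpreter searches.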
    

    Install MathJax: https://github.com/mathjax/MathJax/archive/2.4.0.tar.gz

    cd /app/softwares/ipython
    python2.7 -m IPython.external.mathjax MathJax-2.4.0.tar.gz
    

    Test the IPython notebook, following the examples at http://www.cnblogs.com/cbscan/p/3545084.html

    from IPython.display import Latex
    Latex(r"$\sqrt{x^2+y^2}$")
    
    Out[1]:
    $\sqrt{x^2+y^2}$
    
    %load_ext sympyprinting
    from sympy import *
    x, y = symbols("x,y")
    sqrt(x**2+y**2)
    
    ImportError: No module named sympy 

    Download and install sympy: https://pypi.python.org/packages/source/s/sympy/sympy-0.7.5.tar.gz

    from sympy import init_printing ;
    init_printing()
    from sympy import *
    x, y = symbols("x,y")
    sqrt(x**2+y**2)
    
    Out[7]:
    $$\sqrt{x^{2} + y^{2}}$$
    
    %pylab inline
    
    plot(random.randn(100));
    
    ImportError: No module named matplotlib
    

    Download and install matplotlib: https://pypi.python.org/packages/source/m/matplotlib/matplotlib-1.4.0.tar.gz#md5=1daf7f2123d94745feac1a30b210940c

    Install a newer freetype: http://download.savannah.gnu.org/releases/freetype/freetype-2.5.3.tar.gz

    Install a newer numpy, followed by mock, nose, pyparsing, python-dateutil, and six:
    https://pypi.python.org/packages/source/n/numpy/numpy-1.9.0.tar.gz#md5=510cee1c6a131e0a9eb759aa2cc62609

    https://pypi.python.org/packages/source/m/mock/mock-1.0.1.tar.gz#md5=c3971991738caa55ec7c356bbc154ee2

    https://pypi.python.org/packages/source/n/nose/nose-1.3.4.tar.gz#md5=6ed7169887580ddc9a8e16048d38274d

    https://pypi.python.org/packages/source/p/pyparsing/pyparsing-2.0.2.tar.gz#md5=b170c5d153d190df1a536988d88e95c1

    https://pypi.python.org/packages/source/p/python-dateutil/python-dateutil-2.2.tar.gz#md5=c1f654d0ff7e33999380a8ba9783fd5c

    https://pypi.python.org/packages/source/s/six/six-1.8.0.tar.gz#md5=1626eb24cc889110c38f7e786ec69885


    3. Integrate Spark into the IPython notebook

    Add the following to /etc/profile:

    export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
    export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH
    export PYSPARK_PYTHON=python2.7
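The two PYTHONPATH exports make the `pyspark` package and the bundled py4j zip importable. If editing /etc/profile is not an option, roughly the same effect can be had inside a Python session by extending `sys.path`. The sketch below assumes SPARK_HOME is a POSIX-style path such as /app/spark and that the py4j zip name matches the one in the export line; both function names are made up for this example.

```python
import posixpath
import sys


def spark_python_paths(spark_home, py4j_zip="py4j-0.8.2.1-src.zip"):
    """Return the two entries that the /etc/profile lines add to PYTHONPATH."""
    return [
        posixpath.join(spark_home, "python"),
        posixpath.join(spark_home, "python", "lib", py4j_zip),
    ]


def add_spark_to_sys_path(spark_home, path=None):
    """Prepend the Spark entries to path (sys.path by default), once each."""
    target = sys.path if path is None else path
    for entry in spark_python_paths(spark_home):
        if entry not in target:
            target.insert(0, entry)
    return target
```

Calling `add_spark_to_sys_path(os.environ["SPARK_HOME"])` before `from pyspark import SparkContext` would be the typical use. The resulting order (py4j zip first, then the python directory) mirrors what the two export lines produce.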
    

    Test in Python:

    >>> from pyspark import SparkConf, SparkContext
    >>> conf = SparkConf().setMaster("spark://ip:19002").setAppName("pyspark")
    >>> sc = SparkContext(conf = conf)
    >>> data = [1, 2, 3, 4, 5]
    >>> distData = sc.parallelize(data, 1)
    >>> distData
    ParallelCollectionRDD[0] at parallelize at PythonRDD.scala:315
    >>> distData.count()
    >>> distData.first()
    

    Alternatively, the following command imports the Spark modules and initializes a SparkContext:

    execfile("/app/spark/python/pyspark/shell.py")
    

    After that, sc can be used directly. A test:

    file = sc.textFile("/tmp/test_spark/input")
    data = file.flatMap(lambda line: line.split(" "))
    data.collect()
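For readers without a cluster at hand, the effect of the `flatMap` call above can be illustrated in plain Python: each line is split into words and the per-line lists are concatenated into one flat list. This is a stand-in written for illustration, not Spark code; `flat_map` and the sample lines are made up.

```python
def flat_map(func, items):
    """Apply func to each item and concatenate the resulting sequences."""
    result = []
    for item in items:
        result.extend(func(item))
    return result


# Made-up sample input standing in for the lines of the text file.
lines = ["hello spark", "hello ipython notebook"]
words = flat_map(lambda line: line.split(" "), lines)
# words == ["hello", "spark", "hello", "ipython", "notebook"]
```

In Spark the same transformation is lazy and distributed; nothing is computed until an action such as `collect()` is called.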

      

  • Original article: https://www.cnblogs.com/Cherise/p/4351022.html