  • Setting up PySpark in Jupyter Notebook

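    PySpark does not work with Python 3.6, at least not with the Spark 1.6 release I am running. Fortunately I use Anaconda, which makes switching Python versions easy. Since my Spark is 1.6, Python 2.7 should work.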

    First, create and activate a Python 2.7 environment:

    conda create -n py27 python=2.7 anaconda
    source activate py27
    conda install python=2.7
    This switches the current Python environment to 2.7 (PySpark actually runs even if you skip this step). Then edit the kernel spec file:

    /usr/local/share/jupyter/kernels/pyspark/kernel.json

    {
      "display_name": "PySpark",
      "language": "python",
      "argv": [ "/home/.../anaconda3/envs/py27/bin/python", "-m", "ipykernel", "-f", "{connection_file}" ],
      "env": {
        "SPARK_HOME": "/.../spark/spark-1.6.0-bin-hadoop2.6/",
        "PYSPARK_PYTHON": "/.../anaconda3/envs/py27/bin/python",
        "PYSPARK_DRIVER_PYTHON": "ipython2",
        "PYTHONPATH": "/.../spark/spark-1.6.0-bin-hadoop2.6/python/:/.../spark/spark-1.6.0-bin-hadoop2.6/python/lib/py4j-0.9-src.zip",
        "PYTHONSTARTUP": "/.../spark/spark-1.6.0-bin-hadoop2.6/python/pyspark/shell.py",
        "PYSPARK_SUBMIT_ARGS": "--name pyspark --master yarn --deploy-mode client --conf spark.executor.instances=8 --conf spark.executor.memory=5g --conf spark.driver.memory=8g --conf spark.sql.caseSensitive=false --conf spark.yarn.queue=queue1 pyspark-shell"
      }
    }
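
    A typo in this file usually only shows up when the kernel fails to start, so a quick sanity check can help. The following is a minimal sketch, not part of the original setup: the helper name check_kernel_spec and the list of required keys are my own assumptions, based only on the keys used in the file above.

```python
import shlex

# Assumption: these are the env keys this particular setup relies on.
REQUIRED_ENV = ("SPARK_HOME", "PYSPARK_PYTHON", "PYTHONPATH",
                "PYTHONSTARTUP", "PYSPARK_SUBMIT_ARGS")


def check_kernel_spec(spec):
    """Return a list of human-readable problems found in a kernel spec dict."""
    problems = []

    # Jupyter replaces the literal "{connection_file}" when it launches the kernel.
    if "{connection_file}" not in spec.get("argv", []):
        problems.append("argv is missing the {connection_file} placeholder")

    env = spec.get("env", {})
    for key in REQUIRED_ENV:
        if not env.get(key):
            problems.append("env is missing " + key)

    # When set explicitly, the submit arguments should end with "pyspark-shell".
    submit_args = shlex.split(env.get("PYSPARK_SUBMIT_ARGS", ""))
    if submit_args and submit_args[-1] != "pyspark-shell":
        problems.append("PYSPARK_SUBMIT_ARGS should end with pyspark-shell")

    # py4j ships as a zip inside the Spark distribution and must be importable.
    if "py4j" not in env.get("PYTHONPATH", ""):
        problems.append("PYTHONPATH does not mention the py4j zip")

    return problems
```

    To use it, load the file with json.load and pass the resulting dict to check_kernel_spec; an empty list means none of these particular checks failed, not that the kernel is guaranteed to start.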

    Note: do not change the path of this file; it is read directly from this location.
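
    With the kernel spec in that location, a notebook started with the PySpark kernel should run shell.py at startup (via PYTHONSTARTUP), which in Spark 1.6 defines a SparkContext named sc. The sketch below is one way to confirm the context actually reaches the cluster; spark_smoke_test is a hypothetical helper name, and it assumes sc already exists in the notebook.

```python
def spark_smoke_test(sc, n=10):
    """Sum the squares of 0..n-1 through Spark and compare with a local result."""
    expected = sum(x * x for x in range(n))
    # parallelize -> map -> sum forces a real job to run on the executors.
    actual = sc.parallelize(range(n)).map(lambda x: x * x).sum()
    return actual == expected
```

    In a notebook cell, call spark_smoke_test(sc). If sc is not defined at all, the startup script probably did not run, and the PYTHONSTARTUP and PYTHONPATH entries are the first things to check.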

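    Keep in mind that because PySpark now uses Python 2.7, any operation that touches the Python environment should be done after switching to that environment (source activate py27). For example, activate it first and only then run conda install package.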
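
    It is easy to forget which environment is active. As a rough check, the sketch below reports the interpreter a session is running on; interpreter_summary is a hypothetical helper, and the optional arguments exist only so it can be tried with other values.

```python
import sys


def interpreter_summary(version_info=None, executable=None):
    """Describe the interpreter and whether it is Python 2.7."""
    if version_info is None:
        version_info = sys.version_info
    if executable is None:
        executable = sys.executable
    major_minor = tuple(version_info[:2])
    return {
        "executable": executable,            # path of the python binary in use
        "version": "%d.%d" % major_minor,    # e.g. "2.7"
        "is_py27": major_minor == (2, 7),
    }


print(interpreter_summary())
```

    Run inside the PySpark kernel, the executable should point into the py27 environment configured in kernel.json; run in a terminal, it shows whether source activate py27 has taken effect.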

    References:

    https://ipython.org/ipython-doc/3/notebook/public_server.html

    http://cleverowl.uk/2016/10/15/installing-jupyter-with-the-pyspark-and-r-kernels-for-spark-development/

    https://conda.io/docs/py2or3.html#use-a-different-version-of-python

    https://conda.io/docs/config.html

  • Original article: https://www.cnblogs.com/jiang-Xin/p/6651981.html