  • Installing pyspark on Ubuntu

    1. Install JDK 1.8 (not covered here).

    2. Run pip install pyspark in a terminal (the simplest installation method given on the official site).

    The output looks like this:

    Collecting pyspark
      Downloading https://files.pythonhosted.org/packages/ee/2f/709df6e8dc00624689aa0a11c7a4c06061a7d00037e370584b9f011df44c/pyspark-2.3.1.tar.gz (211.9MB)
        100% |████████████████████████████████| 211.9MB 8.3kB/s 
    Requirement already satisfied: py4j==0.10.7 in ./anaconda3/lib/python3.6/site-packages (from pyspark)
    Building wheels for collected packages: pyspark
      Running setup.py bdist_wheel for pyspark ... done
      Stored in directory: /home/tan/.cache/pip/wheels/37/48/54/f1b63f0dbb729e20c92f1bbcf1c53c03b300e0b93ca1781526
    Successfully built pyspark
    Installing collected packages: pyspark
    Successfully installed pyspark-2.3.1

    After the installation finishes, typing pyspark in the terminal fails to start it:

    tan@tan-Precision-Tower-3620:~$ pyspark
    JAVA_HOME is not set
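
    Before changing anything, it can help to see what the current environment actually provides. The following is a minimal sketch, not part of the original steps: check_java_home is a hypothetical helper name, and it only assumes the usual JDK layout in which the launcher sits at bin/java under the JDK directory.

```python
import os
import shutil


def check_java_home(java_home):
    """Return the path to bin/java under java_home, or None if it is not usable."""
    if not java_home:
        return None
    java_bin = os.path.join(java_home, "bin", "java")
    # The launcher must exist and be executable for it to be of any use
    if os.path.isfile(java_bin) and os.access(java_bin, os.X_OK):
        return java_bin
    return None


if __name__ == "__main__":
    java_home = os.environ.get("JAVA_HOME")
    java_bin = check_java_home(java_home)
    if java_bin is None:
        print("JAVA_HOME is unset or has no bin/java:", java_home)
    else:
        print("java launcher found at", java_bin)
    # Also report whether any java is reachable through PATH
    print("java on PATH:", shutil.which("java"))
```

    If both lines report nothing usable, the JDK path still has to be made known to Spark.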

    Solution:

    Find the pyspark installation path (running pip install again prints where the package is already installed):

    tan@tan-Precision-Tower-3620:~$ pip install pyspark
    Requirement already satisfied: pyspark in ./anaconda3/lib/python3.6/site-packages
    Requirement already satisfied: py4j==0.10.7 in ./anaconda3/lib/python3.6/site-packages (from pyspark)
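
    The same path can also be looked up from Python itself. This is a rough sketch rather than the method used above: find_package_dir is a hypothetical helper name, and the bin/load-spark-env.sh location under the package directory is an assumption about how the pip package is laid out, so check that the file exists before editing it.

```python
import importlib.util
import os


def find_package_dir(name):
    """Return the directory a top-level package is installed in, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__.py
    return os.path.dirname(spec.origin)


if __name__ == "__main__":
    pkg_dir = find_package_dir("pyspark")
    if pkg_dir is None:
        print("pyspark is not installed for this interpreter")
    else:
        # Assumed location of the script inside the pip-installed package
        script = os.path.join(pkg_dir, "bin", "load-spark-env.sh")
        print(script, "(exists)" if os.path.isfile(script) else "(not found)")
```

    Run it with the same interpreter that pip installed into (here, the Anaconda python), otherwise it may report that pyspark is missing.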

    Once you have the path, add the JDK installation path to the load-spark-env.sh file:

    export JAVA_HOME=/home/tan/jdk1.8.0_181

    Save the file and type pyspark in the terminal again; this time it starts successfully:

    tan@tan-Precision-Tower-3620:~$ pyspark
    Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19) 
    [GCC 7.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    2018-07-29 12:37:48 WARN  Utils:66 - Your hostname, tan-Precision-Tower-3620 resolves to a loopback address: 127.0.1.1; using 192.168.0.100 instead (on interface enp0s31f6)
    2018-07-29 12:37:48 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
    2018-07-29 12:37:48 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /__ / .__/\_,_/_/ /_/\_\   version 2.3.1
          /_/
    
    Using Python version 3.6.4 (default, Jan 16 2018 18:10:19)
    SparkSession available as 'spark'.
    >>> 
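
    To confirm that the installation does real work and not just start a prompt, a tiny job can be run. The sketch below is an addition, not part of the original walkthrough: run_smoke_test is a hypothetical name, and it builds its own local session so that it also works outside the pyspark shell (inside the shell, the existing spark object could be used directly).

```python
def run_smoke_test():
    """Start a local Spark session, count a 5-row range, and return the count."""
    # Imported here so the function can be defined even where pyspark is absent
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")        # single local thread, no cluster needed
        .appName("smoke-test")     # arbitrary name chosen for this sketch
        .getOrCreate()
    )
    try:
        # spark.range(5) yields rows 0..4, so the count should be 5
        return spark.range(5).count()
    finally:
        spark.stop()
```

    Calling run_smoke_test() should return 5 if Java and pyspark are both set up correctly.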

    Done.

  • Original article: https://www.cnblogs.com/tsdblogs/p/9384991.html