  • Using Spark's "Hadoop Free" Build

    Spark uses Hadoop client libraries for HDFS and YARN. Starting in version 1.4, the project packages "Hadoop free" builds that let you more easily connect a single Spark binary to any Hadoop version. To use these builds, you need to modify SPARK_DIST_CLASSPATH to include Hadoop's package jars. The most convenient place to do this is by adding an entry in conf/spark-env.sh.

    This page describes how to connect Spark to Hadoop for different types of distributions.


    Apache Hadoop

    For Apache distributions, you can use Hadoop’s ‘classpath’ command. For instance:

    ### in conf/spark-env.sh ###

    # If 'hadoop' binary is on your PATH
    export SPARK_DIST_CLASSPATH=$(hadoop classpath)

    # With explicit path to 'hadoop' binary
    export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)

    # Passing a Hadoop configuration directory
    export SPARK_DIST_CLASSPATH=$(hadoop --config /path/to/configs classpath)
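    Each of the commands above fills SPARK_DIST_CLASSPATH with a ':'-separated list of Hadoop jar locations, which Spark's launcher appends to its JVM classpath at startup. A minimal sketch of a sanity check before launching Spark (the Hadoop install paths below are made-up examples standing in for the output of 'hadoop classpath', not paths from this page):

    ```shell
    #!/bin/sh
    # Emulate what 'hadoop classpath' would print: a ':'-separated jar list.
    # These directories are hypothetical examples.
    fake_hadoop_classpath="/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn/*"
    export SPARK_DIST_CLASSPATH="$fake_hadoop_classpath"

    # Sanity check: confirm the variable is non-empty and count its entries
    # before starting spark-shell or spark-submit.
    echo "$SPARK_DIST_CLASSPATH" | tr ':' '\n' | wc -l
    ```

    If the count is 0 or the variable is empty, Spark will start but fail with ClassNotFoundException as soon as it touches HDFS or YARN classes, so checking here saves a confusing failure later.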
  • Original source: https://www.cnblogs.com/hmy-blog/p/7837257.html