- Download Scala, and set SCALA_HOME and PATH.
- Download Hadoop, and set HADOOP_HOME and PATH. I used hadoop2.6_Win_x64-master; the key piece is the winutils.exe utility.
- Download Spark, mainly to obtain spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar.
- Download scala-SDK, and configure the JRE Library and Scala Library.
- Create a new Scala Project and add spark-assembly-1.6.0-hadoop2.6.0.jar to the project's Libraries.
- With val conf = new SparkConf().setMaster("local").setAppName("FileWordCount"), the Spark application can run locally.

You can also select local mode through the --master option of spark-submit:
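A minimal sketch of such a local-mode application, built around the SparkConf line above. The input path and the word-count logic are illustrative assumptions, not from the original notes:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal local-mode word count matching setAppName("FileWordCount") above.
object FileWordCount {
  def main(args: Array[String]): Unit = {
    // "local" runs the driver and a single worker thread in this JVM.
    val conf = new SparkConf().setMaster("local").setAppName("FileWordCount")
    val sc = new SparkContext(conf)

    val counts = sc.textFile("input.txt")   // hypothetical input file
      .flatMap(_.split("\\s+"))             // split lines into words
      .map(word => (word, 1))               // pair each word with a count of 1
      .reduceByKey(_ + _)                   // sum counts per word

    counts.collect().foreach(println)
    sc.stop()
  }
}
```

Because setMaster("local") is hard-coded, this runs inside the IDE with no cluster; for a real deployment you would drop the setMaster call and pass --master to spark-submit instead.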
- local: Run Spark locally with one worker thread (i.e. no parallelism at all).
- local[K]: Run Spark locally with K worker threads (ideally, set this to the number of cores on your machine).
- local[*]: Run Spark locally with as many worker threads as logical cores on your machine.
- spark://HOST:PORT: Connect to the given Spark standalone cluster master. The port must be whichever one your master is configured to use, which is 7077 by default.
- mesos://HOST:PORT: Connect to the given Mesos cluster. The port must be whichever one your master is configured to use, which is 5050 by default. Or, for a Mesos cluster using ZooKeeper, use mesos://zk://.... To submit with --deploy-mode cluster, the HOST:PORT should be configured to connect to the MesosClusterDispatcher.
- yarn: Connect to a YARN cluster in client or cluster mode depending on the value of --deploy-mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.
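For example, a spark-submit invocation that runs an application jar in local mode on all logical cores could look like this. The class name, jar name, and input argument are placeholders, not paths from the original notes:

```shell
# Run locally with as many worker threads as logical cores;
# the class, jar, and input file below are illustrative placeholders.
spark-submit \
  --master "local[*]" \
  --class FileWordCount \
  FileWordCount.jar input.txt
```

Quoting local[*] keeps the shell from treating the brackets as a glob pattern; switching to a cluster is then just a matter of replacing the --master value (e.g. spark://HOST:7077 or yarn).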
See also:
http://spark.apache.org/docs/latest/submitting-applications.html