    Spark standalone installation - minimal cluster deployment (the Spark documentation recommends Standalone mode)
        
        Cluster plan:
        Host       IP              Software    Processes
        sc1        192.168.1.61    spark       Master, Worker
        sc2        192.168.1.62    spark       Worker
        sc3        192.168.1.63    spark       Worker
        
        1. It is recommended to deploy Spark Worker nodes on the same machines as Hadoop DataNodes. The two then compete for memory, so configure what share of memory Spark and Hadoop may each use.
        2. Install spark-1.4.1-bin-hadoop2.6.tgz on sc1
            2.1: Upload spark-1.4.1-bin-hadoop2.6.tgz to the /usr/local/soft directory on sc1
                Use WinSCP to upload spark-1.4.1-bin-hadoop2.6.tgz to /usr/local/soft on the sc1 node.
            2.2: Extract spark-1.4.1-bin-hadoop2.6.tgz into /usr/local/installs/ on sc1
                cd /usr/local/soft
                tar -zxvf spark-1.4.1-bin-hadoop2.6.tgz -C /usr/local/installs/
                cd ../installs/
            2.3: Rename the extracted spark-1.4.1-bin-hadoop2.6 directory to spark141-hadoop26
                mv spark-1.4.1-bin-hadoop2.6 spark141-hadoop26
            2.4: Edit the Spark configuration files (spark-env.sh, slaves)
                cd /usr/local/installs/spark141-hadoop26/conf
                cp spark-env.sh.template spark-env.sh
                cp slaves.template slaves
                vim slaves
                    sc1
                    sc2
                    sc3
                vim spark-env.sh
                    export SPARK_MASTER_IP=sc1
                    export JAVA_HOME=/usr/local/installs/java
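                Following up on the co-location note in step 1, spark-env.sh is also where Spark's resource appetite can be capped so a Worker leaves room for a co-located Hadoop DataNode. A minimal sketch; the values below are illustrative placeholders, not tuned recommendations:

```shell
# Optional additions to spark-env.sh: cap what this node's Worker may hand out
# to executors so Spark does not starve a co-located Hadoop DataNode.
# All three values are placeholders; size them against the node's actual RAM.
export SPARK_WORKER_MEMORY=2g      # total memory executors may use on this node
export SPARK_WORKER_CORES=2        # total CPU cores executors may use on this node
export SPARK_DAEMON_MEMORY=512m    # heap for the Master/Worker daemons themselves
```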
            2.5: Distribute the configured Spark installation to the sc2 and sc3 nodes
                scp -rq /usr/local/installs/spark141-hadoop26/ sc2:/usr/local/installs/
                scp -rq /usr/local/installs/spark141-hadoop26/ sc3:/usr/local/installs/
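                Steps 2.4 and 2.5 can be scripted so all three hosts get identical files. A sketch assuming the host names above; CONF_DIR defaults to a local demo directory so it can be tried anywhere, and the scp commands are echoed as a dry run rather than executed:

```shell
#!/bin/sh
# Generate the slaves and spark-env.sh files from step 2.4, then print the
# distribution commands from step 2.5. CONF_DIR defaults to a local demo
# directory; on the real master, point it at the install's conf/ directory.
CONF_DIR=${CONF_DIR:-./spark-conf-demo}
mkdir -p "$CONF_DIR"

# slaves: one Worker host per line
printf '%s\n' sc1 sc2 sc3 > "$CONF_DIR/slaves"

# spark-env.sh: master host and JDK location
cat > "$CONF_DIR/spark-env.sh" <<'EOF'
export SPARK_MASTER_IP=sc1
export JAVA_HOME=/usr/local/installs/java
EOF

# Distribution (step 2.5), printed as a dry run; drop the echo on the real sc1.
for host in sc2 sc3; do
    echo scp -rq /usr/local/installs/spark141-hadoop26/ "$host:/usr/local/installs/"
done
```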
            2.6: Start the Spark cluster
                /usr/local/installs/spark141-hadoop26/sbin/start-all.sh
                Startup prints output like the following:
                    [root@sc1 spark141-hadoop26]# sbin/start-all.sh
                    starting org.apache.spark.deploy.master.Master, logging to /usr/local/installs/spark141-hadoop26/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-sc1.out
                    sc3: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/installs/spark141-hadoop26/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-sc3.out
                    sc2: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/installs/spark141-hadoop26/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-sc2.out
                    sc1: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/installs/spark141-hadoop26/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-sc1.out
            2.7: Check the Spark cluster's startup status (by process list, and via the web UI)
                Check the running processes
                    for i in sc1 sc2 sc3; do echo $i; ssh $i `which jps`; done
                        sc1
                        2401 Worker
                        2256 Master
                        2497 Jps
                        sc2
                        5692 Jps
                        5619 Worker
                        sc3
                        5610 Worker
                        5681 Jps
                Check via the web UI
                    http://sc1:8080/
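                The jps listing above can also be sanity-checked mechanically: with the layout in this guide, a healthy cluster shows exactly one Master and three Workers. A small sketch that tallies jps-style lines (fed here with a sample like the output above; on the cluster, pipe in the ssh loop from step 2.7 instead):

```shell
# Tally Master/Worker entries from jps-style output on stdin.
printf '%s\n' '2401 Worker' '2256 Master' '2497 Jps' '5692 Jps' \
              '5619 Worker' '5610 Worker' '5681 Jps' |
awk '/ Master$/ {m++} / Worker$/ {w++} END {printf "masters=%d workers=%d\n", m+0, w+0}'
# prints: masters=1 workers=3
```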
        3. Run spark-shell
            Create a Spark RDD from a local file:
            val rdd1 = sc.textFile("/usr/local/installs/spark_rdd1")
            rdd1.collect
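            Note that sc.textFile with a plain local path is resolved on each Worker, so the input file must exist at that same path on every node. A sketch that creates a demo input file; RDD_FILE defaults to a local path so it can be tried anywhere, and the copy commands are echoed as a dry run:

```shell
#!/bin/sh
# Create the input file read by sc.textFile above. With a plain local path,
# every Worker reads the path locally, so the file must exist on all nodes.
RDD_FILE=${RDD_FILE:-./spark_rdd1}   # on the cluster: /usr/local/installs/spark_rdd1

printf 'hello spark\nhello standalone\n' > "$RDD_FILE"

# Push the identical file to the other nodes (dry run; drop the echo on sc1):
for host in sc2 sc3; do
    echo scp -q "$RDD_FILE" "$host:/usr/local/installs/"
done
```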
           

  • Original article: https://www.cnblogs.com/mengyao/p/4656867.html