  • 2. Spark 2.x Cluster Deployment and Testing

    Configure passwordless SSH login

    Run ssh-keygen -t rsa
    # This creates the .ssh directory; press Enter through every prompt. The generated key pair, id_rsa and id_rsa.pub,
    is stored in ~/.ssh by default.

    chmod 755 .ssh   # set 755 permissions
    cd .ssh
    ls -l
    id_rsa id_rsa.pub

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    # Append the public key to the authorized_keys file (this file must end up with 644 permissions)
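    For example, the required mode can be set explicitly:

    chmod 644 ~/.ssh/authorized_keys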

    Now set up the public key on the slave1 node.

    Run ssh-keygen -t rsa
    # This creates the .ssh directory; press Enter through every prompt. The generated key pair, id_rsa and id_rsa.pub,
    is stored in ~/.ssh by default.

    chmod 755 .ssh   # set 755 permissions
    cd .ssh
    ls -l
    id_rsa id_rsa.pub

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    # Append the public key to the authorized_keys file (this file must end up with 644 permissions)

    ssh slave1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

    Run on the master, this appends slave1's public key to the master's authorized_keys. Run it once for each slave node; slave1 is the node's hostname.
    scp ~/.ssh/authorized_keys slave1:~/.ssh/
    # Copy the authorized_keys file back to every slave node; slave1 is the node's hostname.
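    As a quick check, a remote command should now run without a password prompt, for example:

    ssh slave1 hostname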

     

    You can see that the nodes can now log in to each other without a password.

    Extract Scala and Spark

    1. Remove the Spark that ships with CDH:
    rm -rf /usr/bin/spark*
    rm -rf /etc/spark


    2. Upload spark-2.0.0-bin-hadoop2.6.tgz and scala-2.11.8.tgz to /opt/soft/spark2.0, then extract them:

    tar -zxf scala-2.11.8.tgz
    tar -zxf spark-2.0.0-bin-hadoop2.6.tgz 

     

    vi /etc/profile and add the following:

     

    export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
    export SPARK_HOME=/opt/soft/spark2.0/spark-2.0.0-bin-hadoop2.6
    export SCALA_HOME=/opt/soft/spark2.0/scala-2.11.8
    export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
    export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export PATH=$PATH:$JAVA_HOME/bin:$SPARK_HOME/bin:$HADOOP_HOME/bin:$SCALA_HOME/bin
    export HADOOP_CONF_DIR=/etc/hadoop/conf

     

    Run source /etc/profile for the changes to take effect.
    Configure these environment variables on both nodes.
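    As a quick sanity check of the environment variables, for example:

    scala -version
    spark-submit --version
    echo $HADOOP_CONF_DIR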

    3. Modify the files under $SPARK_HOME/conf
    mv slaves.template slaves   # list the worker node hostnames in slaves
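    With the single worker node used here, the slaves file would simply contain:

    # $SPARK_HOME/conf/slaves -- one worker hostname per line
    slave1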

     

     

    mv spark-env.sh.template spark-env.sh   # spark-env.sh is where extra environment variables go; since we use YARN mode,
    nothing needs to be configured here.

     

    4. Run a test
    Before 2.0, Spark on YARN had two run modes, yarn-cluster and yarn-client, the former being recommended.
    In 2.0, --master yarn-cluster and --master yarn-client are deprecated; use --master yarn together with --deploy-mode, as sketched below.
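    A sketch of the equivalent 2.0 invocations (options only; the class and jar are the same as in the full command further down):

    spark-submit --master yarn --deploy-mode client ...    # replaces --master yarn-client (client is the default)
    spark-submit --master yarn --deploy-mode cluster ...   # replaces --master yarn-cluster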

    run-example is convenient for a quick test of the environment:
    run-example SparkPi               # runs in local mode
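    run-example forwards extra arguments to the example class; for SparkPi the argument is the number of partitions, e.g.:

    run-example SparkPi 10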

     

    Run in distributed (YARN) mode:
    spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --num-executors 1 \
    --driver-memory 1g \
    --executor-memory 1g \
    --executor-cores 1 \
    --conf "spark.app.name=SparkPi" \
    /opt/soft/spark2.0/spark-2.0.0-bin-hadoop2.6/examples/jars/spark-examples_2.11-2.0.0.jar
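    If the application is accepted by YARN, it can also be inspected from the command line (the application id below is a placeholder):

    yarn application -list
    yarn logs -applicationId <application_id>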

    You can see that it fails with an error.

    If it fails because of insufficient memory, adjust the YARN configuration in CM and set the following two parameters to 2 GB:
    yarn.scheduler.maximum-allocation-mb
    yarn.nodemanager.resource.memory-mb

    Search for the two parameters in the CM configuration page and change them there.

    Deploying the client configuration pushes the parameters changed in the CM UI to the xml configuration files on every node.
    Then restart the YARN service for the changes to take effect.
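    After the client configuration is deployed, the two parameters should show up in yarn-site.xml (under /etc/hadoop/conf) on each node roughly like this, assuming both are set to 2 GB (2048 MB):

    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>2048</value>
    </property>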

    Run it again:

    19/12/11 04:50:11 INFO server.ServerConnector: Stopped ServerConnector@3eac12e0{HTTP/1.1}{0.0.0.0:4041}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1994ddfb{/stages/stage/kill,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3bf9fe08{/api,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@29f78fae{/,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@37204e58{/static,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4183d912{/executors/threadDump/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3ef0642{/executors/threadDump,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1a25417f{/executors/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@496168c4{/executors,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4902985e{/environment/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5742691b{/environment,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6f0b6d81{/storage/rdd/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3d2b710e{/storage/rdd,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@424ace42{/storage/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@23e5c67f{/storage,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4c254927{/stages/pool/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5e2acdad{/stages/pool,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7775e958{/stages/stage/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7fc94782{/stages/stage,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@12859445{/stages/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@66ff8927{/stages,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6da6e106{/jobs/job/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6d6e70aa{/jobs/job,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@45cdfe2{/jobs/json,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@639bb83d{/jobs,null,UNAVAILABLE}
    19/12/11 04:50:11 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.199.130:4041
    19/12/11 04:50:11 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
    19/12/11 04:50:11 INFO cluster.YarnClientSchedulerBackend: Stopped
    19/12/11 04:50:11 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    19/12/11 04:50:11 INFO memory.MemoryStore: MemoryStore cleared
    19/12/11 04:50:11 INFO storage.BlockManager: BlockManager stopped
    19/12/11 04:50:11 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
    19/12/11 04:50:11 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
    19/12/11 04:50:11 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    19/12/11 04:50:11 INFO spark.SparkContext: Successfully stopped SparkContext
    Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
            at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
            at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
            at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
            at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
            at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
            at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3885)
            at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3868)
            at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:3850)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6826)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4562)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4532)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4505)
            at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:884)
            at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:328)
            at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:641)
            at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
            at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
    
            at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
            at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
            at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
            at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
            at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2744)
            at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2713)
            at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
            at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
            at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
            at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:866)
            at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:859)
            at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1817)
            at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:597)
            at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:385)
            at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:834)
            at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:167)
            at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
            at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:149)
            at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
            at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2256)
            at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
            at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
            at scala.Option.getOrElse(Option.scala:121)
            at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
            at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
            at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
            at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
            at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
            at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
            at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
            at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
            at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
            at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
            at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:162)
            at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
            at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3885)
            at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3868)
            at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:3850)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6826)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4562)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4532)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4505)
            at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:884)
            at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:328)
            at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:641)
            at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
            at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
    
            at org.apache.hadoop.ipc.Client.call(Client.java:1469)
            at org.apache.hadoop.ipc.Client.call(Client.java:1400)
            at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
            at com.sun.proxy.$Proxy16.mkdirs(Unknown Source)
            at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:539)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
            at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
            at com.sun.proxy.$Proxy17.mkdirs(Unknown Source)
            at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2742)
            ... 30 more
    19/12/11 04:50:11 INFO util.ShutdownHookManager: Shutdown hook called
    19/12/11 04:50:11 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-b384a292-7755-4080-a42d-29b44ad13b95

    This problem is solved as follows.

    In the HDFS configuration in CM, uncheck the permission-checking option (dfs.permissions, "Check HDFS Permissions"), which is enabled by default; unchecking it is enough.

    Then restart HDFS.
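    Alternatively, a sketch that keeps HDFS permission checking enabled: since the error is about writing under /user as root, create a home directory for root as the hdfs superuser:

    sudo -u hdfs hdfs dfs -mkdir -p /user/root
    sudo -u hdfs hdfs dfs -chown root:root /user/root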

    Run it again.

    You can see it succeeds!
