  • Storm on YARN installation: application submitted to YARN ends up FAILED

       I have recently been deploying Storm on YARN, following these articles:

    http://www.tuicool.com/articles/BFr2Yv
    http://blog.csdn.net/jiushuai/article/details/18729367


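    After installing ZooKeeper and configuring Storm and storm-yarn, I started ZooKeeper (listening on port 2181).
    Building the project with mvn package produced errors, so I rebuilt with mvn package -DskipTests to skip the tests.
    Then I submitted the Storm application to YARN: storm-yarn launch <path to your storm.yaml file>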

    After submitting, I checked localhost:8088 and saw that the application had FAILED, with the following diagnostics:
    Application application_1411179375629_0005 failed 2 times due to AM Container for appattempt_1411179375629_0005_000002 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
    org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
    Container exited with a non-zero exit code 1
    .Failing this attempt.. Failing the application. 

    The logs showed the specific error:

    14/09/19 20:44:55 INFO yarn.MasterServer: Starting Master Thrift Server
    14/09/19 20:44:55 ERROR auth.ThriftServer: ThriftServer is being stopped due to: org.apache.thrift7.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9000.
    org.apache.thrift7.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9000.
            at org.apache.thrift7.transport.TNonblockingServerSocket.<init>(TNonblockingServerSocket.java:89)
            at org.apache.thrift7.transport.TNonblockingServerSocket.<init>(TNonblockingServerSocket.java:68)
            at org.apache.thrift7.transport.TNonblockingServerSocket.<init>(TNonblockingServerSocket.java:61)
            at backtype.storm.security.auth.SimpleTransportPlugin.getServer(SimpleTransportPlugin.java:47)
            at backtype.storm.security.auth.ThriftServer.serve(ThriftServer.java:52)
            at com.yahoo.storm.yarn.MasterServer.main(MasterServer.java:175)

     The Master Thrift Server failed to start because of a port conflict: HDFS was already listening on port 9000, so the ServerSocket could not be created.
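
A quick way to confirm this kind of conflict before launching is to try binding the port yourself. The sketch below is only an illustration, not part of storm-yarn; the helper name port_is_free is an assumption made for this example, and the ports 9000 and 9001 are the ones discussed in this post.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another process already holds the port.
            return False
        return True


if __name__ == "__main__":
    # 9000 is the default that collided with HDFS here; 9001 is the replacement.
    for candidate in (9000, 9001):
        state = "free" if port_is_free(candidate) else "in use"
        print(f"port {candidate}: {state}")
```

Run it on the node where the application master will start; a port reported as "in use" needs a different value in the configuration.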

    Thanks to Google, I found that the fix is to change the port number, as described in this thread:

    https://groups.google.com/forum/#!topic/storm-yarn/A1ds1M6qmN8

    Edit storm-yarn-master/src/main/resources/master_defaults.yaml and change master.thrift.port to another suitable value; I used 9001.
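
If you would rather script this change than edit the file by hand, a minimal sketch could look like the following. It assumes the setting appears as a plain top-level "key: value" line; the helper name set_yaml_scalar and the other key in the sample text are hypothetical. Note that it replaces everything after the colon on that line, including any trailing comment.

```python
import re


def set_yaml_scalar(text: str, key: str, value: str) -> str:
    """Replace the value on a top-level 'key: value' line; other lines are untouched."""
    pattern = re.compile(rf"^({re.escape(key)}[ \t]*:[ \t]*).*$", flags=re.MULTILINE)
    # A function replacement avoids ambiguity between the group reference and digits.
    new_text, count = pattern.subn(lambda m: m.group(1) + value, text)
    if count == 0:
        raise KeyError(f"{key} not found")
    return new_text


if __name__ == "__main__":
    # Illustrative content only; the real master_defaults.yaml has more settings.
    sample = "example.other.key: foo\nmaster.thrift.port: 9000\n"
    print(set_yaml_scalar(sample, "master.thrift.port", "9001"))
```

To apply it to the real file, read master_defaults.yaml into a string, pass it through the function, write the result back, and then rebuild the project.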

    After rebuilding and resubmitting, the application no longer failed, but localhost:7070 was unreachable. The logs showed that nimbus had not started successfully:

    15/07/15 03:36:06 ERROR yarn.MasterServer: Unhandled error in AM: 
    org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested virtual cores < 0, or requested virtual cores > max configured, requestedVirtualCores=130, maxVirtualCores=8
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:213)
        at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateResourceRequests(RMServerUtils.java:97)
        at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:502)
        ... ... ... ... ... ...
        at com.yahoo.storm.yarn.MasterServer$1.run(MasterServer.java:69)
    Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException): Invalid resource request, requested virtual cores < 0, or requested virtual cores > max configured, requestedVirtualCores=130, maxVirtualCores=8
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:213)
        at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateResourceRequests(RMServerUtils.java:97)
        ... ... ... ... ... ...
        at com.sun.proxy.$Proxy7.allocate(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
        ... 9 more
    15/07/15 03:36:06 INFO yarn.StormMasterServerHandler: stopping supervisors...
    15/07/15 03:36:06 INFO yarn.StormMasterServerHandler: stopping UI...
    15/07/15 03:36:06 INFO yarn.StormMasterServerHandler: stopping nimbus...

    In other words, the number of virtual cores requested exceeds the configured maximum, maxVirtualCores. Searching Google turned up this post:
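
The two numbers that matter are buried at the end of the exception message. As a small illustration, the sketch below pulls them out so they can be compared; the function name parse_vcore_request is an assumption for this example, and the message text is copied from the log above.

```python
import re
from typing import Tuple


def parse_vcore_request(message: str) -> Tuple[int, int]:
    """Extract (requestedVirtualCores, maxVirtualCores) from the exception text."""
    requested = re.search(r"requestedVirtualCores=(-?\d+)", message)
    maximum = re.search(r"maxVirtualCores=(-?\d+)", message)
    if requested is None or maximum is None:
        raise ValueError("message does not contain both vcore fields")
    return int(requested.group(1)), int(maximum.group(1))


message = (
    "Invalid resource request, requested virtual cores < 0, or requested "
    "virtual cores > max configured, requestedVirtualCores=130, maxVirtualCores=8"
)
requested, maximum = parse_vcore_request(message)
print(f"requested={requested}, max={maximum}, over limit: {requested > maximum}")
```

Here the request (130) is far above the limit (8), which is why the ResourceManager rejected the allocation.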

    hadoop - Why cannot more than 32 cores be requested from YARN to run a job? - Stack Overflow
    http://stackoverflow.com/questions/29780401/why-cannot-more-than-32-cores-be-requested-from-yarn-to-run-a-job

    So I modified yarn-site.xml as follows:

    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>8</value>
    </property>
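
To double-check which value is actually set in the file, the property can be read back with Python's standard XML parser. This is a minimal sketch; the helper name get_hadoop_property is an assumption, and the sample wraps the snippet above in a <configuration> root element as Hadoop configuration files do.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def get_hadoop_property(xml_text: str, name: str) -> Optional[str]:
    """Return the <value> of the <property> whose <name> matches, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


sample = """<configuration>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
  </property>
</configuration>"""

print(get_hadoop_property(sample, "yarn.nodemanager.resource.cpu-vcores"))
```

For the real file, read the contents of yarn-site.xml from your Hadoop configuration directory and pass them in as xml_text.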

    After restarting YARN and resubmitting, it worked!

    If you see the following error:

    2015-07-15 03:19:28 o.a.z.ClientCnxn [INFO] Opening socket connection to server MMC/192.168.1.200:2181
    2015-07-15 03:19:28 o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
    java.net.ConnectException: Connection refused

    ZooKeeper is probably not running; starting it fixes this.
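
A simple TCP connection test distinguishes "nothing is listening" from other problems. The sketch below is an illustration only; the helper name can_connect is an assumption, and it checks reachability of the port, not whether ZooKeeper itself is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For the setup in the log above, can_connect("192.168.1.200", 2181) returning False would mean ZooKeeper is not listening or something (for example a firewall) is blocking the port.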

    Always check whether your firewall is turned off; some otherwise unexplained failures are caused by a firewall that is still running.

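    It took me three or four days to solve this, with plenty of detours along the way. I had only just started with Hadoop and Storm on YARN, so when something failed I often did not know how to diagnose it. At first I simply guessed at the cause, changed something, and reran, and it still failed. Later I learned to read the error logs: all of them are available under masterhost:8088/logs/, and finding the relevant log makes troubleshooting far more efficient.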
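
Once a log has been downloaded, filtering it down to the ERROR entries is usually the fastest way to find the real cause. The sketch below assumes the log4j-style layout seen in the excerpts above, where the level appears as a separate word on each line; the function name error_lines is hypothetical.

```python
from typing import Iterable, Iterator


def error_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the log lines whose level is ERROR, without trailing newlines."""
    for line in lines:
        if " ERROR " in line:
            yield line.rstrip("\n")


sample_log = [
    "14/09/19 20:44:55 INFO yarn.MasterServer: Starting Master Thrift Server\n",
    "14/09/19 20:44:55 ERROR auth.ThriftServer: ThriftServer is being stopped due to: "
    "org.apache.thrift7.transport.TTransportException: "
    "Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9000.\n",
]

for line in error_lines(sample_log):
    print(line)
```

Because the function accepts any iterable of strings, an open file object for a saved log can be passed in directly.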

    Thanks: Google

    A rant about the wretched Internet censorship system (the GFW), damn it!!! It is closed-door isolationism on the Internet!!!

     
  • Original article: https://www.cnblogs.com/prisoner/p/4647461.html