  • Submitting jobs programmatically with the Oozie Java client

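    1. Setting up the Eclipse environment

    Create a new Java project in Eclipse and import the required dependency jars. The ones currently used are:


    Next, write the Java program that submits the Oozie job. The official Oozie reference documentation is a useful guide here.

    Before running the submission program, package the job code as a jar, define workflow.xml, and upload both to HDFS. Then set the job properties in the program. Here I used the example shipped in oozie-examples.tar.gz directly.

    Part of the code is shown below: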

    OozieClient wc = new OozieClient("http://192.168.121.35:11000/oozie");

    // create the workflow job configuration
    Properties conf = wc.createConfiguration();
    conf.setProperty(OozieClient.APP_PATH, "hdfs://datanode1:8020/user/cdhfive/examples/apps/map-reduce");

    // set the workflow parameters
    conf.setProperty("nameNode", "hdfs://datanode1:8020");
    conf.setProperty("jobTracker", "datanode1:8032");
    conf.setProperty("inputDir", "/user/cdhfive/examples/input-data");
    // conf.setProperty("outputDir", "hdfs://192.168.121.35:8020/user/cdhfive/examples/output-data");
    conf.setProperty("outputDir", "/user/cdhfive/examples/output-data");
    conf.setProperty("queueName", "default");
    conf.setProperty("examplesRoot", "examples");
    conf.setProperty("user.name", "cdhfive");

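    When setting the workflow parameters in code, note the following:

    ① Every variable referenced in workflow.xml must be given a value in the program. For example, for ${jobTracker} in workflow.xml, the Java program needs the statement:

    conf.setProperty("jobTracker", "datanode1:8032"); The value must also be in the format that the variable expects.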
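
    One way to catch unset variables before submitting is to scan the workflow text for ${name} placeholders and compare them with the properties. The sketch below is only an illustration under simplifying assumptions: the class name, method name, and XML fragment are made up for this example, the pattern deliberately skips EL function calls such as ${wf:user()}, and it does not know about variables that the workflow defines by other means.

```java
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WorkflowVariableCheck {
    // Matches plain ${name} placeholders; EL function calls like ${wf:user()} do not match
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_.]*)\\}");

    /** Returns the placeholder names found in the workflow text that have no value in conf. */
    static Set<String> missingVariables(String workflowXml, Properties conf) {
        Set<String> missing = new LinkedHashSet<>();
        Matcher m = PLACEHOLDER.matcher(workflowXml);
        while (m.find()) {
            String name = m.group(1);
            if (conf.getProperty(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical workflow fragment, for illustration only
        String xml = "<job-tracker>${jobTracker}</job-tracker>"
                + "<name-node>${nameNode}</name-node>"
                + "<delete path=\"${nameNode}/user/${wf:user()}/${examplesRoot}/output-data\"/>";

        Properties conf = new Properties();
        conf.setProperty("nameNode", "hdfs://datanode1:8020");
        conf.setProperty("jobTracker", "datanode1:8032");

        System.out.println(missingVariables(xml, conf)); // prints [examplesRoot]
    }
}
```

    In a real submission program the same check could be run on the local copy of workflow.xml against the Properties object returned by createConfiguration(), just before the job is submitted.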

    2. Problems encountered while submitting the job, and how to solve them

    ⓐError starting action [mr-node]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Permission denied: user=hapjin, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x

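    I submitted the job from Eclipse running as the user hapjin on my local Windows machine, while the cluster runs on remote virtual machines. The job therefore failed with a permission error when it executed.

    The fix is to specify the job's user name at submission time: conf.setProperty("user.name", "cdhfive")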
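
    As far as I know, createConfiguration() pre-fills user.name from the JVM's user.name system property, which would explain why the local Windows account showed up in the error. The sketch below uses only java.util.Properties to show the override. The class and method names are hypothetical, and the fallback to the local user is an assumption about what you would want, not something the Oozie client requires.

```java
import java.util.Properties;

public class SubmitUser {
    /**
     * Sets user.name to the given cluster user. If no cluster user is given,
     * falls back to the local OS user reported by the JVM.
     */
    static Properties withSubmitUser(Properties conf, String clusterUser) {
        String localUser = System.getProperty("user.name");
        String effective = (clusterUser == null || clusterUser.isEmpty()) ? localUser : clusterUser;
        conf.setProperty("user.name", effective);
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        withSubmitUser(conf, "cdhfive");
        System.out.println("local user:  " + System.getProperty("user.name"));
        System.out.println("submit user: " + conf.getProperty("user.name"));
    }
}
```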

    ⓑ Unresolved-variable error: this happens because workflow.xml references variables such as ${examplesRoot} that the Java code never assigns a value to (via conf.setProperty(key, value)).

    javax.servlet.jsp.el.ELException: variable [examplesRoot] cannot be resolved

    Solution: give every variable referenced in workflow.xml a value in the Java code with conf.setProperty.

    The complete program is shown below:

    package test;

    import java.util.Properties;

    import org.apache.oozie.client.OozieClient;
    import org.apache.oozie.client.OozieClientException;
    import org.apache.oozie.client.WorkflowJob.Status;

    public class CommitJob {
        public static void main(String[] args) {
            // get an OozieClient for the remote Oozie server
            OozieClient wc = new OozieClient("http://192.168.121.35:11000/oozie");

            // create the workflow job configuration and point it at the application directory in HDFS
            Properties conf = wc.createConfiguration();
            conf.setProperty(OozieClient.APP_PATH, "hdfs://datanode1:8020/user/cdhfive/examples/apps/map-reduce");

            // set the workflow parameters: every variable referenced in workflow.xml needs a value
            conf.setProperty("nameNode", "hdfs://datanode1:8020");
            conf.setProperty("jobTracker", "datanode1:8032");
            conf.setProperty("inputDir", "/user/cdhfive/examples/input-data");
            // conf.setProperty("outputDir", "hdfs://192.168.121.35:8020/user/cdhfive/examples/output-data");
            conf.setProperty("outputDir", "/user/cdhfive/examples/output-data");
            conf.setProperty("queueName", "default");
            conf.setProperty("examplesRoot", "examples");
            // submit as the cluster user rather than the local OS user
            conf.setProperty("user.name", "cdhfive");

            try {
                // submit and start the workflow job
                String jobId = wc.run(conf);
                System.out.println("Workflow job submitted");

                // wait until the workflow job is no longer running
                while (wc.getJobInfo(jobId).getStatus() == Status.RUNNING) {
                    System.out.println("Workflow job running...");
                    try {
                        Thread.sleep(10 * 1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
                System.out.println("Workflow job completed!");
                // print the final job information
                System.out.println(wc.getJobInfo(jobId));
            } catch (OozieClientException e) {
                e.printStackTrace();
            }
        }
    }
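
    The polling loop above stops as soon as the status is anything other than RUNNING, so it also prints the completion message for a job that was killed, failed, or suspended. The following is a rough sketch of a stricter wait, not the Oozie client API: the Status enum is a local stand-in that mirrors the constants of WorkflowJob.Status, the status source is simulated, and waitForJob and isInProgress are hypothetical helper names. In a real program the supplier would wrap wc.getJobInfo(jobId).getStatus() and handle OozieClientException.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

public class WorkflowPolling {
    // Local stand-in mirroring the constants of org.apache.oozie.client.WorkflowJob.Status
    enum Status { PREP, RUNNING, SUCCEEDED, KILLED, FAILED, SUSPENDED }

    /** A job is still in progress only while it is being prepared or running. */
    static boolean isInProgress(Status s) {
        return s == Status.PREP || s == Status.RUNNING;
    }

    /** Polls until the status leaves PREP/RUNNING or maxPolls is reached; returns the last status seen. */
    static Status waitForJob(Supplier<Status> statusSource, long intervalMillis, int maxPolls) {
        Status last = statusSource.get();
        for (int i = 1; i < maxPolls && isInProgress(last); i++) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // restore the interrupt flag and stop waiting
                Thread.currentThread().interrupt();
                return last;
            }
            last = statusSource.get();
        }
        return last;
    }

    public static void main(String[] args) {
        // Simulated status sequence standing in for repeated calls to the Oozie server
        Iterator<Status> simulated =
                Arrays.asList(Status.RUNNING, Status.RUNNING, Status.SUSPENDED).iterator();

        Status result = waitForJob(simulated::next, 10, 10);
        if (result == Status.SUCCEEDED) {
            System.out.println("Workflow job succeeded");
        } else {
            System.out.println("Workflow job stopped in state " + result);
        }
    }
}
```

    Treating SUSPENDED as a reason to stop polling is a design choice in this sketch: a suspended job makes no progress until someone resumes it, so waiting longer would not help.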

    Screenshot of the run result:

    3. How Oozie handles errors

    If the failure is of transient nature, Oozie will perform retries after a pre-defined time interval. The number of retries and timer interval for a type of action must be pre-configured at Oozie level. Workflow jobs can override such configuration.

    Examples of transient failures are network problems or a remote system being temporarily unavailable.

    If the failure is of non-transient nature, Oozie will suspend the workflow job until a manual or programmatic intervention resumes the workflow job and the action start or end is retried.

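    In other words: if a job fails transiently, for example because of a network problem or a remote system that is temporarily unavailable, Oozie retries at a predefined interval. If the failure is not transient, Oozie suspends the job, and manual or programmatic intervention is needed before it can resume.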
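
    The retry-or-suspend behaviour described above can be pictured with a small stand-alone sketch. This is not Oozie's implementation: TransientFailure, Outcome, and runAction are hypothetical names, the retry count and interval are plain parameters, and suspending once the retries are used up is an assumption of this sketch rather than something stated in the quoted text.

```java
import java.util.concurrent.Callable;

public class RetryPolicySketch {
    /** Hypothetical marker for failures worth retrying, e.g. a network problem. */
    static class TransientFailure extends Exception {
        TransientFailure(String message) { super(message); }
    }

    enum Outcome { OK, SUSPENDED }

    /** Runs the action, retrying transient failures up to maxRetries times. */
    static Outcome runAction(Callable<Void> action, int maxRetries, long intervalMillis) {
        for (int attempt = 0; ; attempt++) {
            try {
                action.call();
                return Outcome.OK;
            } catch (TransientFailure e) {
                if (attempt >= maxRetries) {
                    // Assumption of this sketch: give up and wait for intervention
                    return Outcome.SUSPENDED;
                }
                try {
                    Thread.sleep(intervalMillis); // wait the predefined interval, then retry
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return Outcome.SUSPENDED;
                }
            } catch (Exception e) {
                // Non-transient failure: no retry, wait for manual or programmatic intervention
                return Outcome.SUSPENDED;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated action that fails transiently twice and then succeeds
        Outcome outcome = runAction(() -> {
            if (calls[0]++ < 2) {
                throw new TransientFailure("remote system temporarily unavailable");
            }
            return null;
        }, 3, 100);
        System.out.println(outcome + " after " + calls[0] + " attempts"); // prints OK after 3 attempts
    }
}
```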

  • Original article: https://www.cnblogs.com/hapjin/p/4874841.html