  • DataStage: Getting the Number of Rows a Job Inserted into the Target Table by Parsing the Log

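    This is only a crude approach; there may well be better and simpler ones. It requires the existing log entries to be cleared before each run of a job, otherwise the row counts cannot be computed correctly. Once the job has finished, the shell script can back up that run's log to disk on the server.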

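    1       Log purge settings

    Log in to DataStage Administrator, select the project, open Project Properties -> Logs, check "Auto-purge of job log", and set it so that the logs of the previous run and earlier are purged automatically.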

                           

    Figure 1 Log purge settings in Administrator

    2       Log processing

    2.1     Log backup (dsjob -logsum)

    Start the job from the shell with dsjob. After the job has run, back up that run's log to disk.

    $DSHOME/bin/dsjob -logsum $projectName $jobName > $sysLogDir/$jobName.txt
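
Because the backup above writes to $jobName.txt, a second run on the same day overwrites the first copy. A minimal sketch of one way around that, assuming a timestamp in the file name is acceptable. The names backup_job_log and DSJOB_BIN are hypothetical and not part of the original script; the dsjob call itself is the same -logsum call shown above.

```shell
#!/bin/bash
# Sketch only: save the log summary of one job under a per-run file name.
# backup_job_log and DSJOB_BIN are hypothetical names introduced for illustration.

backup_job_log() {
    local project="$1" job="$2" dir="$3"
    # DSJOB_BIN lets the dsjob path be overridden; by default it is $DSHOME/bin/dsjob.
    local dsjob="${DSJOB_BIN:-$DSHOME/bin/dsjob}"
    local stamp out

    stamp=$(date +%Y%m%d_%H%M%S)
    out="$dir/${job}_${stamp}.txt"

    mkdir -p "$dir" || return 1
    # Same call as in the text; only the target file name differs.
    "$dsjob" -logsum "$project" "$job" > "$out" || return 1

    # Print the path so the caller can pass it to grep/awk.
    echo "$out"
}
```

A call would look like logFile=$(backup_job_log "$projectName" "$jobName" "$sysLogDir"), after which $logFile can be used wherever $sysLogDir/$jobName.txt is used.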

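    2.2     Counting rows (grep, awk)

    When records are written to the target table, the log contains the keyword "Number of rows inserted:" or "Number of rows rejected:", followed by the row count. Since the job may run on several nodes, the counts reported by all nodes are added together.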

    # inserted rows (sed removes every thousands separator, not just the first)

    insertedRows=`cat $sysLogDir/$jobName.txt | grep "Number of rows inserted:" | awk -F: '{print $3}' | sed 's/,//g' | awk '{sum=sum+$1;} END {print sum}'`

    # rejected rows

    rejectedRows=`cat $sysLogDir/$jobName.txt | grep "Number of rows rejected:" | awk -F: '{print $3}' | sed 's/,//g' | awk '{sum1=sum1+$1;} END {print sum1}'`

    The job may abort before loading any rows, so that case has to be handled next, for example by assigning 0 when the value is empty.

    if [ ! -n "$insertedRows" ]; then

             insertedRows=0

    fi
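
The counting and the empty-value check above can also be folded into one awk call. The following is a sketch under assumptions: sum_rows is a hypothetical helper, and it takes the number from whatever follows the keyword on the line rather than from a fixed colon-separated field, since the exact layout of a -logsum line may differ between stages and versions.

```shell
#!/bin/bash
# Sketch only: sum the counts that follow a keyword in a saved job log.
# sum_rows is a hypothetical helper, not part of the original script.
# Usage: sum_rows FILE KEYWORD

sum_rows() {
    local file="$1" key="$2"
    awk -v key="$key" '
        {
            pos = index($0, key)
            if (pos > 0) {
                rest = substr($0, pos + length(key))
                gsub(/,/, "", rest)   # drop every thousands separator
                sum += rest + 0       # leading blanks are ignored in the conversion
            }
        }
        END { printf "%d\n", sum }    # prints 0 when no line matched
    ' "$file"
}
```

With this helper, insertedRows=$(sum_rows "$sysLogDir/$jobName.txt" "Number of rows inserted:") always yields a number, so the separate test for an empty value is no longer needed.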

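    2.3     Other monitoring information

    Monitoring usually also needs the job's start and end times, its final status, and so on. These can be collected at the same time and written to a log file.

    The start and end times can be recorded before and after the job runs, and the job's run status can be obtained with dsjob -run -jobstatus.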

    jobsta=$($DSHOME/bin/dsjob -run -mode NORMAL  $jobParameters  -warn 0  -jobstatus  $projectName $jobName  2>&1  | awk -F= '/^Status code/{print $2}')
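
The awk call above leaves a leading blank in front of the code, because the text after "=" starts with a space. A small sketch of a stricter extraction plus a readable label follows. Both function names are hypothetical, the "Status code = N" line format is assumed from the awk pattern above, and the four labels are the ones listed in the echo statement of runJob.sh in section 3.3.

```shell
#!/bin/bash
# Sketch only: extract the numeric status from dsjob output and describe it.
# status_code_from_output and describe_status are hypothetical names.

# Reads the output of "dsjob -run ... -jobstatus" on stdin and prints
# only the digits of the first line that starts with "Status code".
status_code_from_output() {
    awk -F= '/^Status code/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Maps a status code to the label used in runJob.sh.
describe_status() {
    case "$1" in
        1)  echo "Finished" ;;
        2)  echo "Finished (see log)" ;;
        3)  echo "Aborted" ;;
        97) echo "Stopped" ;;
        *)  echo "Unknown ($1)" ;;
    esac
}
```

In the script this would be used as jobsta=$(dsjob ... 2>&1 | status_code_from_output), with describe_status "$jobsta" called only where a human-readable message is wanted.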

    2.4     Writing the job log record

    After the job has finished, write its monitoring information to the log.

    echo $projectName $jobName $jobsta `date +%Y-%m-%d" "%H:%M:%S` $startTime $insertedRows $rejectedRows >> $logdir/job_run_` date +%Y%m%d`.log
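
The record written above has eight whitespace-separated fields: project, job, status, end date, end time, start time, inserted rows, rejected rows. A sketch of the same write wrapped in a function is shown below. write_run_record is a hypothetical name, and the field order is simply copied from the echo statement above.

```shell
#!/bin/bash
# Sketch only: append one run record to the daily log file.
# write_run_record is a hypothetical helper, not part of the original script.
# Usage: write_run_record LOGDIR PROJECT JOB STATUS START_TIME INSERTED REJECTED

write_run_record() {
    local logdir="$1"; shift
    local logfile
    logfile="$logdir/job_run_$(date +%Y%m%d).log"

    mkdir -p "$logdir" || return 1
    # The current date and time stand for the end of the run, as in the echo above.
    printf '%s %s %s %s %s %s %s\n' \
        "$1" "$2" "$3" "$(date '+%Y-%m-%d %H:%M:%S')" "$4" "$5" "$6" >> "$logfile"
}
```

Quoting each argument keeps an empty value from silently shifting the later fields, which the unquoted echo cannot guarantee.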

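    3       Miscellaneous (monitoring)

    Once all jobs have finished, a separate job can be built to load the data recorded in $logdir/job_run_` date +%Y%m%d`.log into a table for viewing.

    3.1     Table design

    -- log table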

    create table DSLog

    (

    id INTEGER NOT NULL  GENERATED BY DEFAULT

        AS IDENTITY (START WITH 1, INCREMENT BY 1) primary key ,

    prjName varchar (20),

    jobName varchar (50),

    state varchar (20),

    rDate date ,

    startTime time,

    endTime time,

    insertedRows integer,

    rejectedRows integer

    )

    -- log status table

    create table DSLogState

    (

    state varchar (20),

    mark varchar (50 ),

    des varchar (500)

    )
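
The daily log file stores the end time before the start time, while DSLog declares startTime before endTime. A sketch of a small conversion step that reorders the fields and emits comma-separated lines in the column order of DSLog (without the generated id) is given below. log_to_csv is a hypothetical helper; how the resulting file is loaded into the table is left to the DataStage job or database tool in use.

```shell
#!/bin/bash
# Sketch only: convert the daily run log into CSV in DSLog column order:
# prjName, jobName, state, rDate, startTime, endTime, insertedRows, rejectedRows
# log_to_csv is a hypothetical helper, not part of the original script.

log_to_csv() {
    # Input fields: project job status date endTime startTime inserted rejected.
    # Lines that do not have exactly eight fields are skipped.
    awk 'BEGIN { OFS = "," }
         NF == 8 { print $1, $2, $3, $4, $6, $5, $7, $8 }' "$1"
}
```

For example, log_to_csv "$logdir/job_run_$(date +%Y%m%d).log" > /tmp/job_run.csv produces a file that a sequential-file stage could read with a comma delimiter; the /tmp path here is only an illustration.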

    3.2     Job status codes

    http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/topic/com.ibm.swg.im.iis.ds.cliapi.ref.doc/topics/r_dsvjbref_Error_Codes.html

    https://www-304.ibm.com/support/docview.wss?uid=swg21469644

    3.3     runJob.sh source code

    #!/bin/bash

    ########################################

    #

    # runJob.sh 2012-08-19

    # run a job with parameters

    #

    #######################################

    # if the number of input parameters is less than 2,then output the help document and exit

    if [ $# -lt 2 ] ; then

    cat << HELP

    runJob --run a job USAGE: runJob projectName jobName jobParameters

    EXAMPLE: runJob dsstage1 DD_Test '-param endDT=20120819'

    HELP

             exit 0

    fi

    projectName="$1"

    jobName="$2"

    jobParameters="$3"

    #echo $projectName

    #echo $jobName

    echo $jobParameters

    #exit 0

    logdir=/DS/DSLogs  #directory to store logs

    workdate=`date +%Y%m%d`

    sysLogDir=/DS/DSLogs/sysLogsBK/`date +%Y%m%d`    #directory for the daily DataStage log backups; DataStage Administrator is set to purge old logs before a job runs

    #source dsenv so that $DSHOME is set

    source /mistel/IBM/InformationServer/Server/DSEngine/dsenv

    #logdir processing: if the log directory does not exist, create it

    if [ -d $logdir ]; then

             echo "$logdir exists, continue..."

    else

             echo "$logdir does not exist, creating $logdir..."

             mkdir -p $logdir

    fi

    #create the DataStage log backup directory

    if [ ! -d $sysLogDir ]; then

             mkdir -p $sysLogDir

    fi

    #job state processing: if the last run failed or was stopped, reset the job first

    jobsta=$($DSHOME/bin/dsjob -jobinfo $projectName $jobName 2>&1 | awk -F: '/^Job Status/{print $2}')

    echo 'last status: ' $jobsta

    if [ "$jobsta" == " RUN FAILED (3)" -o "$jobsta" == " STOPPED (97)" ];then

             echo "Reset before run job $jobName"

             $DSHOME/bin/dsjob -run -mode RESET  $projectName $jobName   >>${logdir}/job_init_` date +%Y%m%d`.log

             sleep 5

    fi

    #job start run time

    startTime=`date +%H:%M:%S`

    #run a job

    jobsta=$($DSHOME/bin/dsjob -run -mode NORMAL  $jobParameters  -warn 0  -jobstatus  $projectName $jobName  2>&1  | awk -F= '/^Status code/{print $2}')

    #backup datastage logs

    $DSHOME/bin/dsjob -logsum $projectName $jobName > $sysLogDir/$jobName.txt

    #calculate the inserted rows and rejected rows from the back up log file

    #inserted rows

    insertedRows=`cat $sysLogDir/$jobName.txt | grep "Number of rows inserted:" | awk -F: '{print $3}' | sed 's/,//g' | awk '{sum=sum+$1;} END {print sum}'`

    #rejected rows

    rejectedRows=`cat $sysLogDir/$jobName.txt | grep "Number of rows rejected:" | awk -F: '{print $3}' | sed 's/,//g' | awk '{sum1=sum1+$1;} END {print sum1}'`

    if [ ! -n "$insertedRows" ]; then

             insertedRows=0

    fi

    if [ ! -n "$rejectedRows" ]; then

             rejectedRows=0

    fi

    echo 'this run status code [1:Finished;2:Finished (see log);3:Aborted;97:Stopped] : ' $jobsta

    #log

    echo $projectName $jobName $jobsta `date +%Y-%m-%d" "%H:%M:%S` $startTime $insertedRows $rejectedRows >> $logdir/job_run_` date +%Y%m%d`.log
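
The reset check in runJob.sh compares the whole status string, including its leading blank, so it stops matching if the spacing of the -jobinfo output changes. A sketch of a check on the numeric code alone follows. last_status_code and needs_reset are hypothetical names, and the "Job Status : TEXT (N)" line format is assumed from the strings the script compares against.

```shell
#!/bin/bash
# Sketch only: decide whether a job needs a reset from its last status code.
# last_status_code and needs_reset are hypothetical names.

# Reads "dsjob -jobinfo" output on stdin and prints the number in
# parentheses on the "Job Status" line, e.g. 3 for "RUN FAILED (3)".
last_status_code() {
    sed -n 's/^Job Status[^(]*(\([0-9][0-9]*\)).*$/\1/p' | head -n 1
}

# Succeeds for the two codes the original script resets on:
# 3 (RUN FAILED) and 97 (STOPPED).
needs_reset() {
    case "$1" in
        3|97) return 0 ;;
        *)    return 1 ;;
    esac
}
```

In the script the check would then read: code=$($DSHOME/bin/dsjob -jobinfo $projectName $jobName 2>&1 | last_status_code), followed by if needs_reset "$code"; then ... fi around the existing RESET call.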

  • Original article: https://www.cnblogs.com/BlueBreeze/p/2804867.html