  • Nutch's Logging System (Category: H3_NUTCH, 2015-02-17 20:14)


    I. How Nutch Implements Logging

    1. Nutch uses slf4j as its logging interface and log4j as the concrete implementation. For background on both, see:

    http://blog.csdn.net/jediael_lu/article/details/43854571

    http://blog.csdn.net/jediael_lu/article/details/43865571


    2. In Java class files, log messages are written as follows:

    (1) Obtain a Logger object

      public static final Logger LOG = LoggerFactory.getLogger(InjectorJob.class);
    

    (2) Use the Logger to write output

        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        long start = System.currentTimeMillis();
        LOG.info("InjectorJob: starting at " + sdf.format(start));
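
    As a rough illustration of what the call above hands to the logger, the sketch below builds the same message text using only the JDK. The class name StartMessageSketch and the method startMessage are hypothetical and are not part of Nutch; the logger itself is left out so that the example runs without slf4j on the classpath.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper, not part of Nutch: reproduces the message string only.
class StartMessageSketch {

    // Builds the text that the snippet above passes to LOG.info(...).
    static String startMessage(String jobName, long startMillis) {
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        return jobName + ": starting at " + sdf.format(new Date(startMillis));
    }

    public static void main(String[] args) {
        System.out.println(startMessage("InjectorJob", System.currentTimeMillis()));
    }
}
```

    With the cmdstdout layout from log4j.properties (%m%n), a message like this would be printed as-is, without a timestamp or level prefix added by log4j.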

    3. Define the properties in log4j.properties

    # Define some default values that can be overridden by system properties
    hadoop.log.dir=.
    hadoop.log.file=hadoop.log
    
    # RootLogger - DailyRollingFileAppender
    log4j.rootLogger=INFO,DRFA
    
    # Logging Threshold
    log4j.threshold=ALL
    
    #special logging requirements for some commandline tools
    log4j.logger.org.apache.nutch.crawl.Crawl=INFO,cmdstdout
    log4j.logger.org.apache.nutch.crawl.InjectorJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.host.HostInjectorJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.crawl.GeneratorJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.crawl.DbUpdaterJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.host.HostDbUpdateJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.fetcher.FetcherJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.parse.ParserJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.indexer.IndexingJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.indexer.DeleteDuplicates=INFO,cmdstdout
    log4j.logger.org.apache.nutch.indexer.CleaningJob=INFO,cmdstdout
    log4j.logger.org.apache.nutch.crawl.WebTableReader=INFO,cmdstdout
    log4j.logger.org.apache.nutch.host.HostDbReader=INFO,cmdstdout
    log4j.logger.org.apache.nutch.parse.ParserChecker=INFO,cmdstdout
    log4j.logger.org.apache.nutch.indexer.IndexingFiltersChecker=INFO,cmdstdout
    log4j.logger.org.apache.nutch.plugin.PluginRepository=WARN
    log4j.logger.org.apache.nutch.api.NutchServer=INFO,cmdstdout
    
    log4j.logger.org.apache.nutch=INFO
    log4j.logger.org.apache.hadoop=WARN
    log4j.logger.org.apache.zookeeper=WARN
    log4j.logger.org.apache.gora=WARN
    
    #
    # Daily Rolling File Appender
    #
    
    log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
    log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
    
    # Rollover at midnight
    log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
    
    # 30-day backup
    #log4j.appender.DRFA.MaxBackupIndex=30
    log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
    
    # Pattern format: Date LogLevel LoggerName LogMessage
    log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
    # Debugging Pattern format: Date LogLevel LoggerName (FileName:MethodName:LineNo) LogMessage
    #log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
    
    
    #
    # stdout
    # Add *stdout* to rootlogger above if you want to use this 
    #
    
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
    
    #
    # plain layout used for commandline tools to output to console
    #
    log4j.appender.cmdstdout=org.apache.log4j.ConsoleAppender
    log4j.appender.cmdstdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.cmdstdout.layout.ConversionPattern=%m%n
    
    #
    # Rolling File Appender
    #
    
    #log4j.appender.RFA=org.apache.log4j.RollingFileAppender
    #log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
    
    # Logfile size and 30-day backups
    #log4j.appender.RFA.MaxFileSize=1MB
    #log4j.appender.RFA.MaxBackupIndex=30
    
    #log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
    #log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
    #log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
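
    The File value ${hadoop.log.dir}/${hadoop.log.file} is expanded when the configuration is loaded. The sketch below is a simplified model of that lookup, assuming (as log4j 1.x does) that a system property with the same name takes precedence over the default defined in the properties file. The class LogPathSketch and its resolve method are hypothetical names for illustration, not log4j API.

```java
import java.util.Properties;

// Simplified model of ${...} substitution; not the log4j implementation.
class LogPathSketch {

    // Replaces each ${key} with the system property of that name, or with
    // the value from fileProps when no system property is set.
    static String resolve(String value, Properties fileProps) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = value.indexOf("${", pos);
            if (start < 0) {
                out.append(value.substring(pos));
                return out.toString();
            }
            int end = value.indexOf('}', start);
            if (end < 0) {
                throw new IllegalArgumentException("Unclosed ${ in: " + value);
            }
            out.append(value, pos, start);
            String key = value.substring(start + 2, end);
            String replacement = System.getProperty(key);
            if (replacement == null) {
                replacement = fileProps.getProperty(key);
            }
            if (replacement != null) {
                // The replacement may itself contain ${...} references.
                out.append(resolve(replacement, fileProps));
            }
            pos = end + 1;
        }
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("hadoop.log.dir", ".");
        defaults.setProperty("hadoop.log.file", "hadoop.log");
        System.out.println(resolve("${hadoop.log.dir}/${hadoop.log.file}", defaults));
    }
}
```

    Under this model, starting the JVM with -Dhadoop.log.dir=/some/dir would move the log file without editing log4j.properties.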

    II. Analysis of Nutch's Logging

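    1. Nutch's log output uses two appenders: cmdstdout and DRFA. (A third appender, stdout, is defined but is not attached to any logger by default.)

    The former writes log messages to standard output; the latter writes them to a log file that rolls over daily.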


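    2. The root logger, which sets the default for the whole project, is configured as INFO, DRFA.

    Nutch's own packages (org.apache.nutch) are set to INFO with no appender of their own. The command-line tool classes, such as InjectorJob and FetcherJob, are individually configured as INFO,cmdstdout. Because log4j appenders are additive by default, their messages go both to the console and, through the root logger, to the DRFA file.

    hadoop, gora, and zookeeper are set to WARN and inherit the DRFA appender from the root logger. Unless overridden by system properties, the log file is ./hadoop.log.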
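
    The way a logger picks up its level and appenders from log4j.properties can be sketched as a walk up the dotted logger name. This is a simplified model that assumes the log4j 1.x defaults (the level comes from the nearest configured ancestor, and appenders accumulate up to the root because additivity is on). The class LoggerHierarchySketch and its methods are hypothetical, and only a few entries from the configuration are copied in.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of logger inheritance; not the log4j implementation.
class LoggerHierarchySketch {

    // Logger name -> "LEVEL[,appender...]"; the empty name is the root logger.
    static final Map<String, String> CONFIG = new LinkedHashMap<>();
    static {
        CONFIG.put("", "INFO,DRFA");
        CONFIG.put("org.apache.nutch.crawl.InjectorJob", "INFO,cmdstdout");
        CONFIG.put("org.apache.nutch.plugin.PluginRepository", "WARN");
        CONFIG.put("org.apache.nutch", "INFO");
        CONFIG.put("org.apache.hadoop", "WARN");
    }

    // Parent of "a.b.c" is "a.b"; the parent of a top-level name is the root.
    static String parent(String name) {
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(0, dot);
    }

    // Level of the nearest configured logger, walking towards the root.
    static String effectiveLevel(String name) {
        String current = name;
        while (true) {
            String value = CONFIG.get(current);
            if (value != null) {
                return value.split(",")[0].trim();
            }
            if (current.isEmpty()) {
                throw new IllegalStateException("Root logger is not configured");
            }
            current = parent(current);
        }
    }

    // All appenders attached to the logger and to every ancestor.
    static List<String> appenders(String name) {
        List<String> result = new ArrayList<>();
        String current = name;
        while (true) {
            String value = CONFIG.get(current);
            if (value != null) {
                String[] parts = value.split(",");
                for (int i = 1; i < parts.length; i++) {
                    result.add(parts[i].trim());
                }
            }
            if (current.isEmpty()) {
                return result;
            }
            current = parent(current);
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.nutch.crawl.InjectorJob",
            "org.apache.nutch.plugin.PluginRepository",
            "org.apache.hadoop.mapred.JobClient"
        };
        for (String name : names) {
            System.out.println(name + " -> " + effectiveLevel(name) + " " + appenders(name));
        }
    }
}
```

    In this model InjectorJob resolves to both cmdstdout and DRFA, while a Hadoop class resolves to WARN with DRFA only, which matches the analysis above.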


    Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.

  • Original article: https://www.cnblogs.com/lujinhong2/p/4637218.html