  • Notes on problems hit while testing Sqoop after installation

    Command run:

    sqoop import --connect "jdbc:mysql://x.x.x.x:3306/intelligent_qa_bms?useUnicode=true&characterEncoding=utf-8&zeroDateTimeBehavior=convertToNull"  --username root  --password xxxx  --query "select id,siteName,type,section,title,content,url,word_count,status,publishDate,crawlDate from crawl_znwd_all where  updateDate <= 20190717 and word_count <= 800 AND status=2 and \$CONDITIONS;" -m 1 --null-string 'null' --null-non-string 'null' --fields-terminated-by '¥' --lines-terminated-by ' ' --hive-drop-import-delims --target-dir /znwd/input --as-textfile --delete-target-dir
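
    A free-form --query import only works if the literal token $CONDITIONS reaches Sqoop, and inside a double-quoted shell string an unescaped $CONDITIONS is expanded by the shell first. As a minimal sketch (not part of the original command), the argument list can be built in Python so that no shell expansion is involved. The helper name build_sqoop_import and the shortened query are illustrative assumptions; only flags that appear in the command above are used, and shlex.join needs Python 3.8 or later.

```python
import shlex


def build_sqoop_import(connect, username, password, query, target_dir):
    """Return the argument list for a single-mapper free-form-query import."""
    # Sqoop rejects a --query import whose text lacks the $CONDITIONS token.
    if "$CONDITIONS" not in query:
        raise ValueError("query must contain the literal $CONDITIONS token")
    return [
        "sqoop", "import",
        "--connect", connect,
        "--username", username,
        "--password", password,
        "--query", query,
        "-m", "1",
        "--target-dir", target_dir,
        "--as-textfile",
        "--delete-target-dir",
    ]


# Hypothetical, shortened version of the query used in this post.
args = build_sqoop_import(
    connect="jdbc:mysql://x.x.x.x:3306/intelligent_qa_bms?useUnicode=true&characterEncoding=utf-8",
    username="root",
    password="xxxx",
    query="select id, title from crawl_znwd_all where status=2 and $CONDITIONS",
    target_dir="/znwd/input",
)

# shlex.join quotes every argument for a POSIX shell, so $CONDITIONS stays literal.
print(shlex.join(args))
```

    Passing args as a list to subprocess.run would start Sqoop without going through a shell at all, which sidesteps the quoting question entirely.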

    Error 1:

    mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
    19/07/18 10:54:16 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    19/07/18 10:54:17 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    19/07/18 10:54:18 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    19/07/18 10:54:19 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    19/07/18 10:54:20 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

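    Solution:

    The client keeps retrying 0.0.0.0:10020, which is the default JobHistory Server address. Add the following to mapred-site.xml so that it points at the actual host (HDFSNameNode is a hostname in this cluster):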

    <property>
    <name>mapreduce.jobhistory.address</name>
    <value>HDFSNameNode:10020</value>
    </property>
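
    The property only tells clients where to look; the JobHistory Server itself still has to be running on that host (on Hadoop 2.x it is normally started with mr-jobhistory-daemon.sh start historyserver). A minimal sketch for checking that the port is reachable from the machine running Sqoop follows. The function name is_port_open is an illustrative assumption, not something from the original setup.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

    For example, is_port_open("HDFSNameNode", 10020) should return True once the server is up and the hostname resolves; a False result would be consistent with the retry messages in the log above.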

    Error 2:

    19/07/18 11:04:51 INFO mapreduce.Job: map 0% reduce 100%
    19/07/18 11:04:51 INFO mapreduce.Job: Job job_1562900696500_0017 failed with state FAILED due to:
    19/07/18 11:04:51 INFO mapreduce.ImportJobBase: The MapReduce job has already been retired. Performance
    19/07/18 11:04:51 INFO mapreduce.ImportJobBase: counters are unavailable. To get this information,
    19/07/18 11:04:51 INFO mapreduce.ImportJobBase: you will need to enable the completed job store on
    19/07/18 11:04:51 INFO mapreduce.ImportJobBase: the jobtracker with:
    19/07/18 11:04:51 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.active = true
    19/07/18 11:04:51 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.hours = 1
    19/07/18 11:04:51 INFO mapreduce.ImportJobBase: A jobtracker restart is required for these settings
    19/07/18 11:04:51 INFO mapreduce.ImportJobBase: to take effect.
    19/07/18 11:04:51 ERROR tool.ImportTool: Import failed: Import job failed!

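    Solution:

    This was a permission problem: the user running the job had no permission on the HDFS directory. Switching to a user that does have permission on the target directory fixed it.

    I had also added the configuration below to mapred-site.xml beforehand. On its own it did not fix the problem, and the job only succeeded after switching users, so permissions were evidently the cause and this configuration can be left out: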

    <property>
    <name>mapreduce.jobtracker.persist.jobstatus.active</name>
    <value>true</value>
    </property>
    <property>
    <name>mapreduce.jobtracker.persist.jobstatus.hours</name>
    <value>1</value>
    </property>
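
    To confirm a permission problem like this before switching users, the owner, group and mode of the parent directory can be read from the output of hdfs dfs -ls -d. The sketch below interprets one such output line under the basic HDFS permission model (owner bits, then group bits, then other bits). It ignores ACLs and the HDFS superuser, and the function name can_write, the sample line and the user name etl_user are hypothetical.

```python
def can_write(ls_line: str, user: str, groups: frozenset = frozenset()) -> bool:
    """Check the write bit that applies to `user` in one `hdfs dfs -ls -d` line."""
    # Fields: permissions, replication, owner, group, size, date, time, path.
    fields = ls_line.split()
    perms, owner, group = fields[0], fields[2], fields[3]
    if user == owner:
        return perms[2] == "w"  # owner write bit
    if group in groups:
        return perms[5] == "w"  # group write bit
    return perms[8] == "w"      # "other" write bit


if __name__ == "__main__":
    # Hypothetical listing of the parent of --target-dir /znwd/input.
    sample = "drwxr-xr-x   - hdfs supergroup          0 2019-07-17 10:00 /znwd"
    for name in ("hdfs", "etl_user"):
        print(name, "can write:", can_write(sample, name))
```

    Because the import uses --delete-target-dir, the submitting user needs write access to the parent of the target directory, which is what this check looks at.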

    Error 3:

    19/07/18 11:15:10 INFO mapreduce.Job: Task Id : attempt_1562900696500_0019_m_000000_0, Status : FAILED
    Container [pid=89807,containerID=container_1562900696500_0019_01_000002] is running beyond virtual memory limits. Current usage: 549.3 MB of 1 GB physical memory used; 4.0 GB of 2.1 GB virtual memory used. Killing container.
    Dump of the process-tree for container_1562900696500_0019_01_000002 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 89807 89804 89807 89807 (bash) 4 3 115920896 370 /bin/bash -c /home/admin/jdk1.8.0_144/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2048M -Djava.io.tmpdir=/admin_data/hadoop_tmp/nm-local-dir/usercache/zhaolei/appcache/application_1562900696500_0019/container_1562900696500_0019_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/admin/hadoop-2.7.3/logs/userlogs/application_1562900696500_0019/container_1562900696500_0019_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.10.15 36756 attempt_1562900696500_0019_m_000000_0 2 1>/home/admin/hadoop-2.7.3/logs/userlogs/application_1562900696500_0019/container_1562900696500_0019_01_000002/stdout 2>/home/admin/hadoop-2.7.3/logs/userlogs/application_1562900696500_0019/container_1562900696500_0019_01_000002/stderr
    |- 90025 89807 89807 89807 (java) 1477 86 4182704128 140259 /home/admin/jdk1.8.0_144/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2048M -Djava.io.tmpdir=/admin_data/hadoop_tmp/nm-local-dir/usercache/zhaolei/appcache/application_1562900696500_0019/container_1562900696500_0019_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/admin/hadoop-2.7.3/logs/userlogs/application_1562900696500_0019/container_1562900696500_0019_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.10.15 36756 attempt_1562900696500_0019_m_000000_0 2

    Container killed on request. Exit code is 143
    Container exited with a non-zero exit code 143

    Solution:

    This error does not affect the result of the import, but it is still worth fixing if you want a clean run.

    Add the following to yarn-site.xml:

    <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>300000</value>
    </property>
    <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>30000</value>
    </property>
    <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>3000</value>
    </property>
    <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>2000</value>
    </property>

    For an explanation of these parameters, see: https://www.cnblogs.com/xjh713/p/9681442.html
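
    The numbers in the Error 3 log can be reproduced by hand. The NodeManager allows a container virtual memory up to its physical allocation multiplied by yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1, so a 1 GB container gets 2.1 GB. The sketch below sums the two processes from the process-tree dump and compares the total with that limit. The 4096-byte page size is an assumption, as is the idea that the scheduler rounds the 1024 MB request up to the new yarn.scheduler.minimum-allocation-mb of 3000.

```python
PAGE_SIZE = 4096        # bytes per page; assumed (typical Linux x86-64 value)
VMEM_PMEM_RATIO = 2.1   # default of yarn.nodemanager.vmem-pmem-ratio

# (VMEM_USAGE in bytes, RSSMEM_USAGE in pages) for the bash and java
# processes listed in the process-tree dump of the Error 3 log.
PROCESS_TREE = [(115920896, 370), (4182704128, 140259)]


def vmem_limit_mb(container_mb, ratio=VMEM_PMEM_RATIO):
    """Virtual memory limit for a container with container_mb of physical memory."""
    return container_mb * ratio


def usage_mb(process_tree):
    """Return (virtual MB, resident MB) summed over the whole process tree."""
    vmem = sum(v for v, _ in process_tree) / 2**20
    rss = sum(p for _, p in process_tree) * PAGE_SIZE / 2**20
    return vmem, rss


def exceeds_vmem_limit(container_mb, process_tree, ratio=VMEM_PMEM_RATIO):
    """True if the tree's virtual memory is above the container's limit."""
    vmem, _ = usage_mb(process_tree)
    return vmem > vmem_limit_mb(container_mb, ratio)


if __name__ == "__main__":
    vmem, rss = usage_mb(PROCESS_TREE)
    print(f"virtual: {vmem:.1f} MB, resident: {rss:.1f} MB")
    # 1024 MB is the container size in the log; 3000 MB is the assumed size
    # after the request is rounded up to the new minimum allocation.
    for container_mb in (1024, 3000):
        killed = exceeds_vmem_limit(container_mb, PROCESS_TREE)
        print(f"{container_mb} MB container: limit "
              f"{vmem_limit_mb(container_mb):.1f} MB, over limit: {killed}")
```

    Under these assumptions the configuration above helps mainly because the larger minimum allocation raises the virtual memory limit past the roughly 4.0 GB the task uses. Raising yarn.nodemanager.vmem-pmem-ratio, or turning off yarn.nodemanager.vmem-check-enabled, are other commonly used ways to reach the same effect.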

  • Original article: https://www.cnblogs.com/tianziru/p/11206387.html