  • Migrating data from DB2 to HDFS with Sqoop

    A Sqoop import job failed while reading from a DB2 database that uses UTF-8 encoding. The same data cannot be read from DB2 with plain SELECT queries either, because some of the stored characters are not valid UTF-8.

    The Sqoop job throws an error similar to the following:

    Error: java.io.IOException: SQLException in nextKeyValue
            at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:265)
    ..
    ..
    Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][1065][12306][4.19.26] Caught java.io.CharConversionException.  See attached Throwable for details. ERRORCODE=-4220, SQLSTATE=null
            at com.ibm.db2.jcc.am.kd.a(Unknown Source)
            at com.ibm.db2.jcc.am.kd.a(Unknown Source)
    ..
    ..
    Caused by: java.nio.charset.MalformedInputException: Input length = 527
            at com.ibm.db2.jcc.am.s.a(Unknown Source)
            ... 22 more
    Caused by: sun.io.MalformedInputException
            at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:105)
            ... 23 more
    2018-09-10 06:01:34,879 INFO mapreduce.Job:  map 0% reduce 0%
    2018-09-10 06:01:45,942 INFO mapreduce.Job:  map 100% reduce 0%
    2018-09-10 06:02:02,039 INFO mapreduce.Job: Task Id : attempt_1535965915754_0038_m_000000_2, Status : FAILED
    Error: java.io.IOException: SQLException in nextKeyValue
            at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
            at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
            at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
            at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
            at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
            at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1988)
            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
    Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][1065][12306][4.16.53] Caught java.io.CharConversionException.  See attached Throwable for details. ERRORCODE=-4220, SQLSTATE=null
            at com.ibm.db2.jcc.am.fd.a(fd.java:723)
            at com.ibm.db2.jcc.am.fd.a(fd.java:60)
            at com.ibm.db2.jcc.am.fd.a(fd.java:112)
            at com.ibm.db2.jcc.am.jc.a(jc.java:2870)
            at com.ibm.db2.jcc.am.jc.p(jc.java:527)
            at com.ibm.db2.jcc.am.jc.N(jc.java:1563)
            at com.ibm.db2.jcc.am.ResultSet.getStringX(ResultSet.java:1153)
            at com.ibm.db2.jcc.am.ResultSet.getString(ResultSet.java:1128)
            at org.apache.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:71)
            at com.cloudera.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:61)
            at PC_KPI_PC_INCIDENT_CFIUS_CONSTRAINED.readFields(PC_KPI_PC_INCIDENT_CFIUS_CONSTRAINED.java:197)
            at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)
            ... 12 more
    Caused by: java.nio.charset.MalformedInputException: Input length = 574820
            at com.ibm.db2.jcc.am.r.a(r.java:19)
            at com.ibm.db2.jcc.am.jc.a(jc.java:2862)
            ... 20 more
    Caused by: sun.io.MalformedInputException
            at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:167)
            at com.ibm.db2.jcc.am.r.a(r.java:16)
            ... 21 more

     

    Solution:

    Add the following configuration to YARN's mapred-site.xml file:

    <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024m -Ddb2.jcc.charsetDecoderEncoder=3</value>
    </property>

    http://www-01.ibm.com/support/docview.wss?uid=swg21684365
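
The failure and the effect of the setting can be reproduced with the standard JDK decoder. The sketch below is not the DB2 driver's code. It assumes, based on my reading of the IBM technote linked above, that db2.jcc.charsetDecoderEncoder=3 makes the driver substitute U+FFFD for malformed bytes instead of throwing. The class name Utf8DecodeDemo and the sample bytes are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8DecodeDemo {

    // Decode bytes as UTF-8, applying the given action to malformed or unmappable input.
    public static String decode(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception {
        // 0xE9 is 'é' in ISO-8859-1, but here it is a UTF-8 lead byte
        // that is not followed by a continuation byte, so it is malformed.
        byte[] bad = {'c', 'a', 'f', (byte) 0xE9, '!'};

        try {
            // REPORT: strict decoding, comparable to what the stack trace shows.
            decode(bad, CodingErrorAction.REPORT);
        } catch (CharacterCodingException e) {
            System.out.println("strict decode failed: " + e);
        }

        // REPLACE: the malformed byte becomes U+FFFD and decoding continues.
        String lenient = decode(bad, CodingErrorAction.REPLACE);
        System.out.println("lenient decode: " + lenient);
    }
}
```

Replacement is lossy: the original bytes cannot be recovered from U+FFFD once the rows are in HDFS, so it is worth checking the imported data for that character afterwards.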

  • Original article: https://www.cnblogs.com/songyuejie/p/9643911.html