Error details
Error: java.io.IOException: SQLException in nextKeyValue
at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:275)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:568)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: java.sql.SQLException: Value '(omitted)...0000-00-00 00:00:000000-00-00 ...(omitted)' can not be represented as java.sql.Timestamp
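First, the value printed before "can not be represented as java.sql.Timestamp" is misleading. It is not the value of a single column but the concatenation of every column value in the row (possibly related to the binlog), followed by the complaint that it is not a timestamp. I have only seen this with the CDH build of Sqoop, which suggests that CDH differs from the open-source release in some details.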
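Cause
The printed value contains 0000-00-00. Checking the MySQL table confirms that a date/time column really does hold "0000-00-00" values. Such a zero date is not a valid SQL date and cannot be converted to a java.sql.Timestamp, so the CDH build of Sqoop throws an exception; the message it prints is just not very precise. The MySQL JDBC driver property zeroDateTimeBehavior controls how these values are handled, and it has three options: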
- exception (default): throw an exception
- convertToNull: return null instead of the zero date
- round: replace the zero date with the nearest valid date, 0001-01-01
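The CDH build of Sqoop evidently applies the default exception policy to "0000-00-00" values, whereas the Apache Sqoop documentation explicitly states that 0000-00-00 DATE values are converted to null by default.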
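Solution
Manually append ?zeroDateTimeBehavior=convertToNull to the JDBC URL passed to --connect (use & instead of ? if the URL already carries other parameters). The property name and value are written in camel case here, as in the MySQL Connector/J documentation.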
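As a rough sketch of the fix, the snippet below shows one way to build the connect string and pass it to sqoop import. The helper names add_zero_date_param and run_import, as well as the host, database, table, user, and target directory, are hypothetical placeholders and not part of the original job. Note that the URL is quoted, because ? and & are otherwise interpreted by the shell.

```shell
# Append zeroDateTimeBehavior=convertToNull to a MySQL JDBC URL.
# Hypothetical helper: leaves the URL untouched if the property is already set.
add_zero_date_param() {
  url=$1
  case "$url" in
    *zeroDateTimeBehavior=*) printf '%s\n' "$url" ;;                          # already configured
    *\?*) printf '%s&zeroDateTimeBehavior=convertToNull\n' "$url" ;;          # URL already has parameters
    *)    printf '%s?zeroDateTimeBehavior=convertToNull\n' "$url" ;;          # first parameter
  esac
}

# Hypothetical import job; every name below is a placeholder.
run_import() {
  sqoop import \
    --connect "$(add_zero_date_param 'jdbc:mysql://db.example.com:3306/mydb')" \
    --username myuser -P \
    --table mytable \
    --target-dir /tmp/mytable
}
```

Calling run_import on a machine with Sqoop installed would start the import with zero dates read as null. If the rows should fail loudly instead, the same helper leaves an explicit zeroDateTimeBehavior=exception in the URL unchanged.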