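Problem scenario
Today a requirement came in: load the data produced by a Hive job into Oracle. The consumers read directly from the Oracle table and do not accept files.
The following script exports the query result to a local file: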
[hive@hadoop101 tool]$ more hive_2_file.sh
#!/bin/sh
# Usage: hive_2_file.sh <hdfs_subdir> <query> <local_file> <field_delimiter>
path=$1
query=$2
file=$3
field=$4

# Write the query result to an HDFS directory as delimited text
beeline -u 'jdbc:hive2://110.110.1.110:9000' -n username -p password -e "
insert overwrite directory '/warehouse/servpath/downlo/${path}'
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='${field}',
  'serialization.format'='',
  'serialization.null.format'=''
)
${query}
"

# Merge the part files from the same HDFS directory into one local file
hadoop fs -getmerge "/warehouse/servpath/downlo/${path}" "${file}"
SQL*Loader control file for loading the file:
load data
characterset utf8
infile '/warehouse/servpath/downlo/TEMP_INFO_20200909.txt'
append into table TEMP_INFO_20200909
fields terminated by " "
TRAILING NULLCOLS
(MONTH_ID, PROV_ID, USER_ID, COUNTRY_STAY_DAYS)
Script that runs the load:
#!/bin/sh
table=$1
sqlldr scott/tieger@orcl \
  control=/TEMP_INFO_20200909/${table}.ctl \
  log=/TEMP_INFO_20200909/${table}.log \
  bad=/TEMP_INFO_20200909/${table}.bad \
  rows=1100000000 direct=true skip_index_maintenance=TRUE
Unexpectedly, the load failed with this error:
Record 5031: Rejected - Error on table TEMP_INFO_20200909, column COUNTRY_STAY_DAYS.
Field in data file exceeds maximum length
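One way to look at a rejected record is to print the byte length of each of its fields. The following is only a minimal sketch: `field_lengths` is a hypothetical helper that is not part of the original scripts, and the delimiter has to be passed in because it depends on how the file was exported.

```shell
#!/bin/sh
# Hypothetical helper (not from the original scripts):
# print the byte length of every field in one record of a delimited file.
# Usage: field_lengths FILE RECORD_NUMBER DELIMITER
# Note: awk treats a single space as "runs of blanks"; pass '[ ]' to split
# on exactly one space.
field_lengths() {
    # LC_ALL=C makes awk's length() count bytes rather than characters
    LC_ALL=C awk -F "$3" -v rec="$2" '
        NR == rec {
            for (i = 1; i <= NF; i++)
                printf "field %d: %d bytes\n", i, length($i)
            exit
        }' "$1"
}
```

For the error above, a call such as `field_lengths TEMP_INFO_20200909.txt 5031 '[ ]'` would show how long the fourth field of the rejected record really is, assuming the file is space-delimited.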
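However, no field in the file is longer than the corresponding column in the table. The likely cause is that SQL*Loader treats a delimited field with no declared datatype as CHAR(255), so a value longer than 255 bytes is rejected no matter how wide the table column is. A search online suggested changing the control file by appending char(300) after the field named in the error: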
load data
characterset utf8
infile '/warehouse/servpath/downlo/TEMP_INFO_20200909.txt'
append into table TEMP_INFO_20200909
fields terminated by " "
TRAILING NULLCOLS
(MONTH_ID, PROV_ID, USER_ID, COUNTRY_STAY_DAYS char(300))
After re-running the load script, the problem was solved.
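To choose a length such as the 300 above with some evidence rather than by guessing, the data file can be scanned for the longest value in each column. This is a rough sketch under assumptions: `max_field_bytes` is a hypothetical helper, not part of the original scripts, and the delimiter must match the one used in the export.

```shell
#!/bin/sh
# Hypothetical helper (not from the original scripts):
# report the maximum byte length seen in each column of a delimited file.
# Usage: max_field_bytes FILE DELIMITER
max_field_bytes() {
    # LC_ALL=C makes awk's length() count bytes rather than characters
    LC_ALL=C awk -F "$2" '
        {
            if (NF > n) n = NF              # remember the widest record
            for (i = 1; i <= NF; i++)
                if (length($i) > max[i]) max[i] = length($i)
        }
        END {
            for (i = 1; i <= n; i++)
                printf "column %d: max %d bytes\n", i, max[i]
        }' "$1"
}
```

Any column whose maximum is above 255 bytes is a candidate for an explicit char(N) in the control file, with N at least as large as the reported maximum.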