  • Oracle sqlldr import: "MAXIMUM ERROR COUNT EXCEEDED"

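    Yesterday I saw a colleague using PL/SQL Developer to import text data into an Oracle table. There were two text files: one with about 300,000 records and one with a little over 70,000.
    During the import, every erroneous record required clicking a confirmation dialog, so he used a workaround (the 屏幕精灵 screen auto-clicker) to click automatically. The 70,000-record file alone took about 10 minutes.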

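    Seeing this, I offered to help speed up the import. I thought of two approaches:
        1: Oracle's sqlldr command
        2: Oracle external tables
    Because the text file contains bad records, I chose option 1.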
        
    The text format is shown below. The file has 76,760 data records plus one header line:

    [oracle@oracle234 ~]$ wc -l lottu.txt               
    76761 lottu.txt
    [oracle@oracle234 ~]$ head lottu.txt
    stat_user_stay_info.rowkey,stat_user_stay_info.appkey,stat_user_stay_info.phone_softversion,stat_user_stay_info.dim_type,stat_user_stay_info.dim_code,stat_user_stay_info.time_peroid,stat_user_stay_info.stat_date,stat_user_stay_info.indicator,stat_user_stay_info.stat_time,stat_user_stay_info.value
    3a00997_7c34d20170108,307A5C626E6C2F6472636E6E6A2F736460656473,2.14.0,cpid,blf1298_12243_001,1,20170105,stay3day,20170109102339,1
    3a00997_bf86b20170108,307A5C626E6C2F6472636E6E6A2F736460656473,2.13.0,cpid,blp1375_13621_001,1,20170105,stay3day,20170109102339,7
    3a00e87_4b11a20170126,337A5C626E6C2F6472636E6E6A2F736460656473,1.4.0,cpid,all,1,20170123,stay3day,20170127095931,6
    3a0129a_6575220170118,307A5C626E6C2F6460726E742F716D7472,all,cpid,bsf1389_10917_001,1,20170116,stay2day,20170119094145,1
    3a0183b_5764a20170202,307A5C626E6C2F6472636E6E6A2F736460656473,1.91,cpid,blf1298_12523_001,1,20170128,stay5day,20170203094327,1
    3a01b9b_54b4720170123,307A5C626E6C2F6472636E6E6A2F736460656473,2.13.0,cpid,blp1375_13641_001,1,20170122,stay1day,20170124102457,3
    3a0230d_7464120170126,307A5C626E6C2F6460726E742F606F65736E686569646D716473,all,cpid,bsp1405_13363_001,1,20170122,stay4day,20170127100446,18
    3a02bed_3ea3320170206,307A5C626E6C2F6472636E6E6A2F736460656473,2.15.0,cpid,blp1375_14217_001,1,20170130,stay7day,20170207135438,1
    3a03fe3_4c5fe20170119,307A5C21626E6C2F6472776865646E21,all,cpid,bvf1328_10885_001,1,20170116,stay3day,20170120093733,1   

    The structure of the target table is as follows:

    SQL> desc STAT_USER_STAY_INFO1;
     Name                                      Null?    Type
     ----------------------------------------- -------- ----------------------------
     JOBID                                              VARCHAR2(64)
     APPKEY                                    NOT NULL VARCHAR2(200)
     PHONE_SOFTVERSION                         NOT NULL VARCHAR2(32)
     DIM_TYPE                                  NOT NULL VARCHAR2(64)
     DIM_CODE                                  NOT NULL VARCHAR2(64)
     TIME_PEROID                               NOT NULL VARCHAR2(4)
     STAT_DATE                                 NOT NULL VARCHAR2(500)
     INDICATOR                                 NOT NULL VARCHAR2(200)
     STAT_TIME                                          VARCHAR2(500)
     VALUE                                     NOT NULL NUMBER
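
Most columns in this table are NOT NULL, so any row with an empty value in one of them will be rejected during the load. As a rough pre-check (not part of the original workflow), the sketch below scans the data file for such rows before loading. It assumes the file is plain comma-separated text with one header line and that the ten fields map positionally onto the ten columns (so the first field, rowkey, would go to JOBID); that mapping is an assumption, since the control file is not shown. The function name `find_null_violations` is hypothetical.

```python
import csv

# 0-based positions of fields that land in NOT NULL columns, ASSUMING the
# ten fields map positionally onto the ten table columns. JOBID (0) and
# STAT_TIME (8) are nullable according to the DESC output.
NOT_NULL_POSITIONS = [1, 2, 3, 4, 5, 6, 7, 9]
EXPECTED_FIELDS = 10


def find_null_violations(lines, has_header=True):
    """Return (record_number, reason) pairs for rows likely to be rejected.

    record_number counts data rows after the header, starting at 1.
    """
    problems = []
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the column-name line
    for record_number, fields in enumerate(reader, start=1):
        if len(fields) != EXPECTED_FIELDS:
            problems.append(
                (record_number,
                 f"expected {EXPECTED_FIELDS} fields, got {len(fields)}"))
            continue
        # Treat whitespace-only fields as empty (an assumption about how
        # the loader will see them).
        empty = [i for i in NOT_NULL_POSITIONS if fields[i].strip() == ""]
        if empty:
            problems.append(
                (record_number,
                 f"empty NOT NULL field(s) at positions {empty}"))
    return problems
```

It could be called as `find_null_violations(open("lottu.txt", newline=""))`; the length of the returned list gives a rough idea of how many rejections to expect.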

    I ran the sqlldr command. The result? Only about 55,000 records were imported, which was not what I expected.

    sqlldr 'lottu/li0924' control=/home/oracle/stay_info.ctl log=/home/oracle/stay_info.log bad=/home/oracle/stay_info.bad

    Here is the log file; for brevity, only the key part is shown.

    .......
    Record 55076: Rejected - Error on table STAT_USER_STAY_INFO1, column DIM_CODE.
    ORA-01400: cannot insert NULL into ("LOTTU"."STAT_USER_STAY_INFO1"."DIM_CODE")
    
    MAXIMUM ERROR COUNT EXCEEDED - Above statistics reflect partial run.
    
    Table STAT_USER_STAY_INFO1:
      55025 Rows successfully loaded.        
      51 Rows not loaded due to data errors.
      0 Rows not loaded because all WHEN clauses were failed.
      0 Rows not loaded because all fields were null.
    
    
    Space allocated for bind array:                 165120 bytes(64 rows)
    Read   buffer bytes: 1048576
    
    Total logical records skipped:          0
    Total logical records read:         55105
    Total logical records rejected:        51
    Total logical records discarded:        0
    
    Run began on Fri Feb 24 10:51:02 2017
    Run ended on Fri Feb 24 10:51:09 2017
    
    Elapsed time was:     00:00:06.87
    CPU time was:         00:00:00.46  

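    The log reports only "55025 Rows successfully loaded."; even adding the 51 rejected rows, that falls far short of the 76,761 lines in the file.
    The log does not lie: a query confirms that the table contains exactly 55025 rows.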

    SQL> select count(*) from STAT_USER_STAY_INFO1;
    
      COUNT(*)
    ----------
         55025

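    Strange: the file clearly has 76,761 lines, so why did Oracle accept only about 55,000 records? What happened to the remaining 20,000-plus?
    In fact Oracle had already given a hint, right there in the log file. I had simply overlooked this sentence: "MAXIMUM ERROR COUNT EXCEEDED - Above statistics reflect partial run."
    It means that the maximum number of tolerated errors was exceeded.
    With that made clear, let's take another look at the sqlldr command: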

    [oracle@oracle234 ~]$ sqlldr
    
    SQL*Loader: Release 11.2.0.1.0 - Production on Fri Feb 24 11:00:08 2017
    
    Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
    
    
    Usage: SQLLDR keyword=value [,keyword=value,...]
    
    Valid Keywords:
    
        userid -- ORACLE username/password           
       control -- control file name                  
           log -- log file name                      
           bad -- bad file name                      
          data -- data file name                     
       discard -- discard file name                  
    discardmax -- number of discards to allow          (Default all)
          skip -- number of logical records to skip    (Default 0)
          load -- number of logical records to load    (Default all)
        errors -- number of errors to allow            (Default 50)
          rows -- number of rows in conventional path bind array or between direct path data saves
                   (Default: Conventional path 64, Direct path all)
      bindsize -- size of conventional path bind array in bytes  (Default 256000)
        silent -- suppress messages during run (header,feedback,errors,discards,partitions)
        direct -- use direct path                      (Default FALSE)
       parfile -- parameter file: name of file that contains parameter specifications
      parallel -- do parallel load                     (Default FALSE)
          file -- file to allocate extents from      
    skip_unusable_indexes -- disallow/allow unusable indexes or index partitions  (Default FALSE)
    skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable  (Default FALSE)
    commit_discontinued -- commit loaded rows when load is discontinued  (Default FALSE)
      readsize -- size of read buffer                  (Default 1048576)
    external_table -- use external table for load; NOT_USED, GENERATE_ONLY, EXECUTE  (Default NOT_USED)
    columnarrayrows -- number of rows for direct path column array  (Default 5000)
    streamsize -- size of direct path stream buffer in bytes  (Default 256000)
    multithreading -- use multithreading in direct path  
     resumable -- enable or disable resumable for current session  (Default FALSE)
    resumable_name -- text string to help identify resumable statement
    resumable_timeout -- wait time (in seconds) for RESUMABLE  (Default 7200)
    date_cache -- size (in entries) of date conversion cache  (Default 1000)
    no_index_errors -- abort load on any index errors  (Default FALSE)
    
    PLEASE NOTE: Command-line parameters may be specified either by
    position or by keywords.  An example of the former case is 'sqlldr
    scott/tiger foo'; an example of the latter is 'sqlldr control=foo
    userid=scott/tiger'.  One may specify parameters by position before
    but not after parameters specified by keywords.  For example,
    'sqlldr scott/tiger control=foo logfile=log' is allowed, but
    'sqlldr scott/tiger control=foo log' is not, even though the
    position of the parameter 'log' is correct. 

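    One line of the usage text reads "    errors -- number of errors to allow            (Default 50)".
    So the problem above is no surprise: the load stopped as soon as the 51st error occurred.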
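The numbers in the log are consistent with this: 55025 loaded rows plus 51 rejected rows gives 55076, the record number of the last rejection shown in the log. A minimal sketch, assuming the log wording shown above, that pulls these counters out of a log and flags a partial run (the function name `summarize_sqlldr_log` is hypothetical, and the patterns are only matched against the 11.2 log excerpt in this post):

```python
import re


def summarize_sqlldr_log(log_text):
    """Extract key counters from a SQL*Loader log and flag a partial run."""

    def grab(pattern):
        # Return the first captured number, or None if the line is absent.
        match = re.search(pattern, log_text)
        return int(match.group(1)) if match else None

    return {
        "loaded": grab(r"(\d+) Rows? successfully loaded"),
        "rejected": grab(r"(\d+) Rows? not loaded due to data errors"),
        "read": grab(r"Total logical records read:\s+(\d+)"),
        # This message appears when the errors limit was exceeded.
        "partial": "MAXIMUM ERROR COUNT EXCEEDED" in log_text,
    }
```

If `partial` is true, the loaded count cannot be trusted as the full result, and the run needs to be repeated with a higher error limit or with cleaner data.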

    The fix is simply to add the errors parameter to the sqlldr command.

    sqlldr 'lottu/li0924' control=/home/oracle/stay_info.ctl log=/home/oracle/stay_info.log bad=/home/oracle/stay_info.bad errors=1000

    The whole load finished in 20 seconds. Given the comparison, my colleague switched to this method without hesitation.
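For repeated loads, the command line can be assembled by a small script so that the error limit is always set explicitly. The sketch below only builds the argument list; it uses the `userid`, `control`, `log`, `bad` and `errors` keywords from the usage text above. The function name `build_sqlldr_command` and the credentials in the example are placeholders, and actually running the command assumes sqlldr is on the PATH.

```python
import shlex


def build_sqlldr_command(userid, control, log, bad, errors=50):
    """Build a sqlldr argument list with an explicit errors limit.

    The default of 50 mirrors the default shown in the usage text.
    """
    if errors < 0:
        raise ValueError("errors must be non-negative")
    return [
        "sqlldr",
        f"userid={userid}",
        f"control={control}",
        f"log={log}",
        f"bad={bad}",
        f"errors={errors}",
    ]


# Example with placeholder credentials; pass the list to subprocess.run()
# to execute it, or print it for inspection.
command = build_sqlldr_command(
    "scott/tiger",
    "/home/oracle/stay_info.ctl",
    "/home/oracle/stay_info.log",
    "/home/oracle/stay_info.bad",
    errors=1000,
)
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting issues when the command is later executed with `subprocess.run(command)`.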

  • Original article: https://www.cnblogs.com/lottu/p/6437874.html