zoukankan      html  css  js  c++  java
  • 卸载oracle数据至flat file之二

    http://askanantha.blogspot.com/2009/01/unloading-oracle-data-to-flat-files.html
    Fastreader from WisdomForce,  http://www.wisdomforce.com/

    在我以前的一个博客http://www.cnblogs.com/harrychinese/archive/2011/06/30/Unload_Oracle_data_into_text_file.html中, 提及了sqluldr2 这个 oracle 卸数工具. 这几天在使用sqluldr2过程中, 再次遭遇 sqluldr2错误地将varchar2(100)导出为长度4000的情况,(我用的sqluldr2.exe是2011-3-24日创建的,linux 32/64是2011-3-25日创建的). 为了解决这个问题, google了很久, 又有新的发现:

    0. 新发现的卸数工具
       *  http://askanantha.blogspot.com/2009/01/unloading-oracle-data-to-flat-files.html
       *  Fastreader from WisdomForce,  http://www.wisdomforce.com/
    1. 如何打印出sqluldr2.exe的全部选项
      sqluldr2.exe help=yes
    2.sqluldr2 和 ociuldr 和 ociuldr2, tbuldr 的关系
      2.1 ociuldr 是Fangxin Lou采用的是OCI 7接口写的, ociuldr 源码下载地址 http://www.anysql.net/software/ociuldr.c
      2.2 ociuldr2 是hrb_qiuyb根据ociuldr源代码, 用OCI 8的函数接口重写了文本导出工具, 所有已有的功能和命令行参数保持不变, 并增加了对CLOB/BLOB字段的支持(导出成独立的文件), 并且在Oracle 10g下可以用sys用户登录进行操作了. 详细信息见 http://www.anysql.net/tools/ociuldr2_source_code.html , ociuldr2 源码下载地址 http://www.anysql.net/software/ociuldr2.c
      2.3 sqluldr2 是Fangxin Lou采用OCI 8实现了sqluldr2, 该程序没有开放源码, 不过有Windows/Linux 64/Linux 32/AIX的版本供免费下载使用.  程序下载地址: http://www.anysql.net/software/sqluldr.zip
      2.4 tbuldr(ToolBox*UnLoader) 是 NinGoo 根据 ociuldr2 源码重新做了实现, 并且实现了部分sqludr2的功能, 详细信息见http://www.ningoo.net/html/2009/learn_oci_programming_from_ociuldr.html. 源码和编译后的程序下载地址为:  http://www.ningoo.net/software/tbuldr.zip
      2.5. 网上还有一个OCIExtract, 经查看源码, 这个OCIExtract.c和tbuldr.c完全一样.
     
    3. 到底该选择sqluldr2 还是 ociuldr 还是 ociuldr2?
       3.1. 按照Fangxin Lou讲: 如果对性能没有极端要求, 没有必要换到 sqluldr2. 有些数据仓库的应用, 或是抽取数据给搜索引挚, 或是用文本方式来归档保存巨量数据的情况下, 则可以考虑升级到 sqluldr2 来导出文本.
       3.2. ITPUB上有网友做个测试报告称, sqluldr2 比 ociuldr快70%
       3.3 关于hrb_qiuyb版 ociuldr2, tbuldr(ToolBox*UnLoader)作者讲,hrb_qiuyb版的bug比较多,动不动就segment fault
       3.4 tbuldr(ToolBox*UnLoader)作者, 这样评价他的作品: 第一次用c和oci正儿八经的写东西,bug肯定一大堆。山寨有风险,慎用. 我自己测试了该程序, 总是报ORA-12154.
       3.5. 我在使用sqluldr2过程中, 碰到了sqluldr2 错误地将varchar2(100)导出为长度4000的情况, 但我使用 ociuldr 却没有发现同样的问题, 应该是 sqluldr2 引入的新bug. 也许可以通过给 sqluldr2 加上width 参数来避免这个问题, 但对于大表的话, 给每个输出字段都加上width是很麻烦的, 有空的时候, 可以编写一个小工具, 辅助生成width option.
         
      <<首先sqluldr2 还是 ociuldr 还是 ociuldr2都不支持 nvarchar2/nchar类型(因为程序中大量使用了基于byte的数组, 同时输出的文件时ansi格式的, 所以肯定不支持nchar类型). 对于多数情况, 我的建议是采用ociuldr; 如果性能要求特别高, 应采用用sqluldr2, 但必须加上width参数, 以免字段长度在导出时有误. >>
     
    4. 如何使用ociuldr
        网上可以下载ociuldr.c, 但是找不到编译好的执行程序, 如果要在RHEL 64bit上运行, 可以使用gcc编译.

        编译的方法是:
        gcc -m64 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -I${ORACLE_HOME}/rdbms/public -I${ORACLE_HOME}/rdbms/demo -L${ORACLE_HOME}/lib -lclntsh -o ociuldr.bin ociuldr.c
         
        运行程序的方法是:
        ./ociuldr.bin 参数...

    5. 如何导入 ociuldr 或 sqluldr2 生成的文件
        如果我们是文本文件是定长格式, 在导入时候, 尤其需要注意各种类型字段的导出长度, 我专门设计了一个测试表TYPE_WIDTH_TEST, 来确定ociuldr或sqluldr2导出各个类型字段的长度. 以下是测试结果: (注ociuldr_h是ociuldr做了简单修正的版本, 绝大多数的字段处理都一样, 只是修正了number和TIMESTAMP TZ的输出长度)

    ====begin of ociuldr和sqluldr2和ociuldr_h的字段输出长度=======
    DATE --ociuldr 长度为19,--sqluldr2 长度为19,格式为YYYY-MM-DD HH24:MI:SS
    TIMESTAMP(6) --ociuldr 长度为26,--sqluldr2 长度为26,格式为YYYY-MM-DD HH24:MI:SSXFF
    TIMESTAMP(6) WITH LOCAL TIME ZONE --ociuldr 长度为75,--sqluldr2长度为33,-ociuldr_h 修正长度为33
    TIMESTAMP(6) WITH TIME ZONE  --ociuldr 长度为33,--sqluldr2 长度为33,--ociuldr_h 修正长度为33,
    NUMBER --ociuldr 长度为40,--sqluldr2 长度为46, --ociuldr_h 修正长度为46,
    NUMBER(38) --ociuldr 长度为precision,--sqluldr2 长度为precision+2,--ociuldr_h 修正长度为precision+2,
    NUMBER(25,5)--ociuldr 长度为precision,--sqluldr2 长度为precision+2,--ociuldr_h 修正长度为precision+2,
    NUMBER(20)--ociuldr 长度为precision,--sqluldr2 长度为precision+2,--ociuldr_h 修正长度为precision+2,
    VARCHAR2(3900)--ociuldr 长度为length,--sqluldr2长度为length,
    VARCHAR2(40)--ociuldr 长度为length,--sqluldr2长度为length,
    NVARCHAR2(200)--ociuldr 长度没有规律,--sqluldr2 长度没有规律,
    NVARCHAR2(16)--ociuldr 长度没有规律,--sqluldr2 长度没有规律,
    CHAR(10)--ociuldr 长度为length,--sqluldr2 长度为length,
    CHAR(500)--ociuldr 长度为length,--sqluldr2 长度为length,
    INTERVAL YEAR(2) TO MONTH--ociuldr 长度为75,--sqluldr2 长度为20,
    INTERVAL DAY(2) TO SECOND(6)--ociuldr 长度为75,--sqluldr2 长度为30,
    ROWID, --ociuldr 长度为18,--sqluldr2 长度为18,
    FLOAT --ociuldr 长度为128,--sqluldr2 长度为129,
    FLOAT(22)--ociuldr 长度为precision+2,--sqluldr2 长度为precision+3,
    ====end of ociuldr和sqluldr2和ociuldr_h的字段输出长度=======

    从上面的结果可以看出, sqluldr2在处理float类型不太对劲. ociuldr在处理number类型的输出长度时候, 不太正确, 没有考虑负号和小数点的问题. 我做了简单的修正, 取名ociuldr_h, 结果和sqluldr2保持一致.需要注意的是, 修正后的ociuldr程序和sqluldr2在处理NTERVAL YEAR(2) TO MONTH和 INTERVAL DAY(2) TO SECOND(6)时, 输出的长度仍不一致, 好在这两种类型不常用. 
       

    ociuldr程序是通过oci的odefin()函数来获取字段的类型, 进而控制各种字段在输出时的长度,  odefin()常见的返回的类型值参考网页http://download.oracle.com/docs/cd/B13789_01/appdev.101/b10779/oci03typ.htm, 下面列出了部分返回值
    /* input data types */
    #define SQLT_CHR  1           /* (ORANET TYPE) character string */
    #define SQLT_NUM  2             /* (ORANET TYPE) oracle numeric */
    #define SQLT_INT  3                    /* (ORANET TYPE) integer */
    #define SQLT_FLT  4      /* (ORANET TYPE) Floating point number */
    #define SQLT_STR  5                   /* zero terminated string */
    #define SQLT_VNU  6           /* NUM with preceding length byte */
    #define SQLT_PDN  7     /* (ORANET TYPE) Packed Decimal Numeric */
    #define SQLT_LNG  8                                     /* long */
    #define SQLT_VCS  9                /* Variable character string */
    #define SQLT_NON  10         /* Null/empty PCC Descriptor entry */
    #define SQLT_RID  11                                   /* rowid */
    #define SQLT_DAT  12                   /* date in oracle format */
    #define SQLT_VBI  15                    /* binary in VCS format */
    #define SQLT_BIN  23                     /* binary data(DTYBIN) */
    #define SQLT_LBI  24                             /* long binary */
    #define SQLT_UIN  68                        /* unsigned integer */
    #define SQLT_SLS  91           /* Display sign leading separate */
    #define SQLT_LVC  94                     /* Longer longs (char) */
    #define SQLT_LVB  95                      /* Longer long binary */
    #define SQLT_AFC  96                         /* Ansi fixed char */
    #define SQLT_AVC  97                           /* Ansi Var char */
    #define SQLT_CUR  102                           /* cursor  type */
    #define SQLT_LAB  105                             /* label type */
    #define SQLT_OSL  106                           /* oslabel type */
       
       
    其他阅读条目:
    在豆丁上找到Fangxin Lou写一篇关于oracle 数据如何导出文本, 非常全面.
    http://www.docin.com/p-42836914.html


    附件,TYPE_WIDTH_TEST_DDL-Data.sql, 测试table的DDL和DML语句

    prompt PL/SQL Developer import file
    prompt Created on 2011年8月7日 by Harry
    set feedback off
    set define off
    prompt Creating TYPE_WIDTH_TEST...
    create table TYPE_WIDTH_TEST
    (
      DATE_TYPE_1         DATE
      ,TIMESTAMP_TYPE_1    TIMESTAMP(6)
      ,TIMESTAMP_TYPE_2    TIMESTAMP(6) WITH LOCAL TIME ZONE
      ,TIMESTAMP_TYPE_3    TIMESTAMP(6) WITH TIME ZONE
      ,NUMBERTYPE_1        NUMBER
      ,NUMBERTYPE_2        NUMBER(38)
      ,NUMBERTYP_3         NUMBER(25,5)
      ,NUMBERTYP_4         NUMBER(20)
      ,VARCHAR2TYPE_1      VARCHAR2(3900)
      ,VARCHAR2TYPE_2      VARCHAR2(40)
      ,NVARCHAR2_TYPE_1    NVARCHAR2(200)
      ,NVARCHAR2_TYPE_2    NVARCHAR2(16)
      ,CHARTYPE_1          CHAR(10)
      ,CHARTYPE_2          CHAR(500)
      ,INTERVALYEAR_TYPE_1 INTERVAL YEAR(2) TO MONTH
      ,INTERVALDAY_TYPE_1  INTERVAL DAY(2) TO SECOND(6)
      ,FLOAT_TYPE1         FLOAT
      ,FLOAT_TYPE2         FLOAT(22)
      --BINARY_DOUBLE_TYPE_1 BINARY_DOUBLE, --ORA-03115: unsupported network datatype or representation
      --BINARY_FLOAT_TYPE_1  BINARY_FLOAT,  --ORA-03115: unsupported network datatype or representation 
    )
    tablespace SDB_DATA
      pctfree 10
      initrans 1
      maxtrans 255
      storage
      (
        initial 64K
        minextents 1
        maxextents unlimited
      );

    prompt Disabling triggers for TYPE_WIDTH_TEST...
    alter table TYPE_WIDTH_TEST disable all triggers;
    prompt Deleting TYPE_WIDTH_TEST...
    delete from TYPE_WIDTH_TEST;
    commit;
    prompt Loading TYPE_WIDTH_TEST...
    insert into TYPE_WIDTH_TEST (DATE_TYPE_1, TIMESTAMP_TYPE_1, TIMESTAMP_TYPE_2, TIMESTAMP_TYPE_3, NUMBERTYPE_1, NUMBERTYPE_2, NUMBERTYP_3, NUMBERTYP_4, VARCHAR2TYPE_1, VARCHAR2TYPE_2, NVARCHAR2_TYPE_1, NVARCHAR2_TYPE_2, CHARTYPE_1, CHARTYPE_2, INTERVALYEAR_TYPE_1, INTERVALDAY_TYPE_1, FLOAT_TYPE1, FLOAT_TYPE2)
    values (to_date('06-09-2011 09:30:20', 'dd-mm-yyyy hh24:mi:ss'), to_timestamp('13-10-2011 11:11:11.000000', 'dd-mm-yyyy hh24:mi:ss.ff'), to_timestamp('13-10-2012 11:11:11.000000', 'dd-mm-yyyy hh24:mi:ss.ff'), to_timestamp_tz('13-10-2013 11:11:11.000000 +08:00', 'dd-mm-yyyy hh24:mi:ss.ff tzh:tzm'), 33.3333333333333333333333333333333333333, 67, 11.11111, 22, '1  ,,,,2  ,,,,3  ,,,,', 'this is test, hope it works. this is te', 'NVARCHAR2_TYPE_1', 'NVARCHAR2_TYPE_2', 'abcdefghij', 'ABCDEF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              ', '+04-02', '+04 13:50:00.000000', 20.5555555555556, 30.66667);
    commit;
    prompt 1 records loaded
    prompt Enabling triggers for TYPE_WIDTH_TEST...
    alter table TYPE_WIDTH_TEST enable all triggers;
    set feedback on
    set define on
    prompt Done.

    附件 我修正的ociuldr的source
    /*我对 ociuldr.c 的getColumns() 做了修正, 集中在switch语句内部*/
    sword getColumns(FILE *fpctl, Lda_Def *lda,Cda_Def *cda, struct COLUMN *collist, int fixlen)
    {
        int totallen=1;
        sword col;
        struct COLUMN *tempcol;
        struct COLUMN *nextcol;
        ub1 *buf;

        nextcol = collist;
     
        /* Describe the select-list items. */
        if(fpctl != NULL) fprintf(fpctl,"(\n");
        for (col = 0; col < MAX_SELECT_LIST_SIZE; col++)
        {
            tempcol = (struct COLUMN *) malloc(sizeof(struct COLUMN));
            tempcol-> indp        = (sb2 *)malloc(DEFAULT_ARRAY_SIZE * sizeof(sb2));
            tempcol-> col_retlen  = (ub2 *)malloc(DEFAULT_ARRAY_SIZE * sizeof(ub2));
            tempcol-> col_retcode = (ub2 *)malloc(DEFAULT_ARRAY_SIZE * sizeof(ub2));

            tempcol->next = NULL;
            tempcol->colbuf = NULL;
                  memset(tempcol->fmtstr,0,64);

            tempcol->buflen = MAX_ITEM_BUFFER_SIZE;
            if (odescr(cda, col + 1, &(tempcol->dbsize),
                       &(tempcol->dbtype), &(tempcol->buf[0]),
                       &(tempcol->buflen), &(tempcol->dsize),
                       &(tempcol->precision), &(tempcol->scale),
                       &(tempcol->nullok)))
            {
                if(fpctl != NULL) fprintf(fpctl,"\n");
                free(tempcol);
                /* Break on end of select list. */
                if (cda->rc == VAR_NOT_IN_LIST)
                    break;
                else
                {
                    SQLError(lda,cda);
                    return -1;
                }
            }

            if(col)
            {
              if(fpctl != NULL) fprintf(fpctl,",\n");
            }

            nextcol->next = tempcol;
            nextcol=tempcol;

            nextcol->buf[nextcol->buflen]='\0';
     
            switch(nextcol->dbtype)
            {
                case DATE_TYPE: /*12*/
                    nextcol->dsize=19;
                    nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                    if(fpctl != NULL)
                    {
                       if (fixlen)
                          fprintf(fpctl,"  %s POSITION(%d:%d) DATE \"YYYY-MM-DD HH24:MI:SS\"",
                              nextcol->buf , totallen, totallen + nextcol->dsize-1);
                       else
                          fprintf(fpctl,"  %s DATE \"YYYY-MM-DD HH24:MI:SS\"", nextcol->buf );
                    }
                    break;
                case 180: /* TIMESTAMP */
                    nextcol->dsize=26;
                    nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                    if(fpctl != NULL)
                    {
                       if (fixlen)
                          fprintf(fpctl,"  %s POSITION(%d:%d) TIMESTAMP \"YYYY-MM-DD HH24:MI:SSXFF\"",
                               nextcol->buf , totallen, totallen + nextcol->dsize-1);
                       else
                          fprintf(fpctl,"  %s TIMESTAMP \"YYYY-MM-DD HH24:MI:SSXFF\"", nextcol->buf );
                    }
                    break;
                case 181: /* TIMESTAMP WITH TIMEZONE */
                    nextcol->dsize=33;
                    nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                    if(fpctl != NULL)
                    {
                       if (fixlen)
                           fprintf(fpctl,"  %s POSITION(%d:%d) TIMESTAMP WITH TIME ZONE \"YYYY-MM-DD HH24:MI:SSXFF TZH:TZM\"",
                               nextcol->buf , totallen, totallen + nextcol->dsize-1);
                       else
                           fprintf(fpctl,"  %s TIMESTAMP WITH TIME ZONE \"YYYY-MM-DD HH24:MI:SSXFF TZH:TZM\"", nextcol->buf );
                    }
                    break;
                   
                case 231: /* TIMESTAMP WITH LOCAL TIME ZONE */
                    nextcol->dsize=33;
                    nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                    if(fpctl != NULL)
                    {
                       if (fixlen)
                           fprintf(fpctl,"  %s POSITION(%d:%d) TIMESTAMP WITH LOCAL TIME ZONE \"YYYY-MM-DD HH24:MI:SSXFF TZH:TZM\"",
                               nextcol->buf , totallen, totallen + nextcol->dsize-1);
                       else
                           fprintf(fpctl,"  %s TIMESTAMP WITH LOCAL TIME ZONE \"YYYY-MM-DD HH24:MI:SSXFF TZH:TZM\"", nextcol->buf );
                    }
                    break;               
                case 24:  /* LONG RAW */
                case 113: /* BLOB */
                   nextcol->dsize=DEFAULT_LONG_SIZE;
                   nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                   if(fpctl != NULL)
                   {
                      if (fixlen)
                          fprintf(fpctl,"  %s POSITION(%d:%d) CHAR(%d) ",
                              nextcol->buf, totallen, totallen + nextcol->dsize-1, DEFAULT_LONG_SIZE);
                      else
                          fprintf(fpctl,"  %s CHAR(%d) ", nextcol->buf, DEFAULT_LONG_SIZE);
                   }
                   break;
                case ROWID_TYPE:
                    nextcol->dsize=18;
                    nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                    if(fpctl != NULL)
                    {
                        if (fixlen)
                           fprintf(fpctl, "  %s POSITION(%d:%d) CHAR(%d)",
                                nextcol->buf, totallen, totallen + nextcol->dsize-1, nextcol->dsize);
                        else
                           fprintf(fpctl, "  %s CHAR(%d)", nextcol->buf,nextcol->dsize);
                    }
                    break;
                case NUMBER_TYPE:
                   if (nextcol->precision)
                       nextcol->dsize=nextcol->precision+2;
                   else
                       nextcol->dsize=46;
                   nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                   if(fpctl != NULL)
                   {
                      if (fixlen)
                          fprintf(fpctl,"  %s POSITION(%d:%d) CHAR(%d)",
                               nextcol->buf,totallen, totallen + nextcol->dsize-1, nextcol->dsize);
                      else
                          fprintf(fpctl,"  %s CHAR(%d)", nextcol->buf,nextcol->dsize);
                   }
                   break;
                case 112: /* CLOB */
                case 114: /* BFILE */
                   nextcol->dsize=DEFAULT_LONG_SIZE;
                   nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                   if(fpctl != NULL)
                   {
                      if (fixlen)
                          fprintf(fpctl,"  %s POSITION(%d:%d) CHAR(%d)",
                              nextcol->buf, totallen, totallen + nextcol->dsize-1, DEFAULT_LONG_SIZE);
                      else
                          fprintf(fpctl,"  %s CHAR(%d)", nextcol->buf,DEFAULT_LONG_SIZE);
                   }
                   break;
                default:
                   nextcol->dbtype=(fixlen?SQLT_AVC:STRING_TYPE);
                     if (nextcol->dsize>4000) nextcol->dsize = 4000;
                     if (nextcol->dsize==0)  nextcol->dsize = 4000;
                   if(fpctl != NULL)
                   {
                      if (fixlen)
                         fprintf(fpctl,"  %s POSITION(%d:%d) CHAR(%d)", nextcol->buf,
                                totallen, totallen + nextcol->dsize-1,
                                (nextcol->dsize==0?DEFAULT_LONG_SIZE:nextcol->dsize));
                      else
                         fprintf(fpctl,"  %s CHAR(%d)", nextcol->buf,(nextcol->dsize==0?DEFAULT_LONG_SIZE:nextcol->dsize));
                   }
                   break;
            }
            /* Set for long type column */
            if (nextcol->dsize > DEFAULT_LONG_SIZE || nextcol->dsize == 0) 
                nextcol->dsize = DEFAULT_LONG_SIZE;
            /* add one more byte to store the ternimal char of string */
            sprintf(nextcol->fmtstr,"%%-%ds", nextcol->dsize);
            totallen = totallen + nextcol->dsize;
            nextcol->dsize ++;

            nextcol->colbuf = malloc(DEFAULT_ARRAY_SIZE * (nextcol->dsize));
            memset(nextcol->colbuf,0,DEFAULT_ARRAY_SIZE * (nextcol->dsize));

            if (odefin(cda, col + 1,
                       nextcol->colbuf, nextcol->dsize, nextcol->dbtype,
                       -1, nextcol->indp, (text *) 0, -1, -1,
                       nextcol->col_retlen,nextcol->col_retcode))
            {
                SQLError(lda,cda);
                return -1;
            }
        }
        if(fpctl != NULL)
           fprintf(fpctl,")\n");
        fputs("\n",(fp_log == NULL?stdout:fp_log));
        return col;
    }

  • 相关阅读:
    Java 中字符串的格式化
    Awk学习笔记
    Spring任务调度实战之Quartz Simple Trigger
    Subclipse vs. Subversive
    使用OpenSSL生成自用证书
    httpclient处理页面跳转
    巧用TableDiff(转)
    DBCC了解页面结构
    取最新一条SQL优化
    SQL 2008 到 SQL2012的镜像
  • 原文地址:https://www.cnblogs.com/harrychinese/p/Unload_Oracle_data_into_text_file_2.html
Copyright © 2011-2022 走看看