  • Using Multi-Character Delimiters in Hive (reposted)

    Method 1) Use the org.apache.hadoop.hive.contrib.serde2.RegexSerDe SerDe.

    1) Create the table

    -- Use ^|~ as the field delimiter

    CREATE TABLE tableex3 (id STRING, name STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    WITH SERDEPROPERTIES (
      "input.regex" = "^(.*)\\^\\|~(.*)$"
    )
    STORED AS TEXTFILE;

    2) Prepare the data

    1^|~wee

    2^|~do

    we^|~xml

    %^|~we

    3) Load the data

    load data local inpath '/var/lib/hadoop-hdfs/tee.txt' into table tableex3;

    4) Verify:

    select * from tableex3;

    +--------------+----------------+
    | tableex3.id  | tableex3.name  |
    +--------------+----------------+
    | 1            | wee            |
    | 2            | do             |
    | we           | xml            |
    | %            | we             |
    | NULL         | NULL           |
    +--------------+----------------+
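
    The pattern can be sanity-checked outside Hive before loading any data. The sketch below is only an approximation in Python: it assumes the intended pattern captures everything before and after a literal ^|~ (that is, ^(.*)\^\|~(.*)$), and it assumes a row that does not match the whole pattern comes back as all NULLs. The helper name parse_row and the trailing empty line in the sample are illustrative assumptions, not part of the original post.

```python
import re

# Assumed pattern: capture everything before and after a literal "^|~".
# In a Hive DDL string literal each backslash would be doubled ("\\^\\|~").
ROW_PATTERN = re.compile(r"^(.*)\^\|~(.*)$")


def parse_row(line):
    """Return (id, name), or (None, None) when the whole line does not match."""
    match = ROW_PATTERN.fullmatch(line)
    if match is None:
        # Mirrors the all-NULL row shown for an unmatched line.
        return (None, None)
    return match.groups()


# The four sample rows plus a hypothetical empty trailing line.
sample_lines = ["1^|~wee", "2^|~do", "we^|~xml", "%^|~we", ""]
rows = [parse_row(line) for line in sample_lines]
for row in rows:
    print(row)
```

    Under these assumptions, an empty line at the end of the data file is one plausible source of the NULL | NULL row in the query result.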

    Method 2) Use the org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe SerDe.

    -- Use ^|~ as the field delimiter

    CREATE TABLE multi_delim (col1 STRING, col2 STRING, col3 STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
    WITH SERDEPROPERTIES ("field.delim" = "^|~");

    cat /var/lib/hadoop-hdfs/tee3.txt

    1^|~wee^|~hi

    2^|~do^|~where

    we^|~xml^|~rice

    %^|~we^|~^|

    load data local inpath '/var/lib/hadoop-hdfs/tee3.txt' into table multi_delim;

    select * from multi_delim;

    +-------------------+-------------------+-------------------+
    | multi_delim.col1  | multi_delim.col2  | multi_delim.col3  |
    +-------------------+-------------------+-------------------+
    | 1                 | wee               | hi                |
    | 2                 | do                | where             |
    | we                | xml               | rice              |
    | %                 | we                | ^|                |
    |                   | NULL              | NULL              |
    +-------------------+-------------------+-------------------+
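
    The way a literal multi-character delimiter splits these rows can be imitated with a short Python sketch. It is not the SerDe's implementation: split_row is a hypothetical helper that splits on the literal string ^|~ and pads missing trailing columns with None, which is an assumption chosen to match the query output above. Rows with more fields than columns are not handled here.

```python
FIELD_DELIM = "^|~"   # the multi-character delimiter from field.delim
NUM_COLUMNS = 3       # col1, col2, col3


def split_row(line, delim=FIELD_DELIM, num_columns=NUM_COLUMNS):
    """Split on the literal delimiter and pad missing columns with None."""
    fields = line.split(delim)
    # Simplification: only pad short rows; extra fields are left untouched.
    return fields + [None] * (num_columns - len(fields))


# The sample rows plus a hypothetical empty trailing line.
sample_lines = ["1^|~wee^|~hi", "2^|~do^|~where", "we^|~xml^|~rice", "%^|~we^|~^|", ""]
for line in sample_lines:
    print(split_row(line))
```

    Note that the trailing ^| in the fourth row is kept as data because it is only part of the delimiter, and that an empty line yields an empty first column followed by two missing values.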

  • Original article: https://www.cnblogs.com/ilvutm/p/7704330.html