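  • Adding LZO Compression Support to Hadoop

    Enabling LZO compression is useful for small clusters: it shrinks logs to roughly 1/3 of their original size, and decompression is also fairly fast.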

    Installation

    Preparing the jar

    1) Download the hadoop-lzo source project:
    https://github.com/twitter/hadoop-lzo/archive/master.zip

    2) The downloaded file, hadoop-lzo-master, is a zip archive. Unzip it, then build it with Maven to produce hadoop-lzo-0.4.20.jar.
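
    Step 2 can be sketched as a small shell helper. This is an illustration under assumptions the article does not state: that unzip and mvn are on the PATH, that the native LZO development headers are installed (hadoop-lzo compiles native code against them), and that the archive was saved as master.zip. The DRY_RUN switch and the function names are hypothetical conveniences, and the jar version produced depends on the checkout.

```shell
#!/usr/bin/env bash
# Sketch only: unpack the hadoop-lzo source archive and build it with Maven.
# Assumes unzip, mvn and the native LZO development headers are installed.

# Print the command when DRY_RUN=1 (hypothetical switch), otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

build_hadoop_lzo() {
  local archive="${1:-master.zip}"        # file downloaded in step 1 (assumed name)
  local srcdir="${2:-hadoop-lzo-master}"  # directory created by unzip
  run unzip -q "$archive"
  # Skipping tests is a choice made here to shorten the build,
  # not something the article prescribes.
  run mvn -f "$srcdir/pom.xml" clean package -DskipTests
  echo "jar expected under $srcdir/target/"
}

# Usage: DRY_RUN=1 build_hadoop_lzo master.zip
```

    Running it with DRY_RUN=1 first prints the commands without touching anything, which is a cheap way to check the paths before a long Maven build.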

    3) Copy the built hadoop-lzo-0.4.20.jar into hadoop-2.7.4/share/hadoop/common/

    [root@bigdata-01 common]$ pwd
    /export/servers/hadoop-2.7.4/share/hadoop/common
    [root@bigdata-01 common]$ ls
    hadoop-lzo-0.4.20.jar

    4) Use scp to copy hadoop-lzo-0.4.20.jar to the other nodes

    Configuration

    1) Add the following to core-site.xml to enable LZO compression
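
    The scp step might look like the loop below. It is a minimal sketch: the host names in the usage line are placeholders that do not come from the article, the function name is made up for illustration, and it assumes every node uses the same installation path and that passwordless SSH is already set up.

```shell
#!/usr/bin/env bash
# Sketch only: copy one file to the same path on every other node.

sync_to_nodes() {
  local file="$1"; shift
  local host
  for host in "$@"; do
    # SCP defaults to the real scp binary; set SCP=echo to preview the copies.
    ${SCP:-scp} "$file" "$host:$file"
  done
}

# Usage (hypothetical host names):
# sync_to_nodes /export/servers/hadoop-2.7.4/share/hadoop/common/hadoop-lzo-0.4.20.jar bigdata-02 bigdata-03
```

    The same helper works for any file that must be identical on all nodes, such as a configuration file.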

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <configuration>
    
    <property>
        <name>io.compression.codecs</name>
        <value>
            org.apache.hadoop.io.compress.GzipCodec,
            org.apache.hadoop.io.compress.DefaultCodec,
            org.apache.hadoop.io.compress.BZip2Codec,
            org.apache.hadoop.io.compress.SnappyCodec,
            com.hadoop.compression.lzo.LzoCodec,
            com.hadoop.compression.lzo.LzopCodec
        </value>
    </property>
    <property>
        <name>io.compression.codec.lzo.class</name>
        <value>com.hadoop.compression.lzo.LzoCodec</value>
    </property>
    
    </configuration>
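
    One way to confirm that a node actually picked up the setting is to read the effective value back and check it for both LZO classes. The helper below is a sketch whose function name is hypothetical. The usage line relies on `hdfs getconf -confKey`, which prints the value of a configuration key, and assumes the hdfs command is on the PATH.

```shell
#!/usr/bin/env bash
# Sketch only: succeed if a codec list contains both LZO codec classes.

has_lzo_codecs() {
  local codecs
  # Strip all whitespace so a multi-line <value> also matches.
  codecs="$(printf '%s' "$1" | tr -d '[:space:]')"
  case ",$codecs," in
    *,com.hadoop.compression.lzo.LzoCodec,*) ;;
    *) return 1 ;;
  esac
  case ",$codecs," in
    *,com.hadoop.compression.lzo.LzopCodec,*) ;;
    *) return 1 ;;
  esac
}

# Usage on a node with Hadoop installed:
# has_lzo_codecs "$(hdfs getconf -confKey io.compression.codecs)" && echo "LZO codecs configured"
```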

    2) Use scp to copy core-site.xml to the other nodes

    Testing

    1) Start Hive and create an LZO table

    CREATE TABLE lzo_test (
        id STRING,
        name STRING
    )
    PARTITIONED BY (dt STRING)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t'
    STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
    OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";

    2) Load the data

    load data inpath '/xxx/xxx/2019-07-25' into table lzo_test partition(dt='2019-07-25');
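
    After the load, a row count on the new partition is a quick way to see whether Hive can read the data through the LZO input format. The article does not include this step. The sketch below assumes the hive CLI is on the PATH and uses its -e option to run a single statement. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: count the rows Hive can read from one partition of a table.

count_partition_rows() {
  local table="$1" dt="$2"
  # HIVE defaults to the hive CLI; point it at another command to stub it out.
  ${HIVE:-hive} -e "SELECT COUNT(*) FROM ${table} WHERE dt='${dt}';"
}

# Usage:
# count_partition_rows lzo_test 2019-07-25
```

    A count of zero, or an error mentioning the codec class, would suggest rechecking the jar location and the core-site.xml settings on every node.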
  • Original article: https://www.cnblogs.com/blazeZzz/p/11244543.html