HBase Performance Tuning: Compression Tests


Overview:

1. Sequential write

2. Sequential read

3. Random write

4. Random read

5. Scanning data

0 Performance test tool

hbase org.apache.hadoop.hbase.PerformanceEvaluation

Usage: java org.apache.hadoop.hbase.PerformanceEvaluation

[--nomapred] [--rows=ROWS] [--table=NAME]

[--compress=TYPE] [--blockEncoding=TYPE] [-D<property=value>]* <command> <nclients>

Options:

nomapred Run multiple clients using threads (rather than use mapreduce)

rows Rows each client runs. Default: One million

sampleRate Execute test on a sample of total rows. Only supported by randomRead. Default: 1.0

table Alternate table name. Default: 'TestTable'

compress Compression type to use (GZ, LZO, ...). Default: 'NONE'

flushCommits Used to determine if the test should flush the table. Default: false

writeToWAL Set writeToWAL on puts. Default: True

presplit Create presplit table. Recommended for accurate perf analysis (see guide). Default: disabled

inmemory Tries to keep the HFiles of the CF inmemory as far as possible. Not guaranteed that reads are always served from memory. Default: false

latency Set to report operation latencies. Currently only supported by randomRead test. Default: False

Note: -D properties will be applied to the conf used.

For example:

-Dmapred.output.compress=true

-Dmapreduce.task.timeout=60000

Command:

filterScan Run scan test using a filter to find a specific row based on it's value (make sure to use --rows=20)

randomRead Run random read test

randomSeekScan Run random seek and scan 100 test

randomWrite Run random write test

scan Run scan test (read every row)

scanRange10 Run random seek scan with both start and stop row (max 10 rows)

scanRange100 Run random seek scan with both start and stop row (max 100 rows)

scanRange1000 Run random seek scan with both start and stop row (max 1000 rows)

scanRange10000 Run random seek scan with both start and stop row (max 10000 rows)

sequentialRead Run sequential read test

sequentialWrite Run sequential write test

Args:

nclients Integer. Required. Total number of clients (and HRegionServers) running: 1 <= value <= 500

Examples:

To run a single evaluation client:

$ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
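The usage text above can be turned into concrete command lines for the five tests named in the overview. The following is a minimal dry-run sketch: it only prints the commands and never contacts a cluster. The helper name pe_cmd and the table name lzo_test are hypothetical; the flags (--nomapred, --rows, --table, --compress) and the command names are taken from the usage text.

```shell
#!/bin/sh
# Dry-run sketch: prints PerformanceEvaluation command lines, does not run them.
# pe_cmd is a hypothetical helper; lzo_test is a hypothetical table name.

# pe_cmd COMMAND TABLE ROWS NCLIENTS [COMPRESS]
pe_cmd() {
    cmd=$1; table=$2; rows=$3; nclients=$4; compress=${5:-NONE}
    line="hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=$rows --table=$table"
    # Pass --compress only when a codec other than the default NONE is requested.
    if [ "$compress" != "NONE" ]; then
        line="$line --compress=$compress"
    fi
    echo "$line $cmd $nclients"
}

# The five tests from the overview, each without compression and with LZO.
for t in sequentialWrite sequentialRead randomWrite randomRead scan; do
    pe_cmd "$t" none_test 2000000 10
    pe_cmd "$t" lzo_test 2000000 10 LZO
done
```

Each printed line can be pasted into a terminal on a node with the hbase launcher on its PATH. Using a separate table per codec is an assumption made here so that the compressed and uncompressed runs do not share store files.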

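1 Sequential write test

Test baseline: 10 concurrent clients writing 2 million rows (--rows=2000000; the usage text above describes --rows as the rows each client runs).

1.1 Sequential write without compression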

hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=2000000 --nomapred --table=none_test sequentialWrite 10

1.2 Sequential write with LZO

hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=2000000 --nomapred --compress=LZO --table=none_test sequentialWrite 10
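To compare the two runs by elapsed time, each command can be wrapped in a small timer. This is a sketch only: time_run is a hypothetical helper, it measures wall-clock seconds for the whole process (JVM startup included), and "sleep 1" stands in for the hbase command so the example runs without a cluster.

```shell
#!/bin/sh
# time_run LABEL COMMAND [ARGS...]
# Runs COMMAND, then prints "LABEL: N s (exit status S)" with elapsed wall-clock seconds.
# Relies on "date +%s", which GNU and BSD date support.
time_run() {
    label=$1; shift
    start=$(date +%s)
    status=0
    "$@" || status=$?          # keep going even if the command fails
    end=$(date +%s)
    echo "$label: $((end - start)) s (exit status $status)"
    return $status
}

# Stand-in workload; replace "sleep 1" with one of the hbase commands above.
time_run "stand-in" sleep 1
```

Dividing the elapsed time by the number of rows written, in millions, would give an average per million rows, assuming the run completed without errors.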

1.3 Comparison with and without compression

Metric                                   No compression   LZO compression
Average time to insert 1 million rows
File size (10 million rows)              19.2G            4.7G
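The two file sizes above imply a compression ratio and a space saving, which can be worked out as follows. This is a small sketch; ratio is a hypothetical helper name, and it assumes both sizes are given in the same unit.

```shell
#!/bin/sh
# ratio RAW COMPRESSED
# Prints the compression ratio (raw / compressed) and the fraction of space saved.
ratio() {
    awk -v raw="$1" -v comp="$2" 'BEGIN {
        printf "ratio: %.2fx, space saved: %.1f%%\n", raw / comp, (1 - comp / raw) * 100
    }'
}

# File sizes for 10 million rows, from the table above.
ratio 19.2 4.7    # prints "ratio: 4.09x, space saved: 75.5%"
```

For this data set, LZO therefore shrinks the stored files to roughly a quarter of their uncompressed size.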

2 Sequential read test

2.1 Sequential read without compression

2.2 Sequential read with LZO

2.3 Comparison with and without compression



Original article: https://www.cnblogs.com/riordon/p/LZO.html
