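This post uses a student grades table as an example to demonstrate basic HBase usage.
name | grade | course:math | course:english |
Tom | 5 | 97 | 87 |
Jim | 4 | 89 | 80 |
Creating a table. Syntax: create 'table_name','column_family_1','column_family_2',...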
create 'student','name','grade','course'
desc 'student'
Result:
{ NAME=>'course', DATA_BLOCK_ENCODING=>'NONE', BLOOMFILTER=>'ROW', REPLICATION_SCOPE=>'0', VERSIONS=>'1', COMPRESSION=>'NONE', MIN_VERSIONS=>'0', TTL=>'FOREVER', KEEP_DELETED_CELLS=>'FALSE', BLOCKSIZE=>'65536', IN_MEMORY=>'false', BLOCKCACHE=>'true' }
{ NAME=>'grade', DATA_BLOCK_ENCODING=>'NONE', BLOOMFILTER=>'ROW', REPLICATION_SCOPE=>'0', VERSIONS=>'1', COMPRESSION=>'NONE', MIN_VERSIONS=>'0', TTL=>'FOREVER', KEEP_DELETED_CELLS=>'FALSE', BLOCKSIZE=>'65536', IN_MEMORY=>'false', BLOCKCACHE=>'true' }
{ NAME=>'name', DATA_BLOCK_ENCODING=>'NONE', BLOOMFILTER=>'ROW', REPLICATION_SCOPE=>'0', VERSIONS=>'1', COMPRESSION=>'NONE', MIN_VERSIONS=>'0', TTL=>'FOREVER', KEEP_DELETED_CELLS=>'FALSE', BLOCKSIZE=>'65536', IN_MEMORY=>'false', BLOCKCACHE=>'true' }
Adding a column family:
alter 'table_name',NAME=>'column_family'
hbase(main):068:0> alter 'student',NAME=>'age'
Updating all regions with the new schema...
1/1 regions updated.
Deleting a column family:
alter 'table_name',NAME=>'column_family',METHOD=>'delete'
alter 'student',NAME=>'test',METHOD=>'delete'
Dropping a table: the table must be disabled before it can be dropped.
disable 'student'
drop 'student'
Adding records to a table:
put 'table_name','rowkey','column_family:column_qualifier','value'
put 'student','001201509011001','name','Tom'
Result:
hbase(main):085:0> scan 'student'
ROW COLUMN+CELL
001201509011001 column=name:, timestamp=1447766388162, value=Tom
1 row(s) in 0.0090 seconds
Next, put the value 'Jim' into the name column family while keeping the same rowkey:
hbase(main):086:0> put 'student','001201509011001','name','Jim'
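The result is still a single row. Because the column family keeps only one version (VERSIONS=>'1'), the value stored under rowkey 001201509011001 is replaced by the second put.
scan 'student'
ROW COLUMN+CELL
001201509011001 column=name:, timestamp=1447766492893, value=Jim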
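The overwrite above can be pictured with a small in-memory model. This is only a sketch of the idea, not HBase code: `VersionedCell` and its methods are hypothetical names invented for illustration, and the model assumes a cell simply retains its newest `max_versions` values.

```python
class VersionedCell:
    """Toy model of one cell that retains at most `max_versions` values."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.versions = []  # (timestamp, value) pairs, newest first

    def put(self, timestamp, value):
        # Record the new version, order newest first, drop the surplus.
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        del self.versions[self.max_versions:]

    def get(self):
        # A plain read returns the newest retained value, if any.
        return self.versions[0][1] if self.versions else None


# Replay the two puts from the walkthrough against the same rowkey/column.
cell = VersionedCell(max_versions=1)
cell.put(1447766388162, "Tom")
cell.put(1447766492893, "Jim")
latest = cell.get()
print(latest)
```

With `max_versions=1` only the newer value survives, which matches the single `value=Jim` line in the scan output.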
put 'student','001201509011001','course:math','100'
put 'student','001201509011001','course:english','100'
hbase(main):096:0> scan 'student'
ROW COLUMN+CELL
001201509011001 column=course:english, timestamp=1447766828720, value=100
001201509011001 column=course:math, timestamp=1447766813289, value=100
001201509011001 column=grade:, timestamp=1447766751652, value=2
001201509011001 column=name:, timestamp=1447766492893, value=Jim
001201509011002 column=name:, timestamp=1447766547713, value=Tom
More rowkeys were added for testing, as shown below:
hbase(main):127:0> scan 'student'
ROW COLUMN+CELL
001201509011001 column=course:english, timestamp=1447766828720, value=100
001201509011001 column=course:math, timestamp=1447766813289, value=100
001201509011001 column=grade:, timestamp=1447766751652, value=2
001201509011001 column=name:, timestamp=1447766492893, value=Jim
001201509011002 column=course:english, timestamp=1447766987607, value=95
001201509011002 column=course:math, timestamp=1447767003501, value=80
001201509011002 column=grade:, timestamp=1447767073299, value=6
001201509011002 column=name:, timestamp=1447766547713, value=Tom
001201509011003 column=grade:, timestamp=1447767130750, value=5
001201509011004 column=grade:, timestamp=1447767139371, value=3
001201509011005 column=grade:, timestamp=1447767146338, value=3
001201509011006 column=course:math, timestamp=1447767489278, value=30
001201509011006 column=grade:, timestamp=1447767153088, value=2
001201509011007 column=course:math, timestamp=1447767474245, value=87
001201509011007 column=grade:, timestamp=1447767173296, value=2
001201509011008 column=grade:, timestamp=1447767181639, value=3
001201509011008 column=name:, timestamp=1447767278902, value=lucy
001201509011009 column=grade:, timestamp=1447767190450, value=10
001201509011009 column=name:, timestamp=1447767257259, value=Mike
001201509011010 column=grade:, timestamp=1447767198644, value=11
001201509011010 column=name:, timestamp=1447767236548, value=Peter
Reading the data in a given column by rowkey:
get 'table_name','rowkey','column_family:column_qualifier'
get 'student','001201509011001','name'
COLUMN CELL
name: timestamp=1447766492893, value=Jim
Counting the records in a table (count returns the number of rows, not the number of cells):
hbase(main):133:0* count 'student'
Result: 10
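To make the difference between rows and cells concrete, here is a rough Python sketch that replays the full scan of 'student' as a plain dictionary. The names `table` and `rows_with_family` are made up for this illustration and have nothing to do with the HBase client API.

```python
# Columns present under each rowkey, copied from the full scan of 'student'.
table = {
    "001201509011001": ["course:english", "course:math", "grade:", "name:"],
    "001201509011002": ["course:english", "course:math", "grade:", "name:"],
    "001201509011003": ["grade:"],
    "001201509011004": ["grade:"],
    "001201509011005": ["grade:"],
    "001201509011006": ["course:math", "grade:"],
    "001201509011007": ["course:math", "grade:"],
    "001201509011008": ["grade:", "name:"],
    "001201509011009": ["grade:", "name:"],
    "001201509011010": ["grade:", "name:"],
}


def rows_with_family(table, family):
    """Count the rows that hold at least one cell in the given column family."""
    prefix = family + ":"
    return sum(
        1 for columns in table.values()
        if any(col.startswith(prefix) for col in columns)
    )


row_count = len(table)                                    # what count reports
cell_count = sum(len(cols) for cols in table.values())    # individual cells
print(row_count, cell_count)
print(rows_with_family(table, "name"), rows_with_family(table, "course"))
```

The table holds 21 cells but only 10 distinct rowkeys, so the count is 10. Restricting to one column family can give a smaller number of rows, since not every row has a cell in every family.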
Querying all records in a given column family:
Syntax: scan 'table_name',{COLUMNS=>'column_family'}
hbase(main):134:0> scan 'student',{COLUMNS=>'name'}
ROW COLUMN+CELL
001201509011001 column=name:, timestamp=1447766492893, value=Jim
001201509011002 column=name:, timestamp=1447766547713, value=Tom
001201509011008 column=name:, timestamp=1447767278902, value=lucy
001201509011009 column=name:, timestamp=1447767257259, value=Mike
001201509011010 column=name:, timestamp=1447767236548, value=Peter
hbase(main):135:0> scan 'student',{COLUMNS=>'grade'}
ROW COLUMN+CELL
001201509011001 column=grade:, timestamp=1447766751652, value=2
001201509011002 column=grade:, timestamp=1447767073299, value=6
001201509011003 column=grade:, timestamp=1447767130750, value=5
001201509011004 column=grade:, timestamp=1447767139371, value=3
001201509011005 column=grade:, timestamp=1447767146338, value=3
001201509011006 column=grade:, timestamp=1447767153088, value=2
001201509011007 column=grade:, timestamp=1447767173296, value=2
001201509011008 column=grade:, timestamp=1447767181639, value=3
001201509011009 column=grade:, timestamp=1447767190450, value=10
001201509011010 column=grade:, timestamp=1447767198644, value=11
10 row(s) in 0.0220 seconds
hbase(main):136:0> scan 'student',{COLUMNS=>'course'}
ROW COLUMN+CELL
001201509011001 column=course:english, timestamp=1447766828720, value=100
001201509011001 column=course:math, timestamp=1447766813289, value=100
001201509011002 column=course:english, timestamp=1447766987607, value=95
001201509011002 column=course:math, timestamp=1447767003501, value=80
001201509011006 column=course:math, timestamp=1447767489278, value=30
001201509011007 column=course:math, timestamp=1447767474245, value=87
4 row(s) in 0.0130 seconds
Querying the records within a given range:
A scan can also take modifiers: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, or COLUMNS. With no modifiers, as in the examples above, every row is returned.
Syntax: scan 'table_name',{COLUMNS=>'column_family',LIMIT=>row_limit,STARTROW=>'start_rowkey',STOPROW=>'stop_rowkey'}
Fetch the first 3 records of the name column family:
hbase(main):012:0> scan 'student',{COLUMNS=>['name'],LIMIT=>3}
ROW COLUMN+CELL
001201509011001 column=name:, timestamp=1447766492893, value=Jim
001201509011002 column=name:, timestamp=1447766547713, value=Tom
001201509011008 column=name:, timestamp=1447767278902, value=lucy
Fetch up to 3 records of the name column family with rowkeys in [001201509011001, 001201509011008): the start is inclusive and the stop is exclusive. Only two rows come back, because only two rowkeys in that range have a value in name.
hbase(main):014:0> scan 'student',{COLUMNS=>['name'],LIMIT=>3,STARTROW=>'001201509011001',STOPROW=>'001201509011008'}
ROW COLUMN+CELL
001201509011001 column=name:, timestamp=1447766492893, value=Jim
001201509011002 column=name:, timestamp=1447766547713, value=Tom
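The half-open range and the LIMIT cap can be mimicked in a few lines of Python. This is only a sketch under simplifying assumptions: rowkeys are compared as plain strings, and `scan` here is a hypothetical helper written for this post, not a function from any HBase library.

```python
def scan(rowkeys, startrow=None, stoprow=None, limit=None):
    """Return rowkeys in sorted order; startrow is inclusive, stoprow exclusive.

    `limit`, when given, is assumed to be a positive integer.
    """
    selected = []
    for key in sorted(rowkeys):
        if startrow is not None and key < startrow:
            continue  # before the inclusive start
        if stoprow is not None and key >= stoprow:
            break     # reached the exclusive stop
        selected.append(key)
        if limit is not None and len(selected) >= limit:
            break     # LIMIT caps the number of rows returned
    return selected


# Rowkeys that have a value in the name column family (from the scan above).
name_rows = [
    "001201509011001",
    "001201509011002",
    "001201509011008",
    "001201509011009",
    "001201509011010",
]

first_three = scan(name_rows, limit=3)
in_range = scan(name_rows, startrow="001201509011001",
                stoprow="001201509011008", limit=3)
print(first_three)
print(in_range)
```

The second call returns only the rows ending in 1001 and 1002: the row ending in 1008 equals the stop key and is therefore excluded, so the limit of 3 is never reached.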
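Specifying two column families, name and grade (rowkey 001201509011010 does not appear because STOPROW is exclusive):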
hbase(main):018:0> scan 'student',{COLUMNS=>['name','grade'],STARTROW=>'001201509011001',STOPROW=>'001201509011010'}
ROW COLUMN+CELL
001201509011001 column=grade:, timestamp=1447766751652, value=2
001201509011001 column=name:, timestamp=1447766492893, value=Jim
001201509011002 column=grade:, timestamp=1447767073299, value=6
001201509011002 column=name:, timestamp=1447766547713, value=Tom
001201509011003 column=grade:, timestamp=1447767130750, value=5
001201509011004 column=grade:, timestamp=1447767139371, value=3
001201509011005 column=grade:, timestamp=1447767146338, value=3
001201509011006 column=grade:, timestamp=1447767153088, value=2
001201509011007 column=grade:, timestamp=1447767173296, value=2
001201509011008 column=grade:, timestamp=1447767181639, value=3
001201509011008 column=name:, timestamp=1447767278902, value=lucy
001201509011009 column=grade:, timestamp=1447767190450, value=10
001201509011009 column=name:, timestamp=1447767257259, value=Mike
Scans can also be restricted by TIMERANGE:
hbase(main):020:0> scan 'student',{COLUMNS=>['grade'],LIMIT => 3,TIMERANGE=>[1447766751652,1447767257259]}
ROW COLUMN+CELL
001201509011001 column=grade:, timestamp=1447766751652, value=2
001201509011002 column=grade:, timestamp=1447767073299, value=6
001201509011003 column=grade:, timestamp=1447767130750, value=5
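A TIMERANGE is evaluated as min <= timestamp < max, so the lower bound is inclusive and the upper bound is exclusive. The Python sketch below replays the grade timestamps from the earlier scan to show the effect; `scan_timerange` is a hypothetical helper for illustration, not part of HBase.

```python
# Timestamp of the grade cell under each rowkey, copied from the scan output.
grade_timestamps = {
    "001201509011001": 1447766751652,
    "001201509011002": 1447767073299,
    "001201509011003": 1447767130750,
    "001201509011004": 1447767139371,
    "001201509011005": 1447767146338,
    "001201509011006": 1447767153088,
    "001201509011007": 1447767173296,
    "001201509011008": 1447767181639,
    "001201509011009": 1447767190450,
    "001201509011010": 1447767198644,
}


def scan_timerange(cells, min_ts, max_ts, limit=None):
    """Return rowkeys whose timestamp lies in [min_ts, max_ts), in key order."""
    hits = [row for row in sorted(cells) if min_ts <= cells[row] < max_ts]
    return hits if limit is None else hits[:limit]


# Same bounds and LIMIT as the shell command above.
first_three = scan_timerange(grade_timestamps, 1447766751652, 1447767257259, limit=3)

# Using the third row's own timestamp as the upper bound leaves that row out.
upper_excluded = scan_timerange(grade_timestamps, 1447766751652, 1447767130750)
print(first_three)
print(upper_excluded)
```

The first query reproduces the three rows shown in the shell output. In the second, the row ending in 1003 is dropped because its timestamp equals the exclusive upper bound.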
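Deleting data:
The row with rowkey 001201509011002 has 4 columns:
hbase(main):024:0> get 'student','001201509011002'
COLUMN CELL
course:english timestamp=1447766987607, value=95
course:math timestamp=1447767003501, value=80
grade: timestamp=1447767073299, value=6
name: timestamp=1447766547713, value=Tom
Deleting a single cell (this delete removes only the grade column of the row, not the whole row; the shell's deleteall command removes an entire row):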
hbase(main):027:0> delete 'student','001201509011002','grade'
hbase(main):028:0> get 'student','001201509011002'
COLUMN CELL
course:english timestamp=1447766987607, value=95
course:math timestamp=1447767003501, value=80
name: timestamp=1447766547713, value=Tom
Scanning the grade column for that row confirms that it is gone:
hbase(main):033:0> scan 'student',{COLUMNS=>['grade'],STARTROW=>'001201509011002',STOPROW=>'001201509011003'}
ROW COLUMN+CELL
0 row(s) in 0.0080 seconds
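The difference between deleting one cell and deleting a whole row can be sketched with a dictionary. This is a deliberately simplified model: `delete_cell` and `delete_row` are hypothetical helpers, and real HBase records delete markers that are cleaned up later during compaction rather than removing data immediately.

```python
# One row of the 'student' table, as shown by the get before the delete.
table = {
    "001201509011002": {
        "course:english": "95",
        "course:math": "80",
        "grade:": "6",
        "name:": "Tom",
    }
}


def delete_cell(table, rowkey, column):
    """Remove a single column from a row; the other columns stay in place."""
    row = table.get(rowkey)
    if row is None:
        return
    row.pop(column, None)
    if not row:
        # A row with no cells left no longer shows up in a scan.
        del table[rowkey]


def delete_row(table, rowkey):
    """Remove every column stored under the rowkey."""
    table.pop(rowkey, None)


delete_cell(table, "001201509011002", "grade:")
remaining = sorted(table["001201509011002"])
print(remaining)

delete_row(table, "001201509011002")
print("001201509011002" in table)
```

After the cell delete, three columns remain under the rowkey, matching the get output above. Only the row-level delete makes the rowkey disappear entirely.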
Reference: Basic usage of the HBase shell, http://www.cnblogs.com/ggjucheng/p/3379607.html
Reference: HBase basic data operations explained in detail (complete edition), http://blog.csdn.net/u010967382/article/category/2387735