  • Learning the HBase shell, part 2

    This post uses a student grades table as an example to demonstrate basic HBase usage.
     

    name  grade  course
                 math  english
    Tom   5      97    87
    Jim   4      89    80

    Creating a table. Syntax: create 'table_name','column_family_1','column_family_2',...

    create 'student','name','grade','course'
    desc 'student'

    Result:

    {
        NAME=>'course',
        DATA_BLOCK_ENCODING=>'NONE',
        BLOOMFILTER=>'ROW',
        REPLICATION_SCOPE=>'0',
        VERSIONS=>'1',
        COMPRESSION=>'NONE',
        MIN_VERSIONS=>'0',
        TTL=>'FOREVER',
        KEEP_DELETED_CELLS=>'FALSE',
        BLOCKSIZE=>'65536',
        IN_MEMORY=>'false',
        BLOCKCACHE=>'true'
    }{
        NAME=>'grade',
        DATA_BLOCK_ENCODING=>'NONE',
        BLOOMFILTER=>'ROW',
        REPLICATION_SCOPE=>'0',
        VERSIONS=>'1',
        COMPRESSION=>'NONE',
        MIN_VERSIONS=>'0',
        TTL=>'FOREVER',
        KEEP_DELETED_CELLS=>'FALSE',
        BLOCKSIZE=>'65536',
        IN_MEMORY=>'false',
        BLOCKCACHE=>'true'
    }{
        NAME=>'name',
        DATA_BLOCK_ENCODING=>'NONE',
        BLOOMFILTER=>'ROW',
        REPLICATION_SCOPE=>'0',
        VERSIONS=>'1',
        COMPRESSION=>'NONE',
        MIN_VERSIONS=>'0',
        TTL=>'FOREVER',
        KEEP_DELETED_CELLS=>'FALSE',
        BLOCKSIZE=>'65536',
        IN_MEMORY=>'false',
        BLOCKCACHE=>'true'
    }

    Adding a column family:

    alter 'table_name',NAME=>'column_family_name'

    hbase(main):068:0> alter 'student',NAME=>'age'
    Updating all regions with the new schema...
    1/1 regions updated.

    Deleting a column family:

    alter 'table_name',NAME=>'column_family_name',METHOD=>'delete'

     alter 'student',NAME=>'test',METHOD=>'delete'

    Dropping a table: the table must be disabled first.

    disable 'student'
    
    drop 'student'

    Adding records to a table:

    put 'table_name','rowkey','column_family:column_name','value'

    put 'student','001201509011001','name','Tom'

    Result:

    hbase(main):085:0> scan 'student'
    ROW                           COLUMN+CELL
     001201509011001              column=name:, timestamp=1447766388162, value=Tom
    1 row(s) in 0.0090 seconds

    Next, put the value 'Jim' into the name column family while keeping the same rowkey:

    hbase(main):086:0> put 'student','001201509011001','name','Jim'

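    The result is still a single row: the value stored under rowkey 001201509011001 has been overwritten by the second put (the column family keeps only one version, VERSIONS=>'1').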

    scan 'student'
    ROW                           COLUMN+CELL
     001201509011001              column=name:, timestamp=1447766492893, value=Jim
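The overwrite above can be pictured with a small Python model. This is only a sketch under simplifying assumptions: the class `VersionedStore` and its methods are hypothetical names, not part of HBase, and it ignores how HBase actually stores and compacts data. It only shows that, with at most one version kept per cell, the put with the newer timestamp wins.

```python
from collections import defaultdict


class VersionedStore:
    """Toy model (hypothetical): (rowkey, column) -> list of (timestamp, value)."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.cells = defaultdict(list)

    def put(self, rowkey, column, value, timestamp):
        versions = self.cells[(rowkey, column)]
        versions.append((timestamp, value))
        # Newest timestamp first; keep at most max_versions entries.
        versions.sort(reverse=True)
        del versions[self.max_versions:]

    def get(self, rowkey, column):
        versions = self.cells.get((rowkey, column))
        return versions[0][1] if versions else None


store = VersionedStore(max_versions=1)
store.put("001201509011001", "name:", "Tom", 1447766388162)
store.put("001201509011001", "name:", "Jim", 1447766492893)
print(store.get("001201509011001", "name:"))  # Jim
```

With `max_versions=2` the same two puts would keep both values, which roughly corresponds to a column family created with a larger VERSIONS setting.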
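
    Next, add the math and english columns under the course column family (the puts for grade and for the second rowkey are not shown):

    put 'student','001201509011001','course:math','100'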
    put 'student','001201509011001','course:english','100'
    hbase(main):096:0> scan 'student'
    ROW                           COLUMN+CELL
     001201509011001              column=course:english, timestamp=1447766828720, value=100
     001201509011001              column=course:math, timestamp=1447766813289, value=100
     001201509011001              column=grade:, timestamp=1447766751652, value=2
     001201509011001              column=name:, timestamp=1447766492893, value=Jim
     001201509011002              column=name:, timestamp=1447766547713, value=Tom


    More rowkeys were added for testing, as shown below:

    hbase(main):127:0> scan 'student'
    ROW                           COLUMN+CELL
     001201509011001              column=course:english, timestamp=1447766828720, value=100
     001201509011001              column=course:math, timestamp=1447766813289, value=100
     001201509011001              column=grade:, timestamp=1447766751652, value=2
     001201509011001              column=name:, timestamp=1447766492893, value=Jim
     001201509011002              column=course:english, timestamp=1447766987607, value=95
     001201509011002              column=course:math, timestamp=1447767003501, value=80
     001201509011002              column=grade:, timestamp=1447767073299, value=6
     001201509011002              column=name:, timestamp=1447766547713, value=Tom
     001201509011003              column=grade:, timestamp=1447767130750, value=5
     001201509011004              column=grade:, timestamp=1447767139371, value=3
     001201509011005              column=grade:, timestamp=1447767146338, value=3
     001201509011006              column=course:math, timestamp=1447767489278, value=30
     001201509011006              column=grade:, timestamp=1447767153088, value=2
     001201509011007              column=course:math, timestamp=1447767474245, value=87
     001201509011007              column=grade:, timestamp=1447767173296, value=2
     001201509011008              column=grade:, timestamp=1447767181639, value=3
     001201509011008              column=name:, timestamp=1447767278902, value=lucy
     001201509011009              column=grade:, timestamp=1447767190450, value=10
     001201509011009              column=name:, timestamp=1447767257259, value=Mike
     001201509011010              column=grade:, timestamp=1447767198644, value=11
     001201509011010              column=name:, timestamp=1447767236548, value=Peter

    Viewing the data of a column by rowkey:

    get 'table_name','rowkey','column_family:column_name'

     get 'student','001201509011001','name'
    COLUMN                        CELL
     name:                        timestamp=1447766492893, value=Jim

    Counting the records in a table (count returns the number of rows, not the number of cells or column families):

    hbase(main):133:0* count 'student'

    Result: 10
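To make the difference between rows and cells concrete, here is a minimal Python sketch. The helper `count_rows` is a hypothetical name, and the list holds only the cells of the first three rowkeys from the scan output above, so the numbers are smaller than in the real table.

```python
# (rowkey, column) pairs for the first three rowkeys in the scan above.
cells = [
    ("001201509011001", "course:english"),
    ("001201509011001", "course:math"),
    ("001201509011001", "grade:"),
    ("001201509011001", "name:"),
    ("001201509011002", "course:english"),
    ("001201509011002", "course:math"),
    ("001201509011002", "grade:"),
    ("001201509011002", "name:"),
    ("001201509011003", "grade:"),
]


def count_rows(cells):
    """Count distinct rowkeys, which is what a row count reports."""
    return len({rowkey for rowkey, _column in cells})


print(len(cells))         # 9 cells
print(count_rows(cells))  # 3 rows
```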

    Querying all records of a given column family:

    Syntax: scan 'table_name',{COLUMNS=>'column_family'}

    hbase(main):134:0> scan 'student',{COLUMNS=>'name'}
    ROW                           COLUMN+CELL
     001201509011001              column=name:, timestamp=1447766492893, value=Jim
     001201509011002              column=name:, timestamp=1447766547713, value=Tom
     001201509011008              column=name:, timestamp=1447767278902, value=lucy
     001201509011009              column=name:, timestamp=1447767257259, value=Mike
     001201509011010              column=name:, timestamp=1447767236548, value=Peter
    hbase(main):135:0> scan 'student',{COLUMNS=>'grade'}
    ROW                           COLUMN+CELL
     001201509011001              column=grade:, timestamp=1447766751652, value=2
     001201509011002              column=grade:, timestamp=1447767073299, value=6
     001201509011003              column=grade:, timestamp=1447767130750, value=5
     001201509011004              column=grade:, timestamp=1447767139371, value=3
     001201509011005              column=grade:, timestamp=1447767146338, value=3
     001201509011006              column=grade:, timestamp=1447767153088, value=2
     001201509011007              column=grade:, timestamp=1447767173296, value=2
     001201509011008              column=grade:, timestamp=1447767181639, value=3
     001201509011009              column=grade:, timestamp=1447767190450, value=10
     001201509011010              column=grade:, timestamp=1447767198644, value=11
    10 row(s) in 0.0220 seconds
    hbase(main):136:0> scan 'student',{COLUMNS=>'course'}
    ROW                           COLUMN+CELL
     001201509011001              column=course:english, timestamp=1447766828720, value=100
     001201509011001              column=course:math, timestamp=1447766813289, value=100
     001201509011002              column=course:english, timestamp=1447766987607, value=95
     001201509011002              column=course:math, timestamp=1447767003501, value=80
     001201509011006              column=course:math, timestamp=1447767489278, value=30
     001201509011007              column=course:math, timestamp=1447767474245, value=87
    4 row(s) in 0.0130 seconds

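    Querying all records within a given range:

    A scan can also take modifiers: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, or COLUMNS. With no modifiers, as in the examples above, all rows are returned.

    Syntax: scan 'table_name',{COLUMNS=>'column_family',LIMIT=>number_of_records,STARTROW=>'start_rowkey',STOPROW=>'stop_rowkey'}

    Fetch the first 3 records of the name column family: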

    hbase(main):012:0> scan 'student',{COLUMNS=>['name'],LIMIT=>3}
    ROW                           COLUMN+CELL
     001201509011001              column=name:, timestamp=1447766492893, value=Jim
     001201509011002              column=name:, timestamp=1447766547713, value=Tom
     001201509011008              column=name:, timestamp=1447767278902, value=lucy

    Fetch up to 3 records of the name column family with rowkeys in [001201509011001, 001201509011008): the start row is inclusive and the stop row is exclusive.

    hbase(main):014:0> scan 'student',{COLUMNS=>['name'],LIMIT=>3,STARTROW=>'001201509011001',STOPROW=>'001201509011008'}
    ROW                           COLUMN+CELL
     001201509011001              column=name:, timestamp=1447766492893, value=Jim
     001201509011002              column=name:, timestamp=1447766547713, value=Tom
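The half-open range and the LIMIT can be sketched in Python as follows. This is a simplified model, not HBase code: the function `scan` and its parameter names are hypothetical, and it relies on the assumption that plain string comparison matches HBase's byte-wise rowkey ordering, which holds for these ASCII digit keys.

```python
def scan(rows, start_row=None, stop_row=None, limit=None):
    """Return (rowkey, value) pairs in rowkey order with start_row <= key < stop_row."""
    result = []
    for key in sorted(rows):
        if start_row is not None and key < start_row:
            continue  # before the inclusive start row
        if stop_row is not None and key >= stop_row:
            break  # the stop row itself is excluded
        if limit is not None and len(result) >= limit:
            break
        result.append((key, rows[key]))
    return result


# Rows that have a value in the name column family, taken from the output above.
names = {
    "001201509011001": "Jim",
    "001201509011002": "Tom",
    "001201509011008": "lucy",
    "001201509011009": "Mike",
    "001201509011010": "Peter",
}

print(scan(names, limit=3))
# [('001201509011001', 'Jim'), ('001201509011002', 'Tom'), ('001201509011008', 'lucy')]
print(scan(names, "001201509011001", "001201509011008", limit=3))
# [('001201509011001', 'Jim'), ('001201509011002', 'Tom')]
```

The second call returns only two rows even though the limit is 3, because 001201509011008 is the stop row and is therefore excluded.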

    Specifying two column families, name and grade:

    hbase(main):018:0> scan 'student',{COLUMNS=>['name','grade'],STARTROW=>'001201509011001',STOPROW=>'001201509011010'}
    ROW                           COLUMN+CELL
     001201509011001              column=grade:, timestamp=1447766751652, value=2
     001201509011001              column=name:, timestamp=1447766492893, value=Jim
     001201509011002              column=grade:, timestamp=1447767073299, value=6
     001201509011002              column=name:, timestamp=1447766547713, value=Tom
     001201509011003              column=grade:, timestamp=1447767130750, value=5
     001201509011004              column=grade:, timestamp=1447767139371, value=3
     001201509011005              column=grade:, timestamp=1447767146338, value=3
     001201509011006              column=grade:, timestamp=1447767153088, value=2
     001201509011007              column=grade:, timestamp=1447767173296, value=2
     001201509011008              column=grade:, timestamp=1447767181639, value=3
     001201509011008              column=name:, timestamp=1447767278902, value=lucy
     001201509011009              column=grade:, timestamp=1447767190450, value=10
     001201509011009              column=name:, timestamp=1447767257259, value=Mike

    You can also query by TIMERANGE:

    hbase(main):020:0> scan 'student',{COLUMNS=>['grade'],LIMIT => 3,TIMERANGE=>[1447766751652,1447767257259]}
    ROW                           COLUMN+CELL
     001201509011001              column=grade:, timestamp=1447766751652, value=2
     001201509011002              column=grade:, timestamp=1447767073299, value=6
     001201509011003              column=grade:, timestamp=1447767130750, value=5
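TIMERANGE is also a half-open interval: the minimum timestamp is included and the maximum is excluded. The following Python sketch illustrates this with the first few grade cells from the output above. The function names are hypothetical, and LIMIT is simplified to a limit on cells, which is equivalent here because each row has a single grade cell.

```python
def in_time_range(timestamp, min_stamp, max_stamp):
    """True when min_stamp <= timestamp < max_stamp."""
    return min_stamp <= timestamp < max_stamp


def scan_time_range(cells, min_stamp, max_stamp, limit=None):
    """Filter (rowkey, timestamp, value) cells by timestamp, keeping input order."""
    matched = [c for c in cells if in_time_range(c[1], min_stamp, max_stamp)]
    return matched if limit is None else matched[:limit]


grades = [
    ("001201509011001", 1447766751652, "2"),
    ("001201509011002", 1447767073299, "6"),
    ("001201509011003", 1447767130750, "5"),
    ("001201509011004", 1447767139371, "3"),
]

for cell in scan_time_range(grades, 1447766751652, 1447767257259, limit=3):
    print(cell)
# ('001201509011001', 1447766751652, '2')
# ('001201509011002', 1447767073299, '6')
# ('001201509011003', 1447767130750, '5')
```

The first cell is returned even though its timestamp equals the lower bound, which matches the shell output.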

    Deleting data:

    The row with rowkey=001201509011002 has 4 columns:

    hbase(main):024:0> get 'student','001201509011002'
    COLUMN                        CELL
     course:english               timestamp=1447766987607, value=95
     course:math                  timestamp=1447767003501, value=80
     grade:                       timestamp=1447767073299, value=6
     name:                        timestamp=1447766547713, value=Tom

    Delete one column (grade) from that row. Note that delete removes a single column, not the whole row:

    hbase(main):027:0> delete 'student','001201509011002','grade'
    hbase(main):028:0> get 'student','001201509011002'
    COLUMN                        CELL
     course:english               timestamp=1447766987607, value=95
     course:math                  timestamp=1447767003501, value=80
     name:                        timestamp=1447766547713, value=Tom

    Scanning the grade column family for that row shows that the value is gone:

    hbase(main):033:0> scan 'student',{COLUMNS=>['grade'],STARTROW=>'001201509011002',STOPROW=>'001201509011003'}
    ROW                           COLUMN+CELL
    0 row(s) in 0.0080 seconds
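The difference between removing one column and removing a whole row can be sketched in Python. This is a toy model with hypothetical function names; real HBase records deletions as markers rather than removing data immediately. The second function corresponds roughly to the shell's `deleteall` command, which removes an entire row.

```python
def delete_column(table, rowkey, column):
    """Remove one column from a row; drop the row once it has no cells left."""
    row = table.get(rowkey, {})
    row.pop(column, None)
    if not row:
        table.pop(rowkey, None)


def delete_all(table, rowkey):
    """Remove every column of the row."""
    table.pop(rowkey, None)


table = {
    "001201509011002": {
        "course:english": "95",
        "course:math": "80",
        "grade:": "6",
        "name:": "Tom",
    }
}

delete_column(table, "001201509011002", "grade:")
print(sorted(table["001201509011002"]))  # ['course:english', 'course:math', 'name:']

delete_all(table, "001201509011002")
print(table)  # {}
```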

    Reference: Basic usage of the HBase shell, http://www.cnblogs.com/ggjucheng/p/3379607.html

    Reference: HBase basic data operations explained in detail [complete edition, top quality], http://blog.csdn.net/u010967382/article/category/2387735

     
     
     
     
  • Original article: https://www.cnblogs.com/200911/p/4972557.html