  • Phoenix Tips (3): Salting

    HBase stores data sorted by row key, so if a table's row keys increase monotonically, the table is prone to hot spots on individual RegionServers. Salting mitigates this problem.

    create table H3 (id varchar not null primary key, cf1.a varchar, cf2.b varchar) SALT_BUCKETS=20;
     
    SALT_BUCKETS can only be specified when the table is created; it cannot be changed afterwards.

    alter table h1 set salt_buckets=10;
    Error: ERROR 1024 (42Y83): Salt bucket number may only be specified when creating a table. tableName=H1


    Things to note after salting:

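    a. A sequential scan may not return rows in natural row key order. If the sequential scan uses a LIMIT clause, the rows returned may differ from those returned by the same query on an unsalted table.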

    b. Split points: If no split points are specified for the table, the salted table is pre-split on salt byte boundaries to ensure load distribution among region servers even during the initial phase of the table. If users provide split points manually, they need to include a salt byte in the split points they provide.

    c. Row key ordering: Pre-splitting also ensures that all entries in a region start with the same salt byte and are therefore stored in sorted order. When doing a parallel scan across all region servers, we can take advantage of this property to perform a merge sort on the client side. The resulting scan is then returned sequentially, as if it came from a normal table.
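    Points a and c can be illustrated with a small simulation. This is only a sketch, not Phoenix code: the stored keys, their bucket assignment, and the helper names (scan_bucket, concatenated_scan, merged_scan) are made up for illustration. It shows that reading bucket after bucket yields rows out of natural order, so a LIMIT picks different rows, while merging the per-bucket sorted streams restores the natural order.

```python
import heapq

# Salted row keys as stored: one leading salt byte, then the original key.
# The bucket assignment here is made up for illustration.
STORED_ROWS = [b"\x00b", b"\x00d", b"\x01a", b"\x01e", b"\x02c"]
BUCKETS = 3

def scan_bucket(rows, salt):
    """Return one bucket's original keys in stored (sorted) order."""
    return [key[1:] for key in sorted(rows) if key[0] == salt]

def concatenated_scan(rows, buckets):
    """Read bucket after bucket: sorted within a bucket, not across buckets."""
    return [k for salt in range(buckets) for k in scan_bucket(rows, salt)]

def merged_scan(rows, buckets):
    """Merge the per-bucket sorted streams into one naturally ordered result."""
    return list(heapq.merge(*(scan_bucket(rows, s) for s in range(buckets))))

# Equivalent of LIMIT 2 on each kind of scan.
unordered_limit_2 = concatenated_scan(STORED_ROWS, BUCKETS)[:2]
ordered_limit_2 = merged_scan(STORED_ROWS, BUCKETS)[:2]
```

    Here unordered_limit_2 contains b and d, while ordered_limit_2 contains a and b, which is what an unsalted table would return.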

     


    Under the hood, salting rewrites the row key by adding a prefix:

     

    new_row_key = (hash(original_key) % BUCKETS_NUMBER) + original_key

    The data is spread across BUCKETS_NUMBER buckets, and all rows in a bucket share the same prefix. At query time, the scan runs against all buckets in parallel.
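    The prefixing can be sketched as follows. This is a minimal illustration, not Phoenix's implementation: it assumes the prefix is derived from a hash of the key itself, so the same key always maps to the same bucket, and it uses CRC32 purely as a stand-in hash. The function names are hypothetical. The last helper shows what "pre-split on salt byte boundaries" would look like under the assumption of one region per bucket.

```python
import zlib

SALT_BUCKETS = 20  # same value as in the CREATE TABLE example

def salt_byte(row_key: bytes, buckets: int = SALT_BUCKETS) -> int:
    """Derive a bucket number from the key itself (CRC32 is a stand-in hash)."""
    return zlib.crc32(row_key) % buckets

def salted_row_key(row_key: bytes, buckets: int = SALT_BUCKETS) -> bytes:
    """Prefix the original key with its one-byte salt."""
    return bytes([salt_byte(row_key, buckets)]) + row_key

def salt_split_points(buckets: int = SALT_BUCKETS) -> list:
    """Split points on salt byte boundaries, assuming one region per bucket."""
    return [bytes([b]) for b in range(1, buckets)]

# Monotonically increasing keys no longer sort next to each other once salted.
salted = [salted_row_key(f"row{n:04d}".encode()) for n in range(5)]
```

    Because the salt depends only on the key, a point lookup can recompute the prefix and go straight to the right bucket; a range scan still has to visit every bucket.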

  • Original article: https://www.cnblogs.com/leeeee/p/7276378.html