General Notes
The primary client interface to HBase is the HTable class in the org.apache.hadoop.hbase.client package.
It provides the user with all the functionality needed to store and retrieve data from HBase as well as delete obsolete values and so on. Before looking at the various methods this class provides let us address some general aspects of its usage.
Things to Remember
1. Create HTable instances only once, usually when your application starts.
2. Create a separate HTable instance for every thread you execute (or use HTablePool).
3. Updates are atomic on a per row basis.
CRUD Operations
The initial set of basic operations is often referred to as CRUD, meaning "Create, Read, Update, and Delete".
HBase has a set of those and we will look into each of them subsequently. They are provided by the HTable class, and the remainder of this chapter will refer directly to the methods without specifically mentioning the containing class again.
Put Method
This group of operations can be split into separate types: those that work on single rows and those that work on lists of rows. Since the latter involves some more complexity we will look at each group separately. Along the way you will also be introduced to accompanying client API features.
Single Puts
The very first method you may want to know about is the one that stores data in HBase.
Put(byte[] row, long ts, RowLock rowLock) // a Put can specify the row key, a timestamp, and a row lock
A code example follows; the main steps are to create a Put and then add columns to it.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class PutExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable");

    // Create a put for a specific row
    Put put = new Put(Bytes.toBytes("row1"));
    // Add a column, whose name is "colfam1:qual1", to the put
    put.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), Bytes.toBytes("val1"));
    put.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual2"), Bytes.toBytes("val2"));

    table.put(put);
  }
}
Client-side Write Buffer
Each put operation is effectively an RPC that transfers data from the client to the server and back. This is acceptable for a small number of operations, but not for applications that need to store thousands of values per second into a table.
The HBase API comes with a built-in client-side write buffer that collects put operations so that they are sent in one RPC call to the server(s). The global switch that controls whether it is used is represented by the following methods:
void setAutoFlush(boolean autoFlush)
boolean isAutoFlush()
By default the client-side buffer is not enabled. You activate the buffer by setting auto flush to false, by invoking:
table.setAutoFlush(false)
If auto flush is set to true (the default), the buffer is disabled and every put is flushed to the server immediately.
When you want to force the data to be written you can call another API function:
void flushCommits() throws IOException
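The effect of the auto flush switch can be illustrated without a cluster. The following is a minimal sketch in plain Java, not HBase code: the class WriteBufferSketch and its rpcsFor() helper are hypothetical names, and the model simply counts one simulated RPC per flush. It ignores any size limit that the real buffer may enforce.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a client-side write buffer; this is NOT the HBase implementation.
final class WriteBufferSketch {
    private final List<String> buffer = new ArrayList<>();
    private boolean autoFlush = true; // mirrors the default described above
    private int rpcCount = 0;         // number of simulated round trips

    void setAutoFlush(boolean autoFlush) {
        this.autoFlush = autoFlush;
    }

    // Queue one mutation; with auto flush enabled it is sent immediately.
    void put(String mutation) {
        buffer.add(mutation);
        if (autoFlush) {
            flushCommits();
        }
    }

    // Send everything that is currently buffered in one simulated RPC.
    void flushCommits() {
        if (buffer.isEmpty()) {
            return;
        }
        rpcCount++;
        buffer.clear();
    }

    int rpcCount() {
        return rpcCount;
    }

    // Hypothetical helper: simulated RPCs needed to store the given number of puts.
    static int rpcsFor(int puts, boolean autoFlush) {
        WriteBufferSketch table = new WriteBufferSketch();
        table.setAutoFlush(autoFlush);
        for (int i = 0; i < puts; i++) {
            table.put("row" + i);
        }
        table.flushCommits(); // explicit flush for anything still buffered
        return table.rpcCount();
    }

    public static void main(String[] args) {
        System.out.println("auto flush on:  " + rpcsFor(1000, true) + " simulated RPCs");
        System.out.println("auto flush off: " + rpcsFor(1000, false) + " simulated RPCs");
    }
}
```

In this model the buffered variant needs a single round trip because nothing is sent until flushCommits() is called, which is also why forgetting the final flush would lose the buffered puts.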
List of Puts
The client API has the ability to insert single Put instances as shown above, but it also has the advanced feature of batching operations together. This comes in the form of the following call:
void put(List<Put> puts) throws IOException
An example follows: keep adding Put instances to the list, then submit them all at once with put(puts).
List<Put> puts = new ArrayList<Put>();

Put put1 = new Put(Bytes.toBytes("row1"));
put1.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), Bytes.toBytes("val1"));
puts.add(put1);

Put put2 = new Put(Bytes.toBytes("row2"));
put2.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), Bytes.toBytes("val2"));
puts.add(put2);

table.put(puts);
Atomic Compare-and-Set
There is a special variation of the put calls that warrants its own section: check and put. The method signature is:
boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier, byte[] value, Put put) throws IOException
This call allows you to issue atomic, server-side mutations that are guarded by an accompanying check. If the check passes successfully the put operation is executed, otherwise it aborts the operation completely. It can be used to update data based on current, possibly related, values.
Such guarded operations are often used in systems that handle, for example, account balances, state transitions, or data processing.
The basic principle is that you read data at one point in time and process it. Once you are ready to write back the result you want to make sure that no other client has done the same already. You use the atomic check to compare that the value is not modified and therefore apply your value.
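It is actually quite simple. checkAndPut works in two steps: it checks whether the value of the cell identified by the first three parameters equals the given value parameter. If it does, the put is executed and the call returns true; otherwise nothing is written and the call returns false.
This is quite useful. HBase guarantees atomicity only within a single row, so a compare-and-set lets a client make sure that the row has not changed since it was read before writing back a result. As the example shows, the check and the put must still address the same row.
Put put1 = new Put(Bytes.toBytes("row1"));
put1.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), Bytes.toBytes("val1"));
// With a null value the check tests whether the column exists; it does not exist yet, so this returns true and put1 is applied
boolean res1 = table.checkAndPut(Bytes.toBytes("row1"), Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), null, put1);
System.out.println("Put applied: " + res1);

// colfam1:qual1 exists by now, so this returns false
boolean res2 = table.checkAndPut(Bytes.toBytes("row1"), Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), null, put1);
System.out.println("Put applied: " + res2);

Put put2 = new Put(Bytes.toBytes("row1"));
put2.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual2"), Bytes.toBytes("val2"));
// colfam1:qual1 was set to val1 above, so put2 is applied and colfam1:qual2 = val2
boolean res3 = table.checkAndPut(Bytes.toBytes("row1"), Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), Bytes.toBytes("val1"), put2);
System.out.println("Put applied: " + res3);

// Note that put3 targets row2, not row1
Put put3 = new Put(Bytes.toBytes("row2"));
put3.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), Bytes.toBytes("val3"));
// Note: this call fails with an exception, because the check and the put must address the same row; HBase does not support different rows here
boolean res4 = table.checkAndPut(Bytes.toBytes("row1"), Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), Bytes.toBytes("val1"), put3);
System.out.println("Put applied: " + res4);
As you can see, the check and the put must address the same row. The reason is locking: after the check, the row must stay locked so that it cannot change before the put is applied, and therefore a check on one row combined with a put on another row cannot be supported.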
Compare-and-Set (CAS) operations are very powerful, especially in distributed systems, with even more decoupled client processes. In providing these calls, HBase sets itself apart from other architectures that give no means to reason about concurrent updates performed by multiple, independent clients.
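The read, process, and write-back cycle described in this section can be tried out without a cluster. The following is a minimal sketch in plain Java, not HBase code: CheckAndPutSketch, addToBalance(), and concurrentTotal() are hypothetical names that imitate the checkAndPut() semantics on one in-memory row, including the convention that a null expected value means "the column must not exist yet".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Toy in-memory model of checkAndPut semantics; this is NOT the HBase implementation.
final class CheckAndPutSketch {
    // One "row": column name -> value. The object's monitor guards check plus put.
    private final Map<String, byte[]> row = new HashMap<>();

    // Writes newValue only if checkColumn currently holds the expected value.
    // A null expected value means the column must not exist yet.
    synchronized boolean checkAndPut(String checkColumn, byte[] expected,
                                     String putColumn, byte[] newValue) {
        byte[] current = row.get(checkColumn);
        boolean matches = (expected == null)
                ? current == null
                : Arrays.equals(expected, current);
        if (matches) {
            row.put(putColumn, newValue);
        }
        return matches;
    }

    synchronized byte[] get(String column) {
        return row.get(column);
    }

    // Read, process, write back; retry whenever another writer got in between.
    int addToBalance(int delta) {
        while (true) {
            byte[] seen = get("balance");
            int current = (seen == null)
                    ? 0
                    : Integer.parseInt(new String(seen, StandardCharsets.UTF_8));
            byte[] updated = Integer.toString(current + delta).getBytes(StandardCharsets.UTF_8);
            if (checkAndPut("balance", seen, "balance", updated)) {
                return current + delta;
            }
        }
    }

    // Hypothetical driver: several threads increment the same balance concurrently.
    static int concurrentTotal(int threads, int incrementsPerThread) {
        CheckAndPutSketch account = new CheckAndPutSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    account.addToBalance(1);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return Integer.parseInt(new String(account.get("balance"), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println("Final balance: " + concurrentTotal(4, 1000));
    }
}
```

No increment is lost here because a writer whose check fails simply rereads the value and tries again, which is the usual way to build safe updates on top of a compare-and-set primitive.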
Get Method
The next step in a client API is to retrieve what was just saved. For that the HTable is providing you with the Get call and matching classes. The operations are split into those that operate on a single row or those that retrieve multiple rows in one call.
Single Gets
First the method that is used to retrieve specific values from an HBase table.
Result get(Get get) throws IOException
Consider the following example:
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "testtable");

Get get = new Get(Bytes.toBytes("row1"));
get.addColumn(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"));

Result result = table.get(get);
byte[] val = result.getValue(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"));
System.out.println("Value: " + Bytes.toString(val));
This is straightforward. A Get object targets a single row. Because HBase stores data by column family, reading an entire row is comparatively inefficient, so you usually restrict the Get to specific columns of a column family.
Besides addColumn(), the following methods are available to narrow down a Get:
Get addFamily(byte[] family)
Get addColumn(byte[] family, byte[] qualifier)
Get setTimeRange(long minStamp, long maxStamp) throws IOException
Get setTimeStamp(long timestamp)
Get setMaxVersions()
Get setMaxVersions(int maxVersions) throws IOException
In addition, HBase treats all data as byte[] for the sake of generality, so the Bytes class provides helper functions for converting between byte[] and other types:
static String toString(byte[] b)
static boolean toBoolean(byte[] b)
static long toLong(byte[] bytes)
static float toFloat(byte[] bytes)
static int toInt(byte[] bytes)
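To get a feel for what such conversions involve, here is a minimal sketch that uses only the JDK. It is a stand-in, not the Bytes class: ByteConversionSketch and its method names are hypothetical, and the encodings chosen (UTF-8 for strings, big-endian fixed-width for numbers, which is the java.nio.ByteBuffer default) are assumptions of this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// JDK-only stand-in for byte[] conversion helpers; this is NOT the HBase Bytes class.
final class ByteConversionSketch {
    static byte[] fromString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String asString(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    // Fixed-width, big-endian encoding of a long (8 bytes).
    static byte[] fromLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static long asLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    // Fixed-width, big-endian encoding of an int (4 bytes).
    static byte[] fromInt(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    static int asInt(byte[] b) {
        return ByteBuffer.wrap(b).getInt();
    }

    public static void main(String[] args) {
        byte[] raw = fromLong(42L);
        System.out.println("bytes used for a long: " + raw.length);
        System.out.println("decoded long: " + asLong(raw));
        System.out.println("decoded string: " + asString(fromString("val1")));
    }
}
```

The stored bytes carry no type information, so the reader has to decode a cell with the same type that the writer used when encoding it.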
List of Gets
Another symmetry to the put() calls is that you can ask for more than one row using a single request. It allows you to quickly and efficiently retrieve related data, or completely random data if required, from the remote servers.
byte[] cf1 = Bytes.toBytes("colfam1");
byte[] qf1 = Bytes.toBytes("qual1");
byte[] qf2 = Bytes.toBytes("qual2");
byte[] row1 = Bytes.toBytes("row1");
byte[] row2 = Bytes.toBytes("row2");

List<Get> gets = new ArrayList<Get>();

Get get1 = new Get(row1);
get1.addColumn(cf1, qf1);
gets.add(get1);

Get get2 = new Get(row2);
get2.addColumn(cf1, qf1);
gets.add(get2);

Get get3 = new Get(row2);
get3.addColumn(cf1, qf2);
gets.add(get3);

Result[] results = table.get(gets);

System.out.println("First iteration...");
for (Result result : results) {
  String row = Bytes.toString(result.getRow());
  System.out.print("Row: " + row + " ");
  byte[] val = null;
  if (result.containsColumn(cf1, qf1)) {
    val = result.getValue(cf1, qf1);
    System.out.println("Value: " + Bytes.toString(val));
  }
  if (result.containsColumn(cf1, qf2)) {
    val = result.getValue(cf1, qf2);
    System.out.println("Value: " + Bytes.toString(val));
  }
}

System.out.println("Second iteration...");
for (Result result : results) {
  for (KeyValue kv : result.raw()) {
    System.out.println("Row: " + Bytes.toString(kv.getRow()) + " Value: " + Bytes.toString(kv.getValue()));
  }
}
The list-based get code is easy to follow: once you have the results, you can read values through convenience methods such as getValue().
The second way of reading values is also interesting: result.raw() lets you read the result directly as raw KeyValue instances.
Delete Method
You are now able to create, read and update data in HBase tables. What is left is being able to delete from it. And surely you may have guessed by now that the HTable is providing you with a method of exactly that name. Along with a matching class aptly named Delete.
Single Deletes
The variant of the delete() call that takes a single Delete instance is:
void delete(Delete delete) throws IOException
The example is straightforward as well:
Delete delete = new Delete(Bytes.toBytes("row1"));
delete.setTimestamp(1); // Set timestamp for row deletes
delete.deleteColumn(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"), 1); // Delete a specific version in one column
delete.deleteColumns(Bytes.toBytes("colfam2"), Bytes.toBytes("qual1")); // Delete all versions in one column
delete.deleteColumns(Bytes.toBytes("colfam2"), Bytes.toBytes("qual3"), 15); // Delete the given and all older versions in one column
delete.deleteFamily(Bytes.toBytes("colfam3")); // Delete entire family, all columns and versions
delete.deleteFamily(Bytes.toBytes("colfam3"), 3); // Delete the given and all older versions in the entire family
table.delete(delete);
table.close();
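The difference between deleting one version, deleting a version together with all older ones, and deleting everything is easy to mix up. The following is a minimal sketch in plain Java, not HBase code: DeleteVersionsSketch and its methods are hypothetical names that model the versions of a single column as a map sorted by timestamp. The model removes entries eagerly, which is a simplification of what a real server does.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of the versions of ONE column; this is NOT the HBase implementation.
final class DeleteVersionsSketch {
    // timestamp -> value, sorted by timestamp
    private final NavigableMap<Long, String> versions = new TreeMap<>();

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Analogous to deleteColumn(family, qualifier, ts): remove exactly one version.
    void deleteVersion(long timestamp) {
        versions.remove(timestamp);
    }

    // Analogous to deleteColumns(family, qualifier, ts): remove that version and all older ones.
    void deleteUpTo(long timestamp) {
        versions.headMap(timestamp, true).clear();
    }

    // Analogous to deleteColumns(family, qualifier): remove every version.
    void deleteAll() {
        versions.clear();
    }

    int versionCount() {
        return versions.size();
    }

    // Newest remaining value, or null if no version is left.
    String latest() {
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        DeleteVersionsSketch column = new DeleteVersionsSketch();
        for (long ts = 1; ts <= 5; ts++) {
            column.put(ts, "v" + ts);
        }
        column.deleteVersion(3);
        System.out.println("after deleteVersion(3): " + column.versionCount() + " versions");
        column.deleteUpTo(2);
        System.out.println("after deleteUpTo(2): " + column.versionCount() + " versions, latest " + column.latest());
    }
}
```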
Atomic Compare-and-Delete
You have seen in the section called “Atomic Compare-and-Set” how to use an atomic, conditional operation to insert data into a table. There is an equivalent call for deletes that gives you access to server-side, read-and-modify functionality:
boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier, byte[] value, Delete delete) throws IOException
See checkAndPut; the principle is the same.
Batch Operations
You have seen how you can add, retrieve, and remove data from a table using single, or list based operations. In this section we will look at API calls to batch different operations across multiple rows.
The list-based operations shown earlier, such as get(List<Get>) and delete(List<Delete>), are in fact largely implemented on top of batch().
private final static byte[] ROW1 = Bytes.toBytes("row1");
private final static byte[] ROW2 = Bytes.toBytes("row2");
private final static byte[] COLFAM1 = Bytes.toBytes("colfam1");
private final static byte[] COLFAM2 = Bytes.toBytes("colfam2");
private final static byte[] QUAL1 = Bytes.toBytes("qual1");
private final static byte[] QUAL2 = Bytes.toBytes("qual2");

List<Row> batch = new ArrayList<Row>();

Put put = new Put(ROW2);
put.add(COLFAM2, QUAL1, Bytes.toBytes("val5"));
batch.add(put);

Get get1 = new Get(ROW1);
get1.addColumn(COLFAM1, QUAL1);
batch.add(get1);

Delete delete = new Delete(ROW1);
delete.deleteColumns(COLFAM1, QUAL2);
batch.add(delete);

Get get2 = new Get(ROW2);
get2.addFamily(Bytes.toBytes("BOGUS")); // Fail, column family BOGUS does not exist
batch.add(get2);

Object[] results = new Object[batch.size()];
try {
  table.batch(batch, results);
} catch (Exception e) {
  System.err.println("Error: " + e);
}

for (int i = 0; i < results.length; i++) {
  System.out.println("Result[" + i + "]: " + results[i]);
}
This simple example shows that Put, Get, and Delete operations on different rows can all be processed in a single batch.
Batch operations bypass the client-side write buffer and are sent directly to the servers.
Be aware that you should not mix a Delete and a Put operation for the same row in one batch call. The operations will be applied in a different order that guarantees the best performance, but also causes unpredictable results. In some cases you may see fluctuating results due to race conditions.
Write Buffer and Batching
When you use the batch() functionality, the included Put instances are not buffered using the client-side write buffer. The batch() calls are synchronous and send the operations directly to the servers; no delay or other intermediate processing is involved. This differs from the put() calls, so choose carefully which one you want to use.
Row Locks
Mutating operations, such as put(), delete(), checkAndPut(), and so on, are executed exclusively, which means in a serial fashion, for each row, to guarantee row-level atomicity.
The region servers provide a row lock feature ensuring that only a client holding the matching lock can modify a row. In practice, though, most client applications do not provide an explicit lock but rely on the mechanism in place that guards each operation separately.
HBase's row-level atomicity relies on row locks. The system locks rows automatically, but it also exposes an API for acquiring locks explicitly.
When should you use explicit locks? The guidance is clear: you should avoid using row locks whenever possible.
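The idea of executing mutations serially per row can be modelled in a few lines. The following is a minimal sketch in plain Java, not HBase code: RowLockSketch and its methods are hypothetical names, and the model keeps one lock per row key so that a read-modify-write on the same row never interleaves, while different rows do not block each other.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Toy model of per-row locking; this is NOT the HBase implementation.
final class RowLockSketch {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final Map<String, Long> counters = new ConcurrentHashMap<>();

    // Read-modify-write on one row, executed while holding that row's lock.
    long increment(String row) {
        ReentrantLock lock = locks.computeIfAbsent(row, key -> new ReentrantLock());
        lock.lock();
        try {
            long next = counters.getOrDefault(row, 0L) + 1;
            counters.put(row, next);
            return next;
        } finally {
            lock.unlock();
        }
    }

    long get(String row) {
        return counters.getOrDefault(row, 0L);
    }

    // Hypothetical driver: several threads mutate the same row concurrently.
    static long runConcurrently(int threads, int incrementsPerThread) {
        RowLockSketch table = new RowLockSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    table.increment("row1");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return table.get("row1");
    }

    public static void main(String[] args) {
        System.out.println("row1 counter: " + runConcurrently(4, 5000));
    }
}
```

Holding a lock only for the duration of one short operation, as done here, is the behaviour most applications want; a lock held across several client calls blocks every other writer of that row for as long as it is held.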
Scans
After the basic CRUD type operations you will now be introduced to scans, a technique akin to cursors in database systems, which makes use of the underlying sequential, sorted storage layout that HBase provides.
Introduction
Using the scan operations is very similar to the get() methods.
And again, in symmetry to all the other functions there is also a supporting class, named Scan. But since scans are similar to iterators you do not have a scan() call but rather a getScanner(), which returns the actual scanner instance you need to iterate over. The available methods are:
ResultScanner getScanner(Scan scan) throws IOException
ResultScanner getScanner(byte[] family) throws IOException
ResultScanner getScanner(byte[] family, byte[] qualifier) throws IOException
The Scan class defines the conditions of a scan. getScanner() is driven by a Scan instance; for the last two variants above, a Scan instance is still created for you implicitly. The returned ResultScanner is an iterator from which you fetch data by calling next().
The Scan class offers the following constructors:
Scan()
Scan(byte[] startRow, Filter filter)
Scan(byte[] startRow)
Scan(byte[] startRow, byte[] stopRow)
The start row is always inclusive, while the end row is exclusive. This is often expressed as [startRow, stopRow) in interval notation.
Like Get, Scan supports the following additional restrictions:
Scan addFamily(byte[] family)
Scan addColumn(byte[] family, byte[] qualifier)
Scan setTimeRange(long minStamp, long maxStamp) throws IOException
Scan setTimeStamp(long timestamp)
Scan setMaxVersions()
Scan setMaxVersions(int maxVersions)
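The half-open interval [startRow, stopRow) combined with sorted row keys has a few consequences that are easier to see in code. The following is a minimal sketch in plain Java, not HBase code: ScanRangeSketch is a hypothetical class that keeps rows in a sorted map and uses String keys for readability. It assumes plain ASCII keys, for which String ordering matches a byte-wise lexicographic ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of a sorted table scanned over [startRow, stopRow); NOT the HBase implementation.
final class ScanRangeSketch {
    // Row key -> value, kept sorted by key.
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // Row keys in the half-open interval: start row inclusive, stop row exclusive.
    List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        ScanRangeSketch table = new ScanRangeSketch();
        for (String key : new String[] {"row-1", "row-2", "row-3", "row-10"}) {
            table.put(key, "value of " + key);
        }
        // Keys sort lexicographically, so "row-10" comes before "row-2".
        System.out.println(table.scan("row-1", "row-3"));
    }
}
```

Two things stand out: the stop row itself is never returned, and numeric suffixes sort as text, which is why row keys are often zero-padded when a numeric order is wanted.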
The ResultScanner Class
Scans do not ship all the matching rows in one RPC to the client but instead do this on a row basis. This obviously makes sense as rows could be very large and sending thousands, and most likely more, of them in one call would use up too many resources, and take a long time.
The ResultScanner converts the scan into a get-like operation, wrapping the Result instance for each row into an iterator functionality. It has a few methods of its own:
Result next() throws IOException
Result[] next(int nbRows) throws IOException
void close() // release the scanner
Make sure you release a scanner instance as timely as possible. An open scanner holds quite a few resources on the server side, which could accumulate to a large amount of heap space occupied.
Caching vs. Batching
So far each call to next() will be a separate RPC for each row – even when you use the next(int nbRows) method, because it is nothing else but a client side loop over next() calls. Obviously this is not very good for performance when dealing with small cells, thus it would make sense to fetch more than one row per RPC if possible. This is called scanner caching and is by default disabled.
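If every call to next() costs one RPC, efficiency is poor, especially when rows are small. Scanner caching addresses this by fetching several rows per RPC. The number of rows is configurable but should be chosen in moderation; otherwise both the duration of each call and the client's memory usage become a problem.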
So far you have learned to use the client-side scanner caching to make better use of bulk transfers between your client application and the remote regions servers.
There is an issue, though, that was mentioned in passing earlier: very large rows. Those potentially do not fit into the memory of the client process. HBase and its client API have an answer for that: batching. As opposed to caching, which operates on a row level, batching works on the column level instead. It controls how many columns are retrieved for every call to any of the next() functions provided by the ResultScanner instance. For example, setting the scan to use setBatch(5) would return five columns per Result instance.
Batching is the opposite of caching: it is meant for very large rows, where a single row is read in several pieces with the column as the unit.
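How caching and batching interact is mostly arithmetic. The following is a minimal sketch in plain Java under simplifying assumptions: ScanTransferSketch and its methods are hypothetical names, every row is assumed to have the same number of columns, and only round trips that carry data are counted, so any extra calls for opening or closing the scanner are ignored.

```java
// Back-of-the-envelope model of scanner caching and batching; NOT part of the HBase API.
final class ScanTransferSketch {
    // Result instances returned: each row is split into ceil(columnsPerRow / batch) pieces.
    static long results(long rows, long columnsPerRow, long batch) {
        long piecesPerRow = (columnsPerRow + batch - 1) / batch;
        return rows * piecesPerRow;
    }

    // Data-carrying round trips: up to `caching` Result instances travel per trip.
    static long fetchRpcs(long rows, long columnsPerRow, long batch, long caching) {
        long total = results(rows, columnsPerRow, batch);
        return (total + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 10;
        long columnsPerRow = 20;
        System.out.println("batch 5, caching 10: "
                + results(rows, columnsPerRow, 5) + " results, "
                + fetchRpcs(rows, columnsPerRow, 5, 10) + " data round trips");
        System.out.println("batch 100, caching 2000: "
                + results(rows, columnsPerRow, 100) + " results, "
                + fetchRpcs(rows, columnsPerRow, 100, 2000) + " data round trips");
    }
}
```

A small batch size bounds the memory needed per Result at the price of more Result instances, while a larger caching value trades client memory for fewer round trips.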