Mainly used to filter results down to just the row keys, for tasks such as counting rows.
KeyOnlyFilter
The official API documentation describes it as follows:
A filter that will only return the key component of each KV (the value will be rewritten as empty).
This filter can be used to grab all of the keys without having to also grab the values.
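This filter can be used for counting rows, but it is less efficient than FirstKeyOnlyFilter: KeyOnlyFilter still returns the key of every KV in a row, whereas FirstKeyOnlyFilter returns only the first KV of each row.
If you need FirstKeyOnlyFilter, see my article on it at the following address: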
http://blog.csdn.NET/liuxiaochen123/article/details/7878580
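To make the efficiency difference concrete, here is a minimal sketch in plain Java. It does not call HBase at all: the class `KeyOnlySketch`, its `Cell` type, and all helper methods are hypothetical names that model a table in memory. It assumes the behaviour stated in the API docs, namely that KeyOnlyFilter returns every KV with its value emptied, while FirstKeyOnlyFilter returns only the first KV of each row.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy in-memory model, NOT the HBase API. All names here are hypothetical.
class KeyOnlySketch {
    // One cell: row key, column qualifier, and value bytes.
    static final class Cell {
        final String row;
        final String qualifier;
        final byte[] value;

        Cell(String row, String qualifier, byte[] value) {
            this.row = row;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Builds a table with the given number of rows and columns per row.
    static Map<String, List<Cell>> sampleTable(int rows, int columnsPerRow) {
        Map<String, List<Cell>> table = new LinkedHashMap<>();
        for (int r = 0; r < rows; r++) {
            String rowKey = "row" + r;
            List<Cell> cells = new ArrayList<>();
            for (int c = 0; c < columnsPerRow; c++) {
                cells.add(new Cell(rowKey, "col" + c, new byte[100]));
            }
            table.put(rowKey, cells);
        }
        return table;
    }

    // Models KeyOnlyFilter: every cell comes back, with its value emptied.
    static List<Cell> keyOnly(Map<String, List<Cell>> table) {
        List<Cell> out = new ArrayList<>();
        for (List<Cell> cells : table.values()) {
            for (Cell c : cells) {
                out.add(new Cell(c.row, c.qualifier, new byte[0]));
            }
        }
        return out;
    }

    // Models FirstKeyOnlyFilter: only the first cell of each row comes back.
    static List<Cell> firstKeyOnly(Map<String, List<Cell>> table) {
        List<Cell> out = new ArrayList<>();
        for (List<Cell> cells : table.values()) {
            if (!cells.isEmpty()) {
                out.add(cells.get(0));
            }
        }
        return out;
    }

    // Counts distinct row keys among the returned cells.
    static int countRows(List<Cell> cells) {
        Set<String> rows = new HashSet<>();
        for (Cell c : cells) {
            rows.add(c.row);
        }
        return rows.size();
    }

    public static void main(String[] args) {
        Map<String, List<Cell>> table = sampleTable(3, 4);
        List<Cell> a = keyOnly(table);
        List<Cell> b = firstKeyOnly(table);
        System.out.println("keyOnly: " + a.size() + " cells, " + countRows(a) + " rows");
        System.out.println("firstKeyOnly: " + b.size() + " cells, " + countRows(b) + " rows");
    }
}
```

For a table of 3 rows with 4 columns each, both approaches arrive at the same row count, but the KeyOnlyFilter model hands back 12 cells while the FirstKeyOnlyFilter model hands back only 3. The wider the rows, the larger that gap becomes.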
Sample code for KeyOnlyFilter follows. It is fairly simple and only meant to convey the idea:
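    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

    // tableKeyword (the keyword table) and log are fields of the enclosing class.
    public int getCount1() {
        long bef = System.currentTimeMillis();
        int i = 0;
        ResultScanner rs = null;
        try {
            Scan s = new Scan();
            s.setCaching(500);        // fetch 500 rows per round trip
            s.setCacheBlocks(false);  // keep a full scan from filling the block cache
            s.setFilter(new KeyOnlyFilter());
            rs = tableKeyword.getScanner(s);
            // Each Result is one row, so counting Results counts rows.
            for (Result r : rs) {
                i++;
            }
        } catch (IOException e) {
            log.warn("Scan of keyword table failed", e);
        } finally {
            // Close the scanner even when the scan fails; rs stays null if getScanner threw.
            if (rs != null) {
                rs.close();
            }
        }
        long now = System.currentTimeMillis();
        log.warn("Total rows in keyword table: " + i + ", elapsed time (s): " + (now - bef) / 1000.0);
        return i;
    }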
It is best to set the following three parameters, otherwise the scan slows down considerably:
tableKeyword.setScannerCaching(500);
s.setCaching(500);
s.setCacheBlocks(false);
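As a rough illustration of why the caching value matters, the sketch below estimates how many client-to-server round trips a full scan needs. This is a simplified model and an assumption on my part, not HBase code: it treats each round trip as returning up to `caching` rows and ignores any extra calls the client makes to open or close the scanner. The class `ScanCachingSketch` and the method `roundTrips` are hypothetical names.

```java
// Simplified model, NOT the HBase API: estimates scan round trips for a caching value.
class ScanCachingSketch {
    // Each round trip is assumed to return up to `caching` rows, so the
    // estimate is the ceiling of rows / caching.
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching must be > 0");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;
        for (int caching : new int[] {1, 100, 500}) {
            System.out.println("caching=" + caching + " -> about "
                    + roundTrips(rows, caching) + " round trips");
        }
    }
}
```

Under this model, scanning one million rows takes about 1,000,000 round trips with a caching value of 1 and about 2,000 with a caching value of 500. A larger value also means more rows held in client memory per batch, so 500 is a starting point to tune rather than a fixed rule.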
Overall, these settings can save a lot of time.