zoukankan      html  css  js  c++  java
  • Apache Drill

    https://www.tutorialspoint.com/apache_drill/apache_drill_querying_data_using_hbase.htm

    HBase is a distributed column-oriented database built on top of the Hadoop file system. It is a part of the Hadoop ecosystem that provides random real-time read/write access to data in the Hadoop File System.One can store the data in HDFS either directly or through HBase.

    The following steps are used to query HBase data in Apache Drill.

    How to Start Hadoop and HBase?

    Step 1: Prerequisites

    Before moving on to querying HBase data, you must need to install the following −

    • Java installed version 1.7 or greater
    • Hadoop
    • HBase

    Step 2: Enable Storage Plugin

    After successful installation navigate to Apache Drill web console and select the storage menu option as shown in the following screenshot.

     
    Enable Storage Plugin

    Then choose HBase Enable option, after that go to the update option and now you will see the response as shown in the following program.

    {
       "type": "hbase",
       "config": {
          "hbase.zookeeper.quorum": "localhost",
          "hbase.zookeeper.property.clientPort": "2181"
       },
       "size.calculator.enabled": false,
       "enabled": true
    }

    Here the config settings “hbase.zookeeper.property.clientPort” : “2181” indicates ZooKeeper port id.

    In the embedded mode, it will automatically assign it to the ZooKeeper, but in the distributed mode, you must specify the ZooKeeper port id’s separately.

    Now, HBase plugin is enabled in Apache Drill.

    Step 3: Start Hadoop and HBase

    After enabling the plugin, first start your Hadoop server then start HBase.

    Creating a Table Using HBase Shell

    After Hadoop and HBase has been started, you can start the HBase interactive shell using “hbase shell” command as shown in the following query.

    Query

    /bin/hbase shell

    Then you will see the response as shown in the following program.

    Result

    hbase(main):001:0>
    

    To query HBase, you should complete the following steps −

    Create a Table

    Pipe the following commands to the HBase shell to create a “customer” table.

    Query

    hbase(main):001:0> create 'customers','account','address'

    Load Data into the Table

    Create a simple text file named “hbase-customers.txt” as shown in the following program.

    Example

    Example

    put 'customers','Alice','account:name','Alice'
    put 'customers','Alice','address:street','123 Ballmer Av'
    put 'customers','Alice','address:zipcode','12345'
    put 'customers','Alice','address:state','CA'
    put 'customers','Bob','account:name','Bob'
    put 'customers','Bob','address:street','1 Infinite Loop'
    put 'customers','Bob','address:zipcode','12345'
    put 'customers','Bob','address:state','CA'
    put 'customers','Frank','account:name','Frank'
    put 'customers','Frank','address:street','435 Walker Ct'
    put 'customers','Frank','address:zipcode','12345'
    put 'customers','Frank','address:state','CA'
    put 'customers','Mary','account:name','Mary'
    put 'customers','Mary','address:street','56 Southern Pkwy'
    put 'customers','Mary','address:zipcode','12345'
    put 'customers','Mary','address:state','CA'

    Now, issue the following command in hbase shell to load the data into a table.

    Query

    hbase(main):001:0> cat ../drill_sample/hbase/hbase-customers.txt | bin/hbase shell

    Query

    Now switch to Apache Drill shell and issue the following command.

    0: jdbc:drill:zk = local> select * from hbase.customers;

    Result

    +------------+---------------------+---------------------------------------------------------------------------+
    |  row_key   |        account      |                              address                                      |
    +------------+---------------------+---------------------------------------------------------------------------+
    | 416C696365 | {"name":"QWxpY2U="} | {"state":"Q0E=","street":"MTIzIEJhbGxtZXIgQXY=","zipcode":"MTIzNDU="}     |
    | 426F62     | {"name":"Qm9i"}     | {"state":"Q0E=","street":"MSBJbmZpbml0ZSBMb29w","zipcode":"MTIzNDU="}     |
    | 4672616E6B | {"name":"RnJhbms="} | {"state":"Q0E=","street":"NDM1IFdhbGtlciBDdA==","zipcode":"MTIzNDU="}     |
    | 4D617279   | {"name":"TWFyeQ=="} | {"state":"Q0E=","street":"NTYgU291dGhlcm4gUGt3eQ==","zipcode":"MTIzNDU="} |
    +------------+---------------------+---------------------------------------------------------------------------+
    

    The output will be 4 rows selected in 1.211 seconds.

    Apache Drill fetches the HBase data as a binary format, which we can convert into readable data using CONVERT_FROM function available in drill. Check and use the following query to get proper data from drill.

    Query

    0: jdbc:drill:zk = local> SELECT CONVERT_FROM(row_key, 'UTF8') AS customer_id,
    . . . . . . . . . . . > CONVERT_FROM(customers.account.name, 'UTF8') AS customers_name,
    . . . . . . . . . . . > CONVERT_FROM(customers.address.state, 'UTF8') AS customers_state,
    . . . . . . . . . . . > CONVERT_FROM(customers.address.street, 'UTF8') AS customers_street,
    . . . . . . . . . . . > CONVERT_FROM(customers.address.zipcode, 'UTF8') AS customers_zipcode
    . . . . . . . . . . . > FROM hbase.customers;

    Result

    +--------------+----------------+-----------------+------------------+--------------------+
    | customer_id  | customers_name | customers_state | customers_street | customers_zipcode  |
    +--------------+----------------+-----------------+------------------+--------------------+
    |    Alice     |     Alice      |       CA        | 123 Ballmer Av   |     12345          |
    |    Bob       |     Bob        |       CA        | 1 Infinite Loop  |     12345          |
    |    Frank     |     Frank      |       CA        | 435 Walker Ct    |     12345          |
    |    Mary      |     Mary       |       CA        | 56 Southern Pkwy |     12345          |
    +--------------+----------------+-----------------+------------------+--------------------+
  • 相关阅读:
    .NET正则基础之——平衡组
    网站架构探索负载均衡的方式
    Sql2005 全文索引详解
    构架师之路(4) 里氏代换原则(Liskov Substitution Principle, LSP)
    Ubuntu 9.04 server安装nginx+php(fastcgi)
    构架师之路(2) 开闭原则(OpenClosed Principle,OCP)
    SQL Server中常用全局变量介绍
    构架师之路(3) 1 IoC理论的背景
    linux ubuntu 命令集合
    理解nginx 和 php(fastcgi)的关系
  • 原文地址:https://www.cnblogs.com/panpanwelcome/p/13435124.html
Copyright © 2011-2022 走看看