  • Working with HBase from Python 2.7.12

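    Prerequisites: HBase and Python 2.7 are already installed.

    Aside: it is best to set up a virtual environment of your own; all the steps below were run inside one.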

    (ma) hadoop@master:/usr/local/pycharm/bin$ sudo pip install thrift
    [sudo] password for hadoop:
    The directory '/home/hadoop/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
    The directory '/home/hadoop/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
    Collecting thrift
      Downloading thrift-0.10.0.zip (87kB)
        100% |████████████████████████████████| 92kB 415kB/s
    Requirement already satisfied: six>=1.7.2 in /usr/local/lib/python2.7/dist-packages (from thrift)
    Installing collected packages: thrift
      Running setup.py install for thrift ... done
    Successfully installed thrift-0.10.0
     
    (ma) hadoop@master:/usr/local/pycharm/bin$ sudo pip install hbase-thrift
    [sudo] password for hadoop:
    The directory '/home/hadoop/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
    The directory '/home/hadoop/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
    Collecting hbase-thrift
      Downloading hbase-thrift-0.20.4.tar.gz
    Requirement already satisfied: Thrift in /usr/local/lib/python2.7/dist-packages (from hbase-thrift)
    Requirement already satisfied: six>=1.7.2 in /usr/local/lib/python2.7/dist-packages (from Thrift->hbase-thrift)
    Installing collected packages: hbase-thrift
      Running setup.py install for hbase-thrift ... done
    Successfully installed hbase-thrift-0.20.4


    From HBase's bin directory, start the Thrift server with ./hbase-daemon.sh start thrift:
    hadoop@master:/opt/Hadoop/hbase-1.3.1/bin$ ./hbase-daemon.sh start thrift
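Before moving on, it can help to confirm that the Thrift server is actually listening. The sketch below only tries a plain TCP connection; the helper name thrift_port_open is made up for this example, and it assumes the server uses port 9090 on localhost, as the scripts later in this post do.

```python
from __future__ import print_function
import socket


def thrift_port_open(host='localhost', port=9090, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: it only checks that something is listening,
    not that the listener really speaks the HBase Thrift protocol.
    """
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == '__main__':
    print('Thrift server reachable:', thrift_port_open())
```

If this reports False, the scripts below will most likely fail at transport.open().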
    Start PyCharm.
    Note: launch it from inside the virtual environment; in other environments the programs may fail to run.
    (ma) hadoop@master:/usr/local/pycharm/bin$ ./pycharm.sh


    Reference: http://www.cnblogs.com/hitandrew/archive/2013/01/21/2870419.html (some of the examples in that document do not run correctly).

    Create an HBase table:

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

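    # Connect to the HBase Thrift server (9090 is its default port).
    transport = TSocket.TSocket('localhost', 9090)

    # Wrap the raw socket in a buffered transport.
    transport = TTransport.TBufferedTransport(transport)

    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()


    # Define one column family, 'cf', keeping a single version per cell.
    contents = ColumnDescriptor(name='cf:', maxVersions=1)
    client.createTable('test', [contents])

    print client.getTableNames()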


    Output:
    /usr/bin/python2.7 /home/py/PycharmProjects/ThirdTest/testThrift.py
    ['member', 'test']

    Process finished with exit code 0


    Running list in the hbase shell shows the test table that was just created.

    Insert data:
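The scripts in this post open the transport but never close it. One possible way to tidy that up is a small context manager, sketched below. The names open_transport and list_tables are invented for this example; list_tables assumes the same thrift and hbase-thrift packages installed above and a server on localhost:9090.

```python
from contextlib import contextmanager


@contextmanager
def open_transport(transport):
    """Open a Thrift transport and make sure it is closed afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


def list_tables(host='localhost', port=9090):
    """Return the table names reported by the HBase Thrift server."""
    # Imported here so open_transport stays usable without these packages.
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
    with open_transport(transport):
        return client.getTableNames()
```

With the Thrift server running, calling list_tables() should return the same list that the script above prints, and the connection is closed even if the call raises.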

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase

    from hbase.ttypes import *

    transport = TSocket.TSocket('localhost', 9090)

    transport = TTransport.TBufferedTransport(transport)

    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)

    transport.open()

    row = 'row-key1'

    mutations = [Mutation(column="cf:a", value="1")]
    client.mutateRow('test', row, mutations)

    Running scan 'test' in the hbase shell shows the row that was just inserted.

    hbase(main):001:0> scan 'test'
    ROW                   COLUMN+CELL                                               
     row-key1             column=cf:a, timestamp=1506406128150, value=1             
    1 row(s) in 0.3570 seconds
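When a row has several columns, writing the mutation list by hand gets repetitive. The following is a minimal sketch of building it from a plain dict. The helpers to_mutations and put_row are made up for this example, and the mutation class is passed in as a parameter (in real use, Mutation from hbase.ttypes) so the helpers do not depend on the hbase package themselves.

```python
def to_mutations(family, values, make_mutation):
    """Turn {'a': '1'} into one mutation per 'family:qualifier' column.

    make_mutation is any callable accepting column= and value= keywords,
    for example hbase.ttypes.Mutation.
    """
    return [make_mutation(column='%s:%s' % (family, qualifier), value=value)
            for qualifier, value in sorted(values.items())]


def put_row(client, table, row_key, family, values, make_mutation):
    """Write all the given columns of one row in a single mutateRow call."""
    client.mutateRow(table, row_key,
                     to_mutations(family, values, make_mutation))
```

With the client from the script above, a call might look like put_row(client, 'test', 'row-key2', 'cf', {'a': '2'}, Mutation).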


    Get a single row:

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)

    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)

    transport.open()

    tableName = 'test'
    rowKey = 'row-key1'

    result = client.getRow(tableName, rowKey)
    print result
    for r in result:
        print 'the row is ' , r.row
        print 'the values is ' , r.columns.get('cf:a').value



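    Output (the row had evidently been rewritten with value 2 since the insert shown earlier, hence the later timestamp):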

    /usr/bin/python2.7 /home/py/PycharmProjects/ThirdTest/getOneRow.py
    [TRowResult(columns={'cf:a': TCell(timestamp=1506406612641, value='2')}, row='row-key1')]
    the row is  row-key1
    the values is  2
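The TRowResult objects are a little awkward to work with directly. As a sketch, they can be flattened into ordinary dicts. The function name rows_to_dicts is invented here, and it relies only on the row, columns and value attributes visible in the output above.

```python
def rows_to_dicts(results):
    """Map each row key to a {column: value} dict, dropping timestamps.

    Works on any objects shaped like TRowResult (.row, .columns) whose
    column values are shaped like TCell (.value).
    """
    return dict(
        (r.row, dict((col, cell.value) for col, cell in r.columns.items()))
        for r in results
    )
```

Applied to the result printed above, this would give {'row-key1': {'cf:a': '2'}}; an empty result list simply yields an empty dict.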


    Query multiple rows:
    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)

    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()


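    tableName = 'test'
    # Empty start and stop rows scan the whole table; an empty column list returns all columns.
    scannerId = client.scannerOpenWithStop(tableName, '', '', [])

    # Fetch up to 10 rows in one call, then release the scanner on the server.
    result2 = client.scannerGetList(scannerId, 10)
    client.scannerClose(scannerId)

    print result2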

    Output:

    /usr/bin/python2.7 /home/py/PycharmProjects/ThirdTest/getMultiRow.py
    [TRowResult(columns={'cf:a': TCell(timestamp=1506406612641, value='2')}, row='row-key1'), TRowResult(columns={'cf:a': TCell(timestamp=1506406650902, value='2')}, row='row-key2')]
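The script above fetches at most 10 rows in a single call. For larger tables, one possible approach is to keep calling scannerGetList until it returns an empty list, and to close the scanner at the end. The generator scan_all below is a sketch with a made-up name; it assumes the four-argument scannerOpenWithStop used in this post (clients generated from newer HBase versions may expect an extra attributes argument).

```python
def scan_all(client, table, start_row='', stop_row='', columns=None, batch=100):
    """Yield the rows of a scan, fetching up to `batch` rows per call."""
    scanner_id = client.scannerOpenWithStop(
        table, start_row, stop_row, columns or [])
    try:
        while True:
            rows = client.scannerGetList(scanner_id, batch)
            if not rows:
                # An empty list means the scanner is exhausted.
                break
            for row in rows:
                yield row
    finally:
        # Release the scanner on the server even if iteration stops early.
        client.scannerClose(scanner_id)
```

With the client from the script above, a loop such as for r in scan_all(client, 'test'): print r.row would walk the whole table in batches.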

  • Original post: https://www.cnblogs.com/herosoft/p/8134173.html