  • Working with HBase from Python 2.7.12

    Prerequisites: HBase and Python 2.7 are already installed.

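    Side note: it is best to create a virtual environment; all of the steps below were run with one activated. Be aware that sudo pip, as used in the log below, installs into the system-wide dist-packages directory (see the paths in the log) rather than into the virtual environment; drop sudo to install into the active environment.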

    (ma) hadoop@master:/usr/local/pycharm/bin$ sudo pip install thrift
    [sudo] password for hadoop:
    The directory '/home/hadoop/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
    The directory '/home/hadoop/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
    Collecting thrift
      Downloading thrift-0.10.0.zip (87kB)
        100% |████████████████████████████████| 92kB 415kB/s
    Requirement already satisfied: six>=1.7.2 in /usr/local/lib/python2.7/dist-packages (from thrift)
    Installing collected packages: thrift
      Running setup.py install for thrift ... done
    Successfully installed thrift-0.10.0
     
    (ma) hadoop@master:/usr/local/pycharm/bin$ sudo pip install hbase-thrift
    [sudo] password for hadoop:
    The directory '/home/hadoop/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
    The directory '/home/hadoop/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
    Collecting hbase-thrift
      Downloading hbase-thrift-0.20.4.tar.gz
    Requirement already satisfied: Thrift in /usr/local/lib/python2.7/dist-packages (from hbase-thrift)
    Requirement already satisfied: six>=1.7.2 in /usr/local/lib/python2.7/dist-packages (from Thrift->hbase-thrift)
    Installing collected packages: hbase-thrift
      Running setup.py install for hbase-thrift ... done
    Successfully installed hbase-thrift-0.20.4


    Start the Thrift server with hbase-daemon.sh from HBase's bin directory:
    hadoop@master:/opt/Hadoop/hbase-1.3.1/bin$ ./hbase-daemon.sh start thrift
    Start PyCharm.
    Note: start it from inside the virtual environment; in other environments the programs may fail to run.
    (ma) hadoop@master:/usr/local/pycharm/bin$ ./pycharm.sh
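Before running any of the scripts below, it can help to confirm that the Thrift server is actually accepting connections. The following is a minimal sketch using only the standard library. port_is_open is a hypothetical helper name, not part of thrift or hbase-thrift, and localhost:9090 is simply the address the scripts in this post connect to. It is written to run on both Python 2.7 and Python 3.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out.
        return False
    conn.close()
    return True


# Example call (assumes the Thrift server listens on localhost:9090):
#   port_is_open('localhost', 9090)
```

A False result most likely means the hbase-daemon.sh start thrift step did not succeed or the server listens on a different port.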


    Reference: http://www.cnblogs.com/hitandrew/archive/2013/01/21/2870419.html (some of the examples in that document do not run correctly).

    Create an HBase table:

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server started above (port 9090).
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    # Define one column family, 'cf', keeping a single version per cell.
    contents = ColumnDescriptor(name='cf:', maxVersions=1)
    client.createTable('test', [contents])

    # List all tables to confirm that 'test' now exists.
    print client.getTableNames()

    transport.close()


    Output:
    /usr/bin/python2.7 /home/py/PycharmProjects/ThirdTest/testThrift.py
    ['member', 'test']

    Process finished with exit code 0


    In the hbase shell, list now shows the newly created test table.
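Running the create script a second time fails, because createTable raises AlreadyExists when the table is already there. One way around that is to check getTableNames first. This is only a sketch: ensure_table is a hypothetical helper, client is assumed to be the Hbase.Client from the script above, and column_families is the same list of ColumnDescriptor objects passed to createTable.

```python
def ensure_table(client, table_name, column_families):
    """Create table_name unless the server already lists it.

    Returns True if the table was created, False if it already existed.
    """
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_families)
    return True


# Example call with the objects from the script above:
#   ensure_table(client, 'test', [ColumnDescriptor(name='cf:', maxVersions=1)])
```

The check and the create are two separate calls, so this does not protect against another client creating the table in between.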

    Insert data:

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server.
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    row = 'row-key1'

    # Each Mutation writes one cell; the column is given as 'family:qualifier'.
    mutations = [Mutation(column="cf:a", value="1")]
    client.mutateRow('test', row, mutations)

    transport.close()

    In the hbase shell, scan 'test' shows the row just inserted:

    hbase(main):001:0> scan 'test'
    ROW                   COLUMN+CELL                                               
     row-key1             column=cf:a, timestamp=1506406128150, value=1             
    1 row(s) in 0.3570 seconds
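Every script in this post repeats the same connection setup. A small context manager can make sure the transport is closed again even when a call fails. This is a sketch rather than anything provided by thrift or hbase-thrift: opened is a hypothetical name, and it only relies on the open() and close() methods that the transport objects above already have.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a Thrift transport and close it on exit, even if an error occurs."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


# Usage with the objects from the scripts in this post:
#   transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
#   client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
#   with opened(transport):
#       client.mutateRow('test', 'row-key1', [Mutation(column='cf:a', value='1')])
```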


    Get a single row:

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server.
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    tableName = 'test'
    rowKey = 'row-key1'

    # getRow returns a list of TRowResult objects.
    result = client.getRow(tableName, rowKey)
    print result
    for r in result:
        print 'the row is ' , r.row
        print 'the values is ' , r.columns.get('cf:a').value

    transport.close()



    Output (the cell was evidently rewritten with value 2 after the scan above; note the newer timestamp):

    /usr/bin/python2.7 /home/py/PycharmProjects/ThirdTest/getOneRow.py
    [TRowResult(columns={'cf:a': TCell(timestamp=1506406612641, value='2')}, row='row-key1')]
    the row is  row-key1
    the values is  2
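The loop above assumes that every row has a cf:a column; for a row without it, columns.get('cf:a') returns None and the .value access fails. If plain dictionaries are easier to work with, the result list can be flattened as sketched below. rows_to_dicts is a hypothetical helper that relies only on the row, columns and value attributes visible in the output above.

```python
def rows_to_dicts(results):
    """Map each row key to a {column name: cell value} dict."""
    return dict(
        (r.row, dict((col, cell.value) for col, cell in r.columns.items()))
        for r in results
    )


# For the result printed above this gives:
#   {'row-key1': {'cf:a': '2'}}
```

Timestamps are dropped here; keep the TCell objects instead if they matter.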


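    Query multiple rows:

    from thrift import Thrift
    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol

    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server.
    transport = TSocket.TSocket('localhost', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = Hbase.Client(protocol)
    transport.open()

    tableName = 'test'
    # Empty start and stop rows scan the whole table; an empty column
    # list selects all columns.
    scannerId = client.scannerOpenWithStop(tableName, '', '', [])

    # Fetch up to 10 rows from the scanner.
    result2 = client.scannerGetList(scannerId, 10)
    print result2

    # Release the scanner on the server, then disconnect.
    client.scannerClose(scannerId)
    transport.close()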

    Output:

    /usr/bin/python2.7 /home/py/PycharmProjects/ThirdTest/getMultiRow.py
    [TRowResult(columns={'cf:a': TCell(timestamp=1506406612641, value='2')}, row='row-key1'), TRowResult(columns={'cf:a': TCell(timestamp=1506406650902, value='2')}, row='row-key2')]
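A single scannerGetList(id, 10) call returns at most 10 rows, so a larger table needs repeated calls until an empty list comes back. The generator below sketches that loop. scan_rows is a hypothetical helper, client is assumed to be the Hbase.Client used above, and scanner_id is whatever scannerOpenWithStop returned.

```python
def scan_rows(client, scanner_id, batch_size=10):
    """Yield every row from an open scanner, then close the scanner."""
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:  # an empty list means the scanner is exhausted
                break
            for row in batch:
                yield row
    finally:
        # Runs when the scan finishes or the generator is closed early.
        client.scannerClose(scanner_id)


# Example call with the client from the script above:
#   for r in scan_rows(client, client.scannerOpenWithStop('test', '', '', [])):
#       print(r.row)
```

Fetching in batches keeps memory use bounded, and the finally clause releases the scanner on the server when the loop ends.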

  • Original article: https://www.cnblogs.com/herosoft/p/8134173.html