  • [Data Analysis] Bulk-reading Elasticsearch with pandasticsearch

    1. Git repository

    https://github.com/onesuper/pandasticsearch

    2. Establishing a connection

    from pandasticsearch import DataFrame
    
    
    # Credentials are bytes so they fit the b'%s:%s' formatting used in the step 3 patch
    username = b'xxxx'
    password = b'xxxx'
    
    df = DataFrame.from_es(url='IP:9200',
                           index='xxxx',
                           username=username,
                           password=password,
                           doc_type='xxxx',
                           compat=5
                          )
    
    [Note] In testing, Python 3 runs into an encoding problem:
    TypeError: a bytes-like object is required, not 'str'
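
    The error can be reproduced without an Elasticsearch server. The sketch below is a minimal illustration, not the library's code: the helper name encode_py2_style and the credentials are hypothetical, and the function only mirrors the way the unpatched client.py line (shown in step 3) passes a formatted str to base64.b64encode, which accepts only bytes-like objects in Python 3.

```python
import base64

# Hypothetical placeholder credentials, used only for this illustration
username = "user"
password = "secret"


def encode_py2_style(user, pwd):
    # Mirrors the unpatched line: '%s:%s' % (...) yields a str,
    # and base64.b64encode() rejects str input on Python 3
    return base64.b64encode('%s:%s' % (user, pwd))


try:
    encode_py2_style(username, password)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

    Under Python 2 the same call worked because str was already a byte string, which is probably why the library code was written this way.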
    
    

    3. Patching the source

    In ~/anaconda3/lib/python3.7/site-packages/pandasticsearch/client.py, find the following code (around lines 59-61):

        if username is not None and password is not None:
            base64creds = base64.b64encode('%s:%s' % (username,password))
            req.add_header("Authorization", "Basic %s" % base64creds)
    

    Change it to:

        if username is not None and password is not None:
            base64creds = bytes.decode(base64.b64encode(b'%s:%s' % (username,password)))
            req.add_header("Authorization", "Basic %s" % base64creds)
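
    The patch above requires the credentials to be passed as bytes. As an alternative, here is a minimal sketch of building the same Basic auth header value so that both str and bytes credentials work. The function name basic_auth_header is hypothetical and not part of pandasticsearch, and UTF-8 is an assumed encoding for str credentials.

```python
import base64


def basic_auth_header(username, password):
    """Return the value for an HTTP Basic "Authorization" header."""
    # Accept str or bytes so callers do not have to pre-encode credentials
    # (UTF-8 is an assumption made for this sketch)
    if isinstance(username, str):
        username = username.encode("utf-8")
    if isinstance(password, str):
        password = password.encode("utf-8")
    # b64encode needs bytes in and returns bytes; decode for use in a str header
    token = base64.b64encode(username + b":" + password)
    return "Basic " + token.decode("ascii")


header_from_str = basic_auth_header("user", "secret")
header_from_bytes = basic_auth_header(b"user", b"secret")
print(header_from_str)
```

    Both calls produce the same header value, so the credentials in step 2 could be plain strings if client.py used logic along these lines.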
    

    4. Bulk-querying data

    limit() fetches the first 200,000 documents, and to_pandas() converts the result into a pandas DataFrame.

    pd_df = df.limit(200000).to_pandas()
    
  • Original article: https://www.cnblogs.com/skyell/p/11907627.html