
Hive Learning Series, Part 5: Hive and Elasticsearch Interaction (very detailed; yes, I'm showing off again)

Working with Elasticsearch from Hive

I. Importing data from a Hive table into Elasticsearch

1. First, create the Elasticsearch index as follows

curl -XPUT '10.81.179.209:9200/zebra_info_demo?pretty' -H 'Content-Type: application/json' -d'
{
    "settings": {
        "number_of_shards":5,
        "number_of_replicas":2
    },
    "mappings": {
         "zebra_info": {
              "properties": {
                    "name" : {"type" : "text"},
                    "type": {"type": "text"},
                    "province": {"type": "text"},
                    "city": {"type": "text"},
                    "citycode": {"type": "text", "index": false},
                    "district": {"type": "text"},
                    "adcode": {"type": "text", "index": false},
                    "township": {"type": "text"},
                    "business_circle": {"type": "text"},
                    "formatted_address": {"type": "text"},
                    "location": {"type": "geo_point"},
                    "extensions": {
                      "type": "nested",
                      "properties": {
                        "map_lat": {"type": "double", "index": false},
                        "map_lng": {"type": "double", "index": false},
                        "avg_price": {"type": "double", "index": false},
                        "shops": {"type":"short", "index": false},
                        "good_comments": {"type":"short", "index": false},
                        "lvl": {"type":"short", "index": false},
                        "leisure_type": {"type": "text", "index": false},
                        "fun_type": {"type": "text", "index": false},
                        "numbers": {"type": "short", "index": false}
                       }
                   }
             }
        }
    }
}
'

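2. Check the Elasticsearch version and download the matching elasticsearch-hadoop-hive jar

The root endpoint of any node (here 10.81.179.209:9200) reports the Elasticsearch version in its version.number field.
This article uses version 5.6.9.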
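
As a minimal sketch, the version can be read from the command line by querying the node's root endpoint and pulling out the "number" field. The helper name es_version_from_json is made up for this illustration, and the sed pattern assumes the usual shape of the Elasticsearch root response.

```shell
# Hypothetical helper (not part of any tool): read the JSON that an
# Elasticsearch node returns for GET / on stdin and print version.number.
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Against a live cluster (node address taken from this article):
#   curl -s '10.81.179.209:9200' | es_version_from_json
```

The printed value is the version to look for when picking the jar in the next step.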

Download the jar from the Maven repository below.
https://repo.maven.apache.org/maven2/org/elasticsearch/elasticsearch-hadoop-hive/
Pick the version that matches your cluster.

3. Upload the downloaded jar to HDFS.

In this article the jar is stored at hdfs:///udf/elasticsearch-hadoop-hive-5.6.9.jar
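
The download and upload steps can be sketched in shell roughly as follows. The variable and function names are placeholders chosen for this example; the URL assumes the standard Maven repository layout under the directory linked above, and the function assumes curl and the hdfs client are on the PATH.

```shell
ES_VERSION="5.6.9"   # must match the Elasticsearch cluster version
JAR="elasticsearch-hadoop-hive-${ES_VERSION}.jar"
JAR_URL="https://repo.maven.apache.org/maven2/org/elasticsearch/elasticsearch-hadoop-hive/${ES_VERSION}/${JAR}"
HDFS_DIR="/udf"      # target directory used in this article

# Download the jar, create the HDFS directory if needed, and upload the jar.
fetch_and_upload_jar() {
  curl -fLO "$JAR_URL" &&
  hdfs dfs -mkdir -p "$HDFS_DIR" &&
  hdfs dfs -put -f "$JAR" "$HDFS_DIR/"
}
```

Calling fetch_and_upload_jar on a machine with HDFS access should leave the jar at the path given above, assuming /udf lives on the cluster's default file system.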

4. That's it; create the table and start using it

-- Clear jars registered earlier in this session, then register the connector
DELETE jars;
add jar hdfs:///udf/elasticsearch-hadoop-hive-5.6.9.jar;
DROP TABLE IF EXISTS zebra_info_demo;
-- External table backed by the Elasticsearch index created in step 1
CREATE EXTERNAL TABLE zebra_info_demo(
name string,
`type` string,
province string,
city string,
citycode string,
district string,
adcode string,
township string,
business_circle string,
formatted_address string,
location string, -- written as 'lat, lng' into the geo_point field
extensions STRUCT<map_lat:double, map_lng:double, avg_price:double, shops:smallint, good_comments:smallint, lvl:smallint, leisure_type:STRING, fun_type:STRING, numbers:smallint>
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes' = '10.81.179.209:9200',
'es.index.auto.create' = 'false',
'es.resource' = 'zebra_info_demo/zebra_info',
'es.read.metadata' = 'true',
'es.mapping.names' = 'name:name, type:type, province:province, city:city, citycode:citycode, district:district, adcode:adcode, township:township, business_circle:business_circle, formatted_address:formatted_address, location:location, extensions:extensions');

5. Fill it with data and you're done.

-- Columns are matched to the target table by position, not by alias
INSERT INTO TABLE zebra_info_demo
SELECT 
a.name,
a.brands,
a.province,
a.city,
null as citycode,
null as district,
null as adcode,
null as township,
a.business_circle,
null as formatted_address,
concat(a.map_lat, ', ', a.map_lng) as `location`,
named_struct('map_lat', cast(a.map_lat as double), 'map_lng',cast(a.map_lng as double) ,'avg_price', cast(0 as DOUBLE), 'shops', 0S,  'good_comments', 0S, 'lvl', cast(a.lv1 as SMALLINT), 'leisure_type', '', 'fun_type', '', 'numbers', 0S) as extensions
from medicalsite_childclinic a;

Result:
[partial screenshot in the original post]
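
One way to spot-check the load, sketched here under the assumption that the node from this article is reachable, is to ask Elasticsearch how many documents the type now holds through its _count endpoint and compare that with a count(*) on the source Hive table. The names ES_NODE, count_url and count_docs are invented for this example.

```shell
ES_NODE="10.81.179.209:9200"   # node address used throughout this article

# Build the _count URL for an index/type pair.
count_url() {
  printf '%s/%s/%s/_count?pretty' "$ES_NODE" "$1" "$2"
}

# Ask Elasticsearch for the document count (needs a reachable cluster).
count_docs() {
  curl -s -XGET "$(count_url "$1" "$2")"
}

# Example: count_docs zebra_info_demo zebra_info
```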

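II. Starting from existing Elasticsearch indices, create Hive tables that talk to Elasticsearch. You can even join them. In a word: awesome.

1. First, look at the indices and the data

The existing indices are as follows: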

curl -XPUT '10.81.179.209:9200/join_tests?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "cities": {
      "properties": {
        "province": {
          "type": "string"
        },
        "city": {
          "type": "string"
        }
      }
    }
  }
}
'

curl -XPUT '10.81.179.209:9200/join_tests1?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "shop": {
      "properties": {
        "name": {
          "type": "string"
        },
        "city": {
          "type": "string"
        }
      }
    }
  }
}
'

The data is as follows:
[screenshot: join_tests]

[screenshot: join_tests1]

2. Create the tables and write a pile of wicked SQL statements.

DELETE jars;
add jar hdfs:///udf/elasticsearch-hadoop-hive-5.6.9.jar;
-- EXTERNAL: the indices already exist in Elasticsearch; Hive only maps onto them
CREATE EXTERNAL TABLE join_tests(
    province string,
    city string
)STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes' = '10.81.179.209:9200',
'es.index.auto.create' = 'false',
'es.resource' = 'join_tests/cities',
'es.read.metadata' = 'true',
'es.mapping.names' = 'province:province, city:city');

CREATE EXTERNAL TABLE join_tests1(
    name string,
    city string
)STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes' = '10.81.179.209:9200',
'es.index.auto.create' = 'false',
'es.resource' = 'join_tests1/shop',
'es.read.metadata' = 'true',
'es.mapping.names' = 'name:name, city:city');

-- Join the two Elasticsearch-backed tables on city
SELECT 
    a.province,
    b.city,
    b.name
from join_tests a LEFT JOIN join_tests1 b on a.city = b.city;

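3. Result

[screenshot of the query result in the original post]

Closing remarks

One useful tool to recommend: Apache Hue. It can be used to manage HDFS files, run Hive queries, work with MySQL, and so on.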


Original post: https://www.cnblogs.com/unnunique/p/9362112.html