(11) Elasticsearch Mapping Explained

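In Elasticsearch (ES), indexing a document with a PUT request automatically creates the index and the type under it, and ES also creates a mapping. A mapping defines the data type of every field in a type, along with related attributes such as how each field is analyzed. When creating an index you can define field types and attributes up front, so that date fields are handled as dates, numeric fields as numbers, and string fields as strings. To explore mappings, first create a document:

PUT /myindex/article/1
{
  "post_date": "2018-05-10",
  "title": "Java",
  "content": "java is the best language",
  "author_id": 119
}

To view the mapping, run GET /myindex/article/_mapping. The result is:

{
  "myindex": {
    "mappings": {
      "article": {
        "properties": {
          "author_id": { "type": "long" },
          "content": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "post_date": { "type": "date" },
          "title": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          }
        }
      }
    }
  }
}

The response shows the index myindex and the type article.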

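The author_id field is of type long, content is text, post_date is date, and title is text. ES detected these field types automatically.

Every field in ES has a data type, and a mapping that ES creates automatically in this way is called a dynamic mapping.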
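The type detection described above can be approximated in a few lines. The following is a minimal Python sketch, not ES's actual implementation: the function names `infer_field_mapping` and `infer_mapping` are hypothetical, and the date check is a simplified regular expression standing in for ES's configurable date detection.

```python
import re

# Simplified stand-in for ES date detection: ISO-8601 dates such as 2018-05-10,
# optionally followed by a time. This regex is an assumption of the sketch.
_DATE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$"
)


def infer_field_mapping(value):
    """Guess a mapping for one JSON value, roughly as dynamic mapping would."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        if _DATE_RE.match(value):
            return {"type": "date"}
        # Other strings become analyzed text plus an exact-value keyword sub-field.
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def infer_mapping(doc):
    """Build a 'properties' block for a flat document."""
    return {"properties": {name: infer_field_mapping(v) for name, v in doc.items()}}


doc = {
    "post_date": "2018-05-10",
    "title": "Java",
    "content": "java is the best language",
    "author_id": 119,
}
mapping = infer_mapping(doc)
print(mapping["properties"]["author_id"])  # {'type': 'long'}
print(mapping["properties"]["post_date"])  # {'type': 'date'}
```

For the sample document, the sketch yields the same field types as the mapping ES returned: long, date, and text with a keyword sub-field.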

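ES supports the following data types:

(1) Core datatypes

String: text and keyword, the two types that replace the older string type. The text type is used to index long text. Before indexing, the text is analyzed and broken into individual terms, and those terms are what ES indexes and searches. text fields are not suited to sorting or aggregations. The keyword type is not analyzed; it can be used for filtering, sorting, and aggregations. A keyword field can be matched only by its complete, exact value.

Numeric: long, integer, short, byte, double, float

Date: date

Boolean: boolean

Binary: binary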

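Date and numeric fields are not analyzed, so a query has to match the whole value. String (text) fields are analyzed, so a query can match on individual words. For example:

Add the following two documents; together with the one added earlier, that makes 3 documents:

PUT /myindex/article/2
{
  "post_date": "2018-05-12",
  "title": "html",
  "content": "i like html",
  "author_id": 120
}

PUT /myindex/article/3
{
  "post_date": "2018-05-16",
  "title": "es",
  "content": "Es is distributed document store",
  "author_id": 110
}

Run the following queries; the results were:

GET /myindex/article/_search?q=post_date:2018    returns no hits

GET /myindex/article/_search?q=post_date:2018-05    returns no hits

GET /myindex/article/_search?q=post_date:2018-05-10    returns a hit

GET /myindex/article/_search?q=html    returns a hit

GET /myindex/article/_search?q=java    returns a hit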
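The difference between analyzed and whole-value fields can be modeled with a toy inverted index. This Python sketch is only a model of the results reported above, under stated assumptions: `analyze`, `build_index`, and `search` are hypothetical helpers, the analyzer is a crude lowercase word splitter, and post_date is treated as a single exact term, whereas real ES stores dates numerically and its handling of partial dates may differ between versions.

```python
import re


def analyze(text):
    """Very rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


docs = {
    1: {"post_date": "2018-05-10", "title": "Java", "content": "java is the best language"},
    2: {"post_date": "2018-05-12", "title": "html", "content": "i like html"},
    3: {"post_date": "2018-05-16", "title": "es", "content": "Es is distributed document store"},
}

# text fields are analyzed; post_date is kept as one whole value (assumption).
ANALYZED_FIELDS = {"title", "content"}


def build_index(documents):
    """Map (field, term) -> set of document ids."""
    index = {}
    for doc_id, doc in documents.items():
        for field, value in doc.items():
            terms = analyze(value) if field in ANALYZED_FIELDS else [value]
            for term in terms:
                index.setdefault((field, term), set()).add(doc_id)
    return index


def search(index, term, field=None):
    """Return sorted ids of documents containing the term, in one field or in any field."""
    hits = set()
    for (f, t), ids in index.items():
        if t == term and (field is None or f == field):
            hits |= ids
    return sorted(hits)


index = build_index(docs)
print(search(index, "2018", field="post_date"))        # []
print(search(index, "2018-05-10", field="post_date"))  # [1]
print(search(index, "html"))                           # [2]
print(search(index, "java"))                           # [1]
```

Because "java is the best language" is split into separate terms, the single word java finds document 1, while the partial value 2018 matches no whole post_date term.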

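(2) Complex datatypes

Array datatype: there is no dedicated array type, and the element type does not need to be declared separately. For example:

String array: ["one","two"]

Integer array: [1,2]

Array of arrays: [1,[2,3]] is equivalent to [1,2,3]

Array of objects: [{"name":"Mary","age":12},{"name":"John","age":10}]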
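The equivalence of [1,[2,3]] and [1,2,3] amounts to recursive flattening. A minimal Python sketch follows, with `flatten` as a hypothetical helper name; it illustrates the idea and is not how ES implements it.

```python
def flatten(values):
    """Recursively flatten nested lists into one flat list, preserving order."""
    flat = []
    for v in values:
        if isinstance(v, list):
            flat.extend(flatten(v))
        else:
            flat.append(v)
    return flat


print(flatten([1, [2, 3]]))  # [1, 2, 3]
```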

Object datatype: object, for a single JSON object

Nested datatype: nested, for arrays of JSON objects

For example:

PUT /lib/person/1
{
  "name": "Tom",
  "age": 25,
  "birthday": "1985-12-12",
  "address": {
    "country": "china",
    "province": "guangdong",
    "city": "shenzhen"
  }
}

The underlying storage format is:

{
  "name": ["Tom"],
  "age": [25],
  "birthday": ["1985-12-12"],
  "address.country": ["china"],
  "address.province": ["guangdong"],
  "address.city": ["shenzhen"]
}

PUT /lib/person/2
{
  "persons": [
    {"name": "lisi", "age": 27},
    {"name": "wangwu", "age": 26},
    {"name": "zhangsan", "age": 23}
  ]
}

The underlying storage format (with persons dynamically mapped as object, the default, rather than nested) is:

{
  "persons.name": ["lisi", "wangwu", "zhangsan"],
  "persons.age": [27, 26, 23]
}
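The flattening shown in the two storage formats above can be reproduced with a short recursive function. This is a Python sketch for illustration only; `flatten_doc` is a hypothetical helper, and it does not handle arrays nested directly inside arrays.

```python
def flatten_doc(doc, prefix=""):
    """Flatten a JSON-like document into {dotted.path: [values]}."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        # Treat a single value as a one-element array.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Inner objects contribute their fields under "path."
                for sub_path, sub_values in flatten_doc(item, path + ".").items():
                    flat.setdefault(sub_path, []).extend(sub_values)
            else:
                flat.setdefault(path, []).append(item)
    return flat


person2 = {
    "persons": [
        {"name": "lisi", "age": 27},
        {"name": "wangwu", "age": 26},
        {"name": "zhangsan", "age": 23},
    ]
}
print(flatten_doc(person2))
# {'persons.name': ['lisi', 'wangwu', 'zhangsan'], 'persons.age': [27, 26, 23]}
```

Note that the flattened form no longer records which name belongs to which age; preserving that association within each object is the reason the nested type exists.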
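(3) Geo datatypes

Geo-point datatype: geo_point, for latitude/longitude coordinates

Geo-shape datatype: geo_shape, for complex shapes such as polygons

(4) Specialised datatypes

IP datatype: ip, for IPv4 addresses

Completion datatype: completion, which provides auto-complete suggestions

Token count datatype: token_count, which counts the number of tokens a string is analyzed into. The count is based on position increments, so tokens removed by analyzer filters do not reduce it.

mapper-murmur3: with this plugin installed, the murmur3 type computes a hash of the field's values at index time and stores it in the index.

Attachment datatype: with the mapper-attachments plugin, the attachment type indexes files in formats such as Microsoft Office, Open Document, ePub, and HTML.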

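Attributes supported by fields:

"store": whether the field value is stored separately. If it is not stored, the field can be searched but not retrieved as a stored field (the value is still available through _source). The default is false (not stored).

"index": true // the field is indexed and searchable; false // the field is not indexed and cannot be searched

"analyzer": "ik" // the analyzer to use; the default is the standard analyzer

"boost": 1.23 // field-level score boost; the default is 1.0

"ignore_above": 100 // strings longer than 100 characters are ignored and not indexed

"search_analyzer": "ik" // the analyzer used at search time; by default it is the same as analyzer. For example, index with standard + ngram and search with standard to implement auto-complete suggestions.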
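To see why different analyzers at index time and search time enable auto-complete, consider the following Python sketch. It is a simplified model, not ES's analysis chain: `standard_tokens`, `edge_ngrams`, `index_terms`, and `matches` are hypothetical helpers, and it uses edge n-grams (prefixes only), a narrower variant of the n-gram approach mentioned above.

```python
import re


def standard_tokens(text):
    """Rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def edge_ngrams(token, min_gram=1, max_gram=10):
    """All prefixes of the token between min_gram and max_gram characters long."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def index_terms(text):
    """Index-time analysis: standard tokens expanded into edge n-grams."""
    terms = set()
    for token in standard_tokens(text):
        terms.update(edge_ngrams(token))
    return terms


def matches(query, indexed_terms):
    """Search-time analysis: standard tokens only; every query token must be indexed."""
    return all(tok in indexed_terms for tok in standard_tokens(query))


title_terms = index_terms("java is good")
print(matches("ja", title_terms))   # True
print(matches("goo", title_terms))  # True
print(matches("av", title_terms))   # False
```

The prefixes are generated once at index time, and the query is left whole at search time, so a partially typed word such as ja matches only text containing a word that starts with it.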

Creating a mapping manually

PUT /lib
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0
  },
  "mappings": {
    "books": {
      "properties": {
        "title": {"type": "text"},
        "name": {"type": "text", "analyzer": "standard"},
        "publish_date": {"type": "date", "index": false},
        "price": {"type": "double"},
        "number": {"type": "integer"}
      }
    }
  }
}

This sets the type to books, the analyzer of the name field to standard, and publish_date to not be indexed. If a document later introduces a new field, that field is created with the default attributes, as follows:

PUT /lib/books/1
{
  "title": "java is good",
  "name": "java",
  "publish_date": "2019-01-12",
  "price": 23,
  "number": 46,
  "mark": "no"
}

Check the mapping:

GET lib/books/_mapping

{
  "lib": {
    "mappings": {
      "books": {
        "properties": {
          "mark": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "name": { "type": "text", "analyzer": "standard" },
          "number": { "type": "integer" },
          "price": { "type": "double" },
          "publish_date": { "type": "date", "index": false },
          "title": { "type": "text" }
        }
      }
    }
  }
}

The new mark field was mapped dynamically as text with a keyword sub-field, while the explicitly mapped fields kept their definitions.

Original article: https://www.cnblogs.com/javasl/p/11405368.html