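What is a mapping
A mapping is similar to a table schema definition in a database. Its main purposes are:
- Define the field names under an index
- Define each field's type, such as numeric, string, or boolean
- Define inverted-index options, such as whether a field is indexed and whether positions are recorded
Note that defining too many fields in an index can cause a mapping explosion, leading to out-of-memory errors and situations that are hard to recover from. The following settings limit this:
- index.mapping.total_fields.limit: the maximum number of fields in an index; the default is 1000
- index.mapping.depth.limit: the maximum depth of a field, measured as the number of inner objects; the default is 20
- index.mapping.nested_fields.limit: the maximum number of nested fields in an index; the default is 50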
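As a rough illustration of how these three limits could be set when an index is created, the sketch below builds a settings body as a plain Python dict. The helper name `field_limit_settings` is hypothetical; only the three setting keys come from the text above, and the body would still have to be sent to Elasticsearch with an HTTP client of your choice.

```python
import json


def field_limit_settings(total_fields=1000, depth=20, nested_fields=50):
    """Build an index-creation body that sets the three mapping limits.

    The defaults mirror the default values listed above.
    """
    return {
        "settings": {
            "index.mapping.total_fields.limit": total_fields,
            "index.mapping.depth.limit": depth,
            "index.mapping.nested_fields.limit": nested_fields,
        }
    }


# Raise only the total field limit and keep the other two at their defaults.
body = field_limit_settings(total_fields=2000)
print(json.dumps(body, indent=2))
```

Raising a limit only moves the threshold; reducing the number of mapped fields is usually the safer fix.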
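Mapping data types
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
Basic data types
Type | Description |
text | Used for full-text indexing. Values are split into tokens by an analyzer, and the tokens are used to build the index |
keyword | Not analyzed; the whole value is indexed as a single term |
long | Signed 64-bit integer, -2^63 ~ 2^63 - 1 |
integer | Signed 32-bit integer, -2^31 ~ 2^31 - 1 |
short | Signed 16-bit integer, -32768 ~ 32767 |
byte | Signed 8-bit integer, -128 ~ 127 |
double | 64-bit IEEE 754 floating-point number |
float | 32-bit IEEE 754 floating-point number |
half_float | 16-bit IEEE 754 floating-point number |
boolean | true, false |
date | For supported formats, see https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html |
binary | The value is treated as a base64-encoded string; by default it is not stored and is not searchable |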
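The integer ranges in the table can be turned into a small helper that picks the narrowest integer type able to hold a known value range. This is only a sketch for reasoning about the table; `smallest_integer_type` is a hypothetical name, not an Elasticsearch API.

```python
# (type name, minimum, maximum) for the signed integer types in the table above
INTEGER_TYPES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("integer", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
]


def smallest_integer_type(lowest, highest):
    """Return the narrowest integer type whose range covers [lowest, highest]."""
    for name, type_min, type_max in INTEGER_TYPES:
        if type_min <= lowest and highest <= type_max:
            return name
    raise ValueError("range does not fit in a signed 64-bit integer")


print(smallest_integer_type(0, 120))    # fits in byte
print(smallest_integer_type(0, 40000))  # too large for short
```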
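Range data types
A range field represents a range of values rather than a single value. For example, if age is stored as 10~20, a range query with {"gte":5,"lte":20} matches the document.
Supported type | Description |
integer_range | A range of signed 32-bit integers |
float_range | A range of 32-bit IEEE 754 floating-point values |
long_range | A range of signed 64-bit integers |
double_range | A range of 64-bit IEEE 754 floating-point values |
date_range | A range of dates, stored as unsigned 64-bit integer timestamps (unit: milliseconds) |
ip_range | A range of IP addresses, given as IPv4 or IPv6 strings |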
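Optional parameter:
relation sets the match mode for range queries
INTERSECTS The default mode: a document matches as soon as the query range and the field range overlap
WITHIN The field range must lie entirely inside the query range, i.e. the field range is a subset of the query range
CONTAINS The opposite of WITHIN: only documents whose field range entirely contains the query range match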
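The three relation modes can be expressed as simple interval comparisons. The sketch below models both the stored field value and the query as closed intervals (`gte`/`lte`); the function name `range_matches` is made up for illustration and ignores open bounds (`gt`/`lt`).

```python
def range_matches(field, query, relation="intersects"):
    """Decide whether a stored range matches a query range.

    field and query are (low, high) tuples with inclusive bounds.
    """
    field_low, field_high = field
    query_low, query_high = query
    if relation == "intersects":
        # Any overlap between the two ranges is enough.
        return field_low <= query_high and query_low <= field_high
    if relation == "within":
        # The stored range must be a subset of the query range.
        return query_low <= field_low and field_high <= query_high
    if relation == "contains":
        # The stored range must cover the whole query range.
        return field_low <= query_low and query_high <= field_high
    raise ValueError(f"unknown relation: {relation}")


age = (10, 20)
print(range_matches(age, (5, 20)))               # overlap
print(range_matches(age, (5, 20), "within"))     # 10~20 lies inside 5~20
print(range_matches(age, (5, 20), "contains"))   # 10~20 does not cover 5~20
```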
Test
1. Create the index
put:127.0.0.1:9200/range_test
{ "mappings": { "_doc": { "properties": { "count": { "type": "integer_range" }, "create_date": { "type": "date_range", "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis" } } } } }
2. Add test data
post:127.0.0.1:9200/range_test/_doc/1
{ "count" : { "gte" : 1, "lte" : 100 }, "create_date" : { "gte" : "2019-02-01 12:00:00", "lte" : "2019-03-30" } }
3. Test searches
get:127.0.0.1:9200/range_test/_doc/_search
{ "query":{ "term":{ "count":5 } } }
5 lies between 1 and 100, so the document is returned.
A range query on create_date with relation within should also return it, because the stored range lies inside the query range:
{ "query" : { "range" : { "create_date" : { "gte" : "2019-02-01", "lte" : "2019-03-30", "relation" : "within" } } } }
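Complex data types
Array
Arrays of strings, numbers, or objects are supported; all elements of an array must have the same data type.
Object
{ "name": "小明", "user_info": { "student_id": 111, "class_info": { "class_name": "1年级" } } }
How it is indexed (inner objects are flattened into dotted field names):
{ "name":"小明", "user_info.student_id":111, "user_info.class_info.class_name":"1年级" }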
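The flattening of inner objects into dotted field names can be mimicked with a short recursive function. This is a simplified model of the idea rather than Elasticsearch's actual implementation; `flatten` is a hypothetical helper.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {
    "name": "小明",
    "user_info": {"student_id": 111, "class_info": {"class_name": "1年级"}},
}
print(flatten(doc))
```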
Nested
Allows each element of an array of objects to be indexed and queried independently.
Query API: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
Aggregation API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html
Sorting API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html
Inner hits and highlighting: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html#nested-inner-hits
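Difference between Nested and Object
put:127.0.0.1:9200/object_test/_doc/1 (subjects is mapped as object by default)
{ "user_name":"小明", "subjects":[ {"subject_name":"地理","id":1}, {"subject_name":"英语","id":2} ] }
Search for a subject whose name is 英语 and whose id is 1:
{ "query":{ "bool":{ "must":[ {"match":{"subjects.subject_name":"英语"}}, {"match":{"subjects.id":"1"}} ] } } }
No single subject has both subject_name 英语 and id 1, so one would expect no hits, but in testing the document is returned.
This is because the array of objects is flattened and indexed in the following form, which loses the association between the fields of each element:
{ "user_name":"小明", "subjects.subject_name":["英语","地理"], "subjects.id":[1,2] }
If subjects is mapped as nested and searched with a nested query, this false match no longer occurs.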
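The difference can be reproduced without a cluster. In the sketch below, `matches_as_object` checks each condition against the flattened value lists, while `matches_as_nested` requires one element to satisfy both conditions. Both helpers are hypothetical simplifications (exact comparison instead of analyzed matching), and `nested_query` only shows the general shape of a nested query body.

```python
def matches_as_object(subjects, name, subject_id):
    """Object mapping: conditions are checked against flattened value lists."""
    names = {s["subject_name"] for s in subjects}
    ids = {s["id"] for s in subjects}
    return name in names and subject_id in ids


def matches_as_nested(subjects, name, subject_id):
    """Nested mapping: a single element must satisfy every condition."""
    return any(
        s["subject_name"] == name and s["id"] == subject_id for s in subjects
    )


subjects = [
    {"subject_name": "地理", "id": 1},
    {"subject_name": "英语", "id": 2},
]
print(matches_as_object(subjects, "英语", 1))  # cross-element false match
print(matches_as_nested(subjects, "英语", 1))  # no single element matches

# Shape of the nested query that would replace the bool query above.
nested_query = {
    "query": {
        "nested": {
            "path": "subjects",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"subjects.subject_name": "英语"}},
                        {"match": {"subjects.id": 1}},
                    ]
                }
            },
        }
    }
}
```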
Geo data types
geo_point
Accepted formats:
Object: "location": {"lat": 41.12, "lon": -71.34}
String (lat,lon order): "location": "41.12,-71.34"
Geohash: "location": "drm3btev3e86"
Array (lon, lat order): "location": [ -71.34, 41.12 ]
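Because the string form puts latitude first while the array form puts longitude first, it is easy to swap the two by accident. The sketch below normalizes the object, string, and array forms to a `(lat, lon)` tuple; `to_lat_lon` is a hypothetical helper, and geohash decoding is deliberately left out.

```python
def to_lat_lon(value):
    """Normalize a geo_point value to a (lat, lon) tuple of floats."""
    if isinstance(value, dict):
        return float(value["lat"]), float(value["lon"])
    if isinstance(value, str):
        if "," not in value:
            # A string without a comma would be a geohash, not handled here.
            raise ValueError("geohash strings are not handled in this sketch")
        lat, lon = value.split(",")
        return float(lat), float(lon)
    if isinstance(value, (list, tuple)):
        # Arrays use [lon, lat] order, the reverse of the string form.
        lon, lat = value
        return float(lat), float(lon)
    raise TypeError(f"unsupported geo_point value: {value!r}")


print(to_lat_lon({"lat": 41.12, "lon": -71.34}))
print(to_lat_lon("41.12,-71.34"))
print(to_lat_lon([-71.34, 41.12]))
```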
geo_shape
Query API: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-bounding-box-query.html
Specialized data types
- ip: stores IP addresses
- completion: provides auto-completion
- token_count: stores the number of tokens in a string
- murmur3: stores a hash of the field value
- percolator
Mapping settings
A complete mapping example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "ik_pinyin_analyzer": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["my_pinyin"] # custom filter
        },
        "pinyin_analyzer": { "tokenizer": "shopmall_pinyin" },
        "first_py_letter_analyzer": { "tokenizer": "first_py_letter" },
        "full_pinyin_letter_analyzer": { "tokenizer": "full_pinyin_letter" },
        "onlyOne_analyzer": { "tokenizer": "onlyOne_pinyin" }
      },
      "tokenizer": { # custom tokenizers
        "onlyOne_pinyin": {
          "type": "pinyin",
          "keep_separate_first_letter": "false",
          "keep_first_letter": "false"
        },
        "shopmall_pinyin": {
          "keep_joined_full_pinyin": "true",
          "keep_first_letter": "true",
          "keep_separate_first_letter": "false",
          "lowercase": "true",
          "type": "pinyin",
          "limit_first_letter_length": "16",
          "keep_original": "true",
          "keep_full_pinyin": "true",
          "keep_none_chinese_in_joined_full_pinyin": "true"
        },
        "first_py_letter": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_full_pinyin": false,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "trim_whitespace": true,
          "keep_none_chinese_in_first_letter": false,
          "none_chinese_pinyin_tokenize": false,
          "keep_none_chinese": true,
          "keep_none_chinese_in_joined_full_pinyin": true
        },
        "full_pinyin_letter": {
          "type": "pinyin",
          "keep_separate_first_letter": false,
          "keep_full_pinyin": false,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "keep_first_letter": false,
          "keep_none_chinese_in_first_letter": false,
          "none_chinese_pinyin_tokenize": false,
          "keep_none_chinese": true,
          "keep_joined_full_pinyin": true,
          "keep_none_chinese_in_joined_full_pinyin": true
        }
      },
      "filter": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_joined_full_pinyin": true,
          "keep_separate_first_letter": true
        }
      }
    }
  },
  "mappings": {
    "doc": { # type name
      "properties": { # fields of the mapping
        "productName": { # field name
          "type": "text", # field type
          "analyzer": "ik_pinyin_analyzer", # analyzer
          "fields": { # sub-fields with their own analyzer; query them as productName.keyword_once_pinyin
            "keyword_once_pinyin": {
              "type": "text",
              "analyzer": "onlyOne_analyzer" # the custom analyzer for this sub-field
            }
          }
        },
        "skuNames": {
          "type": "text",
          "analyzer": "ik_pinyin_analyzer",
          "fields": {
            "keyword_once_pinyin": { "type": "text", "analyzer": "onlyOne_analyzer" }
          }
        },
        "regionCode": { "type": "keyword" },
        "productNameSuggester": { # search suggestions in ES 6.x
          "type": "completion",
          "fields": {
            "pinyin": { "type": "completion", "analyzer": "pinyin_analyzer" },
            "keyword_pinyin": { "type": "completion", "analyzer": "full_pinyin_letter_analyzer" },
            "keyword_first_py": { "type": "completion", "analyzer": "first_py_letter_analyzer" }
          }
        },
        "info": { # parent/child (join) setup in ES 6
          "type": "join",
          "relations": {
            "md_product": ["sl_customer_character_order_list", "ic_product_store_account", "sl_customer_product_setting"]
          }
        }
      }
    }
  }
}
Creating a mapping
put:http://127.0.0.1:9200/db
{
  "mappings": {
    "product": {// type
      "properties": {
        "productName": {// field name
          "type": "text"// data type
        }
      }
    }
  }
}
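Mapping parameters
Parameter | Description |
analyzer | The analyzer to use. Default: standard |
boost | Field weight, default 1. When querying through the _all field, matches are weighted according to this value |
dynamic | Controls how new fields are handled: true (default) adds them to the mapping; false ignores them, so they stay in _source but are not indexed; strict rejects documents that contain unmapped fields |
index | Controls whether the field is indexed (searchable): true or false. Setting it to false saves disk space |
Reference: https://www.jianshu.com/p/e8a9feea683c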
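To make the three values of dynamic concrete, the sketch below simulates what happens to an incoming document under each setting. It is a toy model under the semantics described in the table, not Elasticsearch code; `apply_dynamic` and its return shape are invented for illustration.

```python
def apply_dynamic(mapped_fields, doc, dynamic="true"):
    """Simulate the dynamic setting for one incoming document.

    Returns (searchable_fields, mapped_fields_after_indexing).
    """
    known = set(mapped_fields)
    unknown = [field for field in doc if field not in known]
    if dynamic == "strict" and unknown:
        # strict: the whole document is rejected.
        raise ValueError(f"strict mapping rejects unmapped fields: {unknown}")
    if dynamic == "true":
        # true: new fields are added to the mapping automatically.
        known |= set(unknown)
    # false: unknown fields are kept in the source but are not searchable.
    searchable = {field: value for field, value in doc.items() if field in known}
    return searchable, known


doc = {"productName": "phone", "color": "red"}
print(apply_dynamic({"productName"}, doc, "true"))
print(apply_dynamic({"productName"}, doc, "false"))
```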
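Adding a field to a mapping
Once an Elasticsearch mapping has been created, you can only add new fields; fields that are already mapped cannot be changed.
put http://127.0.0.1:9200/{indexName}/_mapping/{typeName}
{
  "properties": {
    "productSortItemIds": { # field name
      "type": "text", # data type (the legacy "string" type was replaced by text/keyword in 5.x)
      "store": true, # store the field value separately from _source
      "analyzer": "comma", # index-time analyzer
      "search_analyzer": "comma" # search-time analyzer
    }
  }
}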
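Changing a field's type
Elasticsearch does not support changing the type of an existing field. The inverted index is built by Lucene and cannot be modified once generated, and changing the type in place would make the existing data unsearchable. To change a type, create a new index with the desired mapping and rebuild it with reindex. Adding new fields is not affected.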
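A minimal sketch of the reindex step, assuming a new index with the corrected mapping has already been created. The index names `blogs2` and `blogs2_v2` and the helper `reindex_body` are placeholders; the dict is the body that would be sent as POST _reindex.

```python
import json


def reindex_body(source_index, dest_index):
    """Build the request body for POST _reindex."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }


# Assumed workflow: 1) create blogs2_v2 with the new mapping,
# 2) copy the documents across, 3) point clients at the new index.
body = reindex_body("blogs2", "blogs2_v2")
print(json.dumps(body, indent=2))
```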
Adding an analyzer
post /{index}/_close # close the index
put /{index}/_settings # add an analyzer that splits on commas
{
  "settings": {
    "analysis": {
      "analyzer": {
        "comma": { "type": "pattern", "pattern": "," }
      }
    }
  }
}
post /{index}/_open # reopen the index
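Viewing the current mapping of an index
http://127.0.0.1:9200/blogs2/product/_mapping
Leave off /_mapping and request the index alone to see the full index definition, including its settings.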
{
"blogs2": {
"mappings": {
"product": {
"properties": {
"price": {
"type": "long"
},
"productName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"remark": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"tags": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
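Custom mappings
A custom mapping lets you define data types explicitly. For example, if a number is mapped as text, greater-than and less-than range searches on it will not work. It also lets you state which full-text fields should be analyzed and which should not.
Exact values and full text
Elasticsearch supports many data types, but they fall into two broad categories.
Exact values are values that can be matched precisely, such as an id or a date; an equality comparison finds the data we want.
Full text requires similarity matching and returns the best matches.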
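The contrast can be sketched with a toy matcher. Here `analyze` is a very rough stand-in for an analyzer (lowercase, split on non-word characters), and the functions only answer yes or no, whereas real full-text search also ranks hits by relevance. All names are hypothetical.

```python
import re


def analyze(text):
    """Toy analyzer: lowercase the text and split it on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def exact_match(stored, query):
    """Exact value: the whole stored value must equal the query."""
    return stored == query


def full_text_match(stored, query):
    """Full text: match if the analyzed texts share at least one token."""
    return bool(set(analyze(stored)) & set(analyze(query)))


print(exact_match("2019-01-01", "2019-01-01"))
print(exact_match("Quick Brown Fox", "quick"))
print(full_text_match("Quick Brown Fox", "quick"))
```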
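Template Mapping
An index template helps avoid forgetting core settings when creating a new index. When an index is created, the settings and mappings of every matching template are applied to it automatically, according to priority.
Priority: settings given in the create request > templates (applied in ascending order, so a higher order overrides a lower one) > defaults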
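The priority rule can be modelled as a series of dict updates: defaults first, then matching templates in ascending order, then the settings from the create request. The sketch uses two templates shaped like legacy _template bodies; `resolve_settings` is a hypothetical helper, and `fnmatchcase` only approximates Elasticsearch's wildcard matching.

```python
from fnmatch import fnmatchcase


def resolve_settings(index_name, templates, explicit=None, defaults=None):
    """Merge settings in priority order: defaults < templates by order < explicit."""
    resolved = dict(defaults or {})
    matching = [
        template
        for template in templates
        if any(fnmatchcase(index_name, p) for p in template["index_patterns"])
    ]
    # Lower order is applied first, so a higher order overrides it.
    for template in sorted(matching, key=lambda t: t["order"]):
        resolved.update(template["settings"])
    resolved.update(explicit or {})
    return resolved


TEMPLATES = [
    {"index_patterns": ["*"], "order": 0,
     "settings": {"number_of_shards": 1, "number_of_replicas": 1}},
    {"index_patterns": ["test*"], "order": 1,
     "settings": {"number_of_shards": 1, "number_of_replicas": 2}},
]

print(resolve_settings("testtemplate", TEMPLATES))
print(resolve_settings("testmy", TEMPLATES, explicit={"number_of_replicas": 5}))
```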
# A numeric string is mapped as text; a date string is mapped as date
PUT ttemplate/_doc/1
{ "someNumber":"1", "someDate":"2019/01/01" }
GET ttemplate/_mapping

# Create a default template
PUT _template/template_default
{
  "index_patterns": ["*"],
  "order" : 0,
  "version": 1,
  "settings": { "number_of_shards": 1, "number_of_replicas":1 }
}

PUT /_template/template_test
{
  "index_patterns" : ["test*"],
  "order" : 1,
  "settings" : { "number_of_shards": 1, "number_of_replicas" : 2 },
  "mappings" : { "date_detection": false, "numeric_detection": true }
}

# View template information
GET /_template/template_default
GET /_template/temp*

# Write new data to an index whose name starts with test
PUT testtemplate/_doc/1
{ "someNumber":"1", "someDate":"2019/01/01" }
GET testtemplate/_mapping
GET testtemplate/_settings

PUT testmy
{ "settings":{ "number_of_replicas":5 } }
PUT testmy/_doc/1
{ "key":"value" }
GET testmy/_settings

DELETE testmy
DELETE /_template/template_default
DELETE /_template/template_test

# Dynamic mapping based on detected type and field name
DELETE my_index
PUT my_index/_doc/1
{ "firstName":"Ruan", "isVIP":"true" }
GET my_index/_mapping

DELETE my_index
PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_boolean": {
          "match_mapping_type": "string",
          "match":"is*",
          "mapping": { "type": "boolean" }
        }
      },
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "mapping": { "type": "keyword" }
        }
      }
    ]
  }
}

DELETE my_index
# Matching on the field path
PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "full_name": {
          "path_match": "name.*",
          "path_unmatch": "*.middle",
          "mapping": { "type": "text", "copy_to": "full_name" }
        }
      }
    ]
  }
}
PUT my_index/_doc/1
{ "name": { "first": "John", "middle": "Winston", "last": "Lennon" } }
GET my_index/_search?q=full_name:John
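The dynamic template rules above can be traced by hand with a small simulation. The sketch assumes templates are tried in order and the first one whose conditions all hold is used; `pick_mapping` is a hypothetical helper, `fnmatchcase` stands in for Elasticsearch's simple wildcard matching, and only the conditions used in the examples are modelled.

```python
from fnmatch import fnmatchcase


def pick_mapping(dynamic_templates, path, detected_type):
    """Return (template_name, mapping) for the first matching template, or None."""
    field_name = path.rsplit(".", 1)[-1]
    for template in dynamic_templates:
        for name, rule in template.items():
            if rule.get("match_mapping_type", detected_type) != detected_type:
                continue
            if "match" in rule and not fnmatchcase(field_name, rule["match"]):
                continue
            if "path_match" in rule and not fnmatchcase(path, rule["path_match"]):
                continue
            if "path_unmatch" in rule and fnmatchcase(path, rule["path_unmatch"]):
                continue
            return name, rule["mapping"]
    return None


BY_NAME = [
    {"strings_as_boolean": {"match_mapping_type": "string", "match": "is*",
                            "mapping": {"type": "boolean"}}},
    {"strings_as_keywords": {"match_mapping_type": "string",
                             "mapping": {"type": "keyword"}}},
]
BY_PATH = [
    {"full_name": {"path_match": "name.*", "path_unmatch": "*.middle",
                   "mapping": {"type": "text", "copy_to": "full_name"}}},
]

print(pick_mapping(BY_NAME, "isVIP", "string"))
print(pick_mapping(BY_NAME, "firstName", "string"))
print(pick_mapping(BY_PATH, "name.first", "string"))
print(pick_mapping(BY_PATH, "name.middle", "string"))
```

In this model name.middle matches no template, so it is not copied to full_name and falls back to the default dynamic mapping.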