
3 - Indices, Mappings, and Analyzers

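I. Index mapping

1. A search run from Kibana returns only 10 hits by default (the default value of the size parameter).

2. Dynamic mapping: if documents are written to an index that has no mapping defined, Elasticsearch creates the mapping automatically from the content of the documents.

   Explicit mapping: the mapping for the index is written by hand.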

PUT product
{
  "mappings": {}
}

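II. nested

When a field holds an array of objects (each with its own properties) and every object must be matched independently, map the field as nested and search it with a nested query.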
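The reason for the nested type can be shown with a toy Python model. This is not Elasticsearch code, only a sketch of the idea; the comments field and its sample values are made up for illustration. With the default object mapping, the inner fields of an object array are flattened into one value list per field, so two conditions can be satisfied by two different objects. The nested type keeps each object separate.

```python
# Toy model of object vs. nested indexing; this is not Elasticsearch code.
# The "comments" field and its sample values are made up for illustration.
doc = {
    "comments": [
        {"author": "alice", "stars": 5},
        {"author": "bob", "stars": 1},
    ]
}


def flatten(objects):
    """Object mapping: each inner field becomes one flat list of values."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def match_flattened(objects, author, stars):
    """Each condition is checked against the flat lists independently."""
    flat = flatten(objects)
    return author in flat.get("author", []) and stars in flat.get("stars", [])


def match_nested(objects, author, stars):
    """Nested mapping: both conditions must hold inside the same object."""
    return any(o.get("author") == author and o.get("stars") == stars for o in objects)


# alice never gave 1 star, yet the flattened model reports a match.
flattened_hit = match_flattened(doc["comments"], "alice", 1)
nested_hit = match_nested(doc["comments"], "alice", 1)
print(flattened_hit, nested_hit)
```

In this model the flattened lookup matches "alice with 1 star" even though no single comment says so, while the nested lookup does not.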

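III. The difference between term and keyword

term is a query type: an exact-match query that does not analyze the search text.

keyword is a field data type: the field value is not analyzed and is indexed as a single term.

Example: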

Index:
PUT product
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "standard",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

Queries:
GET product/_search
{
  "query": {
    "term": {
      "name": {
        "value": "xiaomi phone"
      }
    }
  }
}
This query returns no hits. A term query does not analyze its input, so "xiaomi phone" is looked up as one single term, whereas the text field was analyzed at index time and the index holds the separate terms xiaomi and phone.

GET product/_search
{
  "query": {
    "term": {
      "name.keyword": {
        "value": "xiaomi phone"
      }
    }
  }
}
This query returns hits. The term query again treats "xiaomi phone" as one single term, and because name.keyword is of type keyword, the stored value "xiaomi phone" was also indexed as one single term.

GET product/_search
{
  "query": {
    "match": {
      "name": {
        "query": "xiaomi phone"
      }
    }
  }
}
This query also returns hits. It uses match instead of term, so the query text is analyzed. The name field is text rather than keyword, so it was tokenized at index time, and the tokens produced from the query text match the tokens in the index.
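The three outcomes above can be reproduced with a toy Python model of an inverted index. This is a sketch, not Elasticsearch code: the analyze function is only a rough stand-in for the standard analyzer (lowercase, split on whitespace), and the index holds a single document whose name is "xiaomi phone".

```python
# Toy model of term vs. match on text and keyword fields; not Elasticsearch code.
def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


stored = "xiaomi phone"
index = {
    "name": set(analyze(stored)),  # text field holds the terms xiaomi and phone
    "name.keyword": {stored},      # keyword field holds the whole value as one term
}


def term_query(field, value):
    """term: the value is looked up as-is, without analysis."""
    return value in index[field]


def match_query(field, text):
    """match: the text is analyzed first; any matching token counts as a hit."""
    return any(token in index[field] for token in analyze(text))


results = {
    "term on name": term_query("name", "xiaomi phone"),
    "term on name.keyword": term_query("name.keyword", "xiaomi phone"),
    "match on name": match_query("name", "xiaomi phone"),
}
print(results)
```

Only the first lookup fails, because the unanalyzed string "xiaomi phone" is not one of the terms stored for the text field.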

IV. Dynamic templates

Define the template:
PUT dynamic_template
{
  "mappings": {
    "dynamic_templates": [
      {
        "integers": {
          "match_mapping_type": "long",
          "mapping": {
            "type": "integer"
          }
        }
      },
      {
        "longs_as_strings": {
          "match_mapping_type": "string",
          "match": "long_*",
          "unmatch": "*_text",
          "mapping": {
            "type": "long"
          }
        }
      }
    ]
  }
}

The integers rule maps every value detected as long to integer.
The longs_as_strings rule maps every value detected as string to long when the field name starts with long_ and does not end with _text.

Index a document:
POST dynamic_template/_doc
{
  "age": 123123,
  "long_asssss": "456456",
  "asasa_text": "789789"
}

View the mapping:
GET dynamic_template/_mapping

Result:
{
  "dynamic_template" : {
    "mappings" : {
      "dynamic_templates" : [
        {
          "integers" : {
            "match_mapping_type" : "long",
            "mapping" : {
              "type" : "integer"
            }
          }
        },
        {
          "longs_as_strings" : {
            "match" : "long_*",
            "unmatch" : "*_text",
            "match_mapping_type" : "string",
            "mapping" : {
              "type" : "long"
            }
          }
        }
      ],
      "properties" : {
        "age" : {
          "type" : "integer"
        },
        "asasa_text" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "long_asssss" : {
          "type" : "long"
        }
      }
    }
  }
}
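How the three fields ended up with these types can be traced with a small Python sketch. It is a toy model, not Elasticsearch code: it assumes that templates are tried in order and the first match wins, that match and unmatch are simple wildcard patterns, and it only covers the long and string detection cases used in the example. The names TEMPLATES, detected_type and resolve are hypothetical.

```python
from fnmatch import fnmatchcase

# Toy model of dynamic template resolution; not Elasticsearch code.
TEMPLATES = [
    {"name": "integers", "match_mapping_type": "long", "type": "integer"},
    {"name": "longs_as_strings", "match_mapping_type": "string",
     "match": "long_*", "unmatch": "*_text", "type": "long"},
]

# Default dynamic mapping when no template applies (text also gets a keyword sub-field).
DEFAULTS = {"long": "long", "string": "text"}


def detected_type(value):
    """JSON type detection, reduced to the two cases used in the example."""
    if isinstance(value, int) and not isinstance(value, bool):
        return "long"
    if isinstance(value, str):
        return "string"
    raise ValueError("type not covered by this sketch")


def resolve(field, value):
    """Return the mapped type for a new field: the first matching template wins."""
    kind = detected_type(value)
    for template in TEMPLATES:
        if template["match_mapping_type"] != kind:
            continue
        if "match" in template and not fnmatchcase(field, template["match"]):
            continue
        if "unmatch" in template and fnmatchcase(field, template["unmatch"]):
            continue
        return template["type"]
    return DEFAULTS[kind]


doc = {"age": 123123, "long_asssss": "456456", "asasa_text": "789789"}
mapping = {field: resolve(field, value) for field, value in doc.items()}
print(mapping)
```

In this model asasa_text falls through to the default text mapping because its name does not start with long_; the unmatch pattern would only matter for a name such as long_x_text.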

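V. reindex

Copies the documents of one index into another. The source index must exist. The destination index does not have to exist in advance; by default, if it is missing it is created automatically with dynamic mapping.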

POST _reindex
{
  "source": {
    "index": "order"
  },
  "dest": {
    "index": "order_new"
  }
}

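Note: the mappings and settings of the source index are not copied to the new index. To keep them, create the destination index with the desired mappings and settings before running _reindex.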

VI. Index templates

PUT _index_template/template_1
{
  "index_patterns": ["te*", "bar*"],
  "template": {
    "settings": {
      "number_of_shards": 1
    },
    "mappings": {
      "_source": {
        "enabled": true
      },
      "properties": {
        "host_name": {
          "type": "keyword"
        },
        "created_at": {
          "type": "date",
          "format": "EEE MMM dd HH:mm:ss Z yyyy"
        }
      }
    },
    "aliases": {
      "mydata": { }
    }
  },
  "priority": 500,
  "composed_of": ["component_template1", "runtime_component_template"],
  "version": 3,
  "_meta": {
    "description": "my custom"
  }
}

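VII. Analyzers

Chinese analyzers are not covered in the exam.

1. tokenizer: splits the text into individual terms according to some rule (for example standard, whitespace).

2. filter: processes the resulting terms, for example converting case or trimming whitespace.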

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "a ,b => lyc"
          ]
        }
      }
    }
  }
}

The rule "a ,b => lyc" is a one-way replacement: a and b are both treated as lyc, but lyc is not treated as a or b.
The other form, "a, b, lyc", makes the three terms equivalent: searching for any one of them also finds the other two.
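The difference between the two synonym rule forms can be illustrated with a toy Python model. It is not Elasticsearch code and handles only single-word synonyms; parse_rule and apply_synonyms are hypothetical helper names. The equivalent form is modeled as expanding every term to the whole group.

```python
# Toy model of the two synonym rule forms; not Elasticsearch code.
def parse_rule(rule):
    """Turn one rule into a dict: input token -> list of output tokens."""
    if "=>" in rule:
        # Explicit mapping: every term on the left is replaced by the right side.
        left, right = rule.split("=>")
        outputs = [t.strip() for t in right.split(",")]
        return {t.strip(): outputs for t in left.split(",")}
    # Equivalent synonyms: every term expands to the whole group.
    group = [t.strip() for t in rule.split(",")]
    return {t: group for t in group}


def apply_synonyms(tokens, rule):
    """Replace or expand each token according to the rule; others pass through."""
    table = parse_rule(rule)
    out = []
    for token in tokens:
        out.extend(table.get(token, [token]))
    return out


one_way_a = apply_synonyms(["a"], "a ,b => lyc")      # a is rewritten to lyc
one_way_lyc = apply_synonyms(["lyc"], "a ,b => lyc")  # lyc is left untouched
equivalent = apply_synonyms(["a"], "a, b, lyc")       # a expands to the whole group
print(one_way_a, one_way_lyc, equivalent)
```

With the one-way rule, a search for lyc never produces a or b, which is why lyc is not equivalent to them.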

3. char_filter: rules that replace or remove characters before the text reaches the tokenizer; custom rules can be defined.

PUT my_index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "my_char_filter": {
          "type": "mapping",
          "mappings": [
            "` => 0",
            "! => 1"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": [
            "my_char_filter"
          ]
        }
      }
    }
  }
}

Verify the analyzer. The result is the single token "my 0 1":
POST my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "my ` !"
}
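The order of operations in my_analyzer (character filter first, tokenizer second) can be mirrored in a few lines of Python. This is a toy model, not Elasticsearch code, and it only handles single-character mappings like the two used above; the function names are hypothetical.

```python
# Toy model of a mapping char_filter followed by the keyword tokenizer;
# not Elasticsearch code. Only single-character mappings are handled.
MAPPINGS = {"`": "0", "!": "1"}


def mapping_char_filter(text, mappings):
    """Replace every mapped character before tokenization."""
    return "".join(mappings.get(ch, ch) for ch in text)


def keyword_tokenizer(text):
    """The keyword tokenizer emits the whole input as one token."""
    return [text]


def my_analyzer(text):
    """Run the char filter first, then hand the result to the tokenizer."""
    return keyword_tokenizer(mapping_char_filter(text, MAPPINGS))


tokens = my_analyzer("my ` !")
print(tokens)  # ['my 0 1']
```

Because the keyword tokenizer does not split on whitespace, the whole filtered string comes back as one token.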

If a custom analyzer comes up in the exam, go to Text analysis -> Configure text analysis -> Create a custom analyzer in the official documentation and copy the example from there.

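Author: http://cnblogs.com/lyc-code/
Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise this notice must be kept and a link to the original article must be shown prominently on the page; otherwise the right to pursue legal liability is reserved.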


Original article: https://www.cnblogs.com/lyc-code/p/15463474.html