# ElasticSearch (5): Mapping and Common Field Types

Course link: 《Elasticsearch核心技术与实战》


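## What is Mapping

* A Mapping is similar to a schema definition in a database. It serves to:
  - define the names of the fields in an index;
  - define each field's data type, such as string, number, date, or boolean;
  - configure how each field is handled in the inverted index (Analyzed or Not Analyzed, and which Analyzer to use).
* The Mapping converts a JSON document into the flat format that Lucene requires.
* A Mapping belongs to a Type of an index:
  - every document belongs to a Type;
  - each Type has one Mapping definition;
  - starting with 7.0, the type no longer needs to be specified in the Mapping definition.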
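
The "flat format" point can be made concrete with a small sketch. The Python below only illustrates the idea that inner object fields end up addressed by dotted paths such as `user.name`; the `flatten` helper and the sample document are hypothetical and are not Elasticsearch or Lucene code.

```python
def flatten(doc, prefix=""):
    """Flatten a nested JSON-like dict into {dotted.path: value} pairs."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path.
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


# Hypothetical sample document, used only for illustration.
sample = {"user": {"name": "Chan", "age": 19}, "isVip": False}
flat_sample = flatten(sample)
print(flat_sample)
```

Each leaf value keeps its own type, which is why the Mapping can describe every dotted path as an individual field.
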

## Field data types

* Simple types
  - Text
  - Date
  - Integer / Long / Float
  - Boolean
  - IPv4 & IPv6
  - Keyword
* Complex types
  - Object type
  - Nested type
* Special types (such as geographic information)
  - geo_point & geo_shape, percolator
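
## What is Dynamic Mapping

* When a document is written to an index that does not exist, the index is created automatically.
* With the Dynamic Mapping mechanism there is no need to define the Mapping by hand; Elasticsearch infers the field types from the document's content.
* The inference is sometimes wrong, for example for geographic location data.
* When a field's type is set incorrectly, some features stop working properly, such as Range queries.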
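
## Automatic type detection

JSON type|Elasticsearch type
---|---
String|Date if it matches a date format; Float or Long if it matches a number (this option is off by default); otherwise Text, with an added keyword sub-field
Boolean|Boolean
Floating-point number|Float
Integer|Long
Object|Object
Array|Determined by the type of the first non-null value
Null|Ignored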

```
# Write a document, then view the Mapping
PUT mapping_test/_doc/1
{
  "firstName": "Chan",
  "loginDate": "2018-07-24T10:29:48.103Z",
  "uid": "123",
  "isVip": false,
  "isAdmin": "true",
  "age": 19,
  "heigh": 180
}

# View the Dynamic Mapping
GET mapping_test/_mapping

# Delete the index
DELETE mapping_test
```

Dynamic Mapping response:

    {
      "mapping_test" : {
        "mappings" : {
          "properties" : {
            "age" : {
              "type" : "long" # "age": 19 is mapped to long
            },
            "firstName" : {
              "type" : "text", # "firstName": "Chan" is mapped to text, with an added keyword sub-field
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "heigh" : {
              "type" : "long" # "heigh": 180 is mapped to long
            },
            "isAdmin" : {
              "type" : "text", # "isAdmin": "true" is a string, so it is mapped to text with an added keyword sub-field
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "isVip" : {
              "type" : "boolean" # "isVip": false is mapped to boolean
            },
            "loginDate" : {
              "type" : "date" # "loginDate": "2018-07-24T10:29:48.103Z" is mapped to date
            },
            "uid" : {
              "type" : "text", # "uid": "123" is mapped to text with an added keyword sub-field, because numeric detection (to Float or Long) is off by default
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }
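
The detection rules in the table can be approximated in a few lines of Python. This is a sketch for study purposes, not Elasticsearch's implementation: `infer_field` and `infer_mapping` are hypothetical names, and the date check is reduced to a single ISO 8601 style regular expression, which is an assumption of this sketch.

```python
import re

# Simplified date detection (an assumption of this sketch): only ISO 8601
# style strings such as "2018-07-24" or "2018-07-24T10:29:48.103Z" count.
ISO_DATE = re.compile(
    r"^\d{4}-\d{2}-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$"
)


def text_with_keyword():
    """Text field plus a keyword sub-field, as in the response above."""
    return {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }


def infer_field(value, numeric_detection=False):
    """Return a mapping fragment for one JSON value, or None if it is ignored."""
    if value is None:
        return None  # null values are ignored
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, dict):
        # Object: infer the inner fields recursively.
        return {"properties": infer_mapping(value, numeric_detection)}
    if isinstance(value, list):
        # Array: decided by the first non-null element.
        first = next((item for item in value if item is not None), None)
        return infer_field(first, numeric_detection)
    if isinstance(value, str):
        if ISO_DATE.match(value):
            return {"type": "date"}
        if numeric_detection:
            for cast, es_type in ((int, "long"), (float, "float")):
                try:
                    cast(value)
                    return {"type": es_type}
                except ValueError:
                    pass
        return text_with_keyword()
    raise TypeError(f"unsupported JSON value: {value!r}")


def infer_mapping(doc, numeric_detection=False):
    """Infer a properties dict for a whole document."""
    fields = {}
    for name, value in doc.items():
        fragment = infer_field(value, numeric_detection)
        if fragment is not None:
            fields[name] = fragment
    return fields


doc = {
    "firstName": "Chan",
    "loginDate": "2018-07-24T10:29:48.103Z",
    "uid": "123",
    "isVip": False,
    "isAdmin": "true",
    "age": 19,
    "heigh": 180,
}
inferred = infer_mapping(doc)
for name in sorted(inferred):
    print(name, inferred[name]["type"])
```

With `numeric_detection=True` the string `"123"` would be treated as a long instead of text, which mirrors the option that the table describes as off by default.
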

    
    
    
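## Can the field type in a Mapping be changed

There are two cases:
* Adding new fields
  - When Dynamic is set to true, the Mapping is updated as soon as a document containing a new field is written.
  - When Dynamic is set to false, the Mapping is not updated; the new field's data cannot be indexed, but it still appears in _source.
  - When Dynamic is set to strict, writing the document fails.
* Existing fields: once data has been written, the field definition can no longer be modified
  - The inverted index, implemented by Lucene, cannot be modified once it has been generated.
  - To change a field's type, the index must be rebuilt with the Reindex API.
  - If the data type of a field were changed in place, data that had already been indexed could no longer be searched.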
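
A rough sketch of the rebuild workflow, expressed as the requests one would send. The index names `users_v1` / `users_v2`, the `age` field, and the `reindex_plan` helper are hypothetical; the `_reindex` body with `source` and `dest` follows the Reindex API's request shape, but check the documentation for your version before relying on the details. Nothing is sent to a cluster here.

```python
import json


def reindex_plan(old_index, new_index, new_properties):
    """Return the (method, path, body) requests that rebuild an index
    under a new mapping. The caller decides how to execute them."""
    return [
        # 1. Create the new index with the corrected field definitions.
        ("PUT", f"/{new_index}", {"mappings": {"properties": new_properties}}),
        # 2. Copy every document from the old index into the new one.
        (
            "POST",
            "/_reindex",
            {"source": {"index": old_index}, "dest": {"index": new_index}},
        ),
    ]


plan = reindex_plan("users_v1", "users_v2", {"age": {"type": "long"}})
for method, path, body in plan:
    print(method, path)
    print(json.dumps(body, indent=2))
```
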
      
    
    
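## Controlling Dynamic Mappings

dynamic|true|false|strict
---|---|---|---
Document can be indexed|YES|YES|NO
Field can be indexed|YES|NO|NO
Mapping is updated|YES|NO|NO

* When dynamic is set to false, a document containing a new field can still be written and indexed, but the new field itself is not indexed (it is kept only in _source).
* When dynamic is set to strict, writing such a document fails with an error.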
    
    

1. By default the Mapping supports dynamic fields; write a document that contains a new field:

    PUT dynamic_mapping_test/_doc/1
    {
    "newField":"someValue"
    }

    
    

2. The field is searchable, and the data also appears in _source:

    POST dynamic_mapping_test/_search
    {
    "query":{
    "match":{
    "newField":"someValue"
    }
    }
    }

Response:

    {
    "took" : 5,
    "timed_out" : false,
    "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
    },
    "hits" : {
    "total" : {
    "value" : 1,
    "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
    {
    "_index" : "dynamic_mapping_test",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 0.2876821,
    "_source" : {
    "newField" : "someValue"
    }
    }
    ]
    }
    }

    
    

3. Change dynamic to false:

    PUT dynamic_mapping_test/_mapping
    {
    "dynamic": false
    }

    
    

4. Add anotherField:

    PUT dynamic_mapping_test/_doc/10
    {
    "anotherField":"someValue"
    }

    
    

5. The field is not searchable, because dynamic has been set to false:

    POST dynamic_mapping_test/_search
    {
    "query":{
    "match":{
    "anotherField":"someValue"
    }
    }
    }

Response:

    {
    "took" : 657,
    "timed_out" : false,
    "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
    },
    "hits" : {
    "total" : {
    "value" : 0,
    "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
    }
    }

    
    

6. Change dynamic to strict:

    PUT dynamic_mapping_test/_mapping
    {
    "dynamic": "strict"
    }

    
    

7. Writing the document fails with HTTP code 400:

    PUT dynamic_mapping_test/_doc/12
    {
    "lastField":"value"
    }

Response:

    {
    "error": {
    "root_cause": [
    {
    "type": "strict_dynamic_mapping_exception",
    "reason": "mapping set to strict, dynamic introduction of [lastField] within [_doc] is not allowed"
    }
    ],
    "type": "strict_dynamic_mapping_exception",
    "reason": "mapping set to strict, dynamic introduction of [lastField] within [_doc] is not allowed"
    },
    "status": 400
    }
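
The three modes walked through in steps 1 to 7 can be condensed into a small simulation. This is a toy model of the behaviour in the table, not Elasticsearch code: `index_document` is a hypothetical helper, and every new field is given a placeholder `text` type rather than an inferred one.

```python
def index_document(properties, doc, dynamic=True):
    """Simulate writing `doc` into an index whose mapping has `properties`.

    Returns (updated_properties, searchable_fields, source).
    Raises ValueError when dynamic is "strict" and the document
    introduces a field that is not in the mapping.
    """
    mode = str(dynamic).lower()  # accepts True / False as well as "strict"
    if mode not in ("true", "false", "strict"):
        raise ValueError(f"unknown dynamic setting: {dynamic!r}")

    new_fields = [name for name in doc if name not in properties]
    if mode == "strict" and new_fields:
        raise ValueError(f"dynamic introduction of {new_fields} is not allowed")

    updated = dict(properties)
    if mode == "true":
        for name in new_fields:
            updated[name] = {"type": "text"}  # placeholder type for the sketch

    # Only mapped fields are searchable; _source always keeps the whole document.
    searchable = [name for name in doc if name in updated]
    return updated, searchable, dict(doc)


# dynamic = true: the mapping grows and the field is searchable.
mapping, searchable, source = index_document({}, {"newField": "someValue"})
print(mapping, searchable)

# dynamic = false: the document is accepted, the new field is not searchable.
mapping2, searchable2, source2 = index_document(
    mapping, {"anotherField": "someValue"}, dynamic=False
)
print(mapping2, searchable2, source2)

# dynamic = strict: the write is rejected.
try:
    index_document(mapping, {"lastField": "value"}, dynamic="strict")
except ValueError as error:
    print("rejected:", error)
```
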

    
    
    
## How to define a Mapping
    

    PUT index_name
    {
    "mappings":{
    "properties":{
    //define your mappings here
    }
    }
    }

    
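* You can consult the API reference and write the Mapping entirely by hand.
* To reduce typing and the chance of mistakes, you can follow these steps instead:
  - create a temporary index and write some sample data into it;
  - fetch the dynamic Mapping definition of that temporary index through the Mapping API;
  - modify the definition, then use it to create your real index;
  - delete the temporary index.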
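
The "fetch, modify, reuse" steps can be sketched as follows. The helper `mapping_from_sample` is hypothetical, and the sample response is a trimmed copy of the `mapping_test` output shown earlier; sending the resulting body with `PUT index_name` is left to whatever client you use.

```python
import copy
import json


def mapping_from_sample(mapping_response, temp_index, overrides):
    """Build a create-index body from a temporary index's dynamic mapping.

    mapping_response: parsed JSON returned by GET <temp_index>/_mapping
    overrides: {field name: replacement field definition}
    """
    properties = copy.deepcopy(
        mapping_response[temp_index]["mappings"]["properties"]
    )
    properties.update(overrides)  # apply the hand-made corrections
    return {"mappings": {"properties": properties}}


# Trimmed sample of a dynamic mapping response.
sample_response = {
    "mapping_test": {
        "mappings": {
            "properties": {
                "age": {"type": "long"},
                "uid": {
                    "type": "text",
                    "fields": {
                        "keyword": {"type": "keyword", "ignore_above": 256}
                    },
                },
            }
        }
    }
}

# Suppose uid should be an exact-match keyword rather than analyzed text.
create_body = mapping_from_sample(
    sample_response, "mapping_test", {"uid": {"type": "keyword"}}
)
print(json.dumps(create_body, indent=2))
```
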
      
    	
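## Some Mapping settings

* `index` controls whether the field is indexed; the default is `true`. If it is set to `false`, the field cannot be searched.
* `index_options` controls what the inverted index records. There are four levels:
  - `docs` records the doc id
  - `freqs` records doc id / term frequencies
  - `positions` records doc id / term frequencies / term positions
  - `offsets` records doc id / term frequencies / term positions / character offsets
* Text fields record `positions` by default; other types default to `docs`. The more content is recorded, the more storage space is used.
* `null_value` makes it possible to search for Null values. Of the string types, only Keyword supports `null_value`; Text does not.
* `copy_to` serves certain search needs: it copies a field's value into a target field, giving an effect similar to `_all`. In ES7, `_all` is replaced by `copy_to`. The target field of `copy_to` does not appear in _source.
* Elasticsearch has no dedicated array type, but any field can contain multiple values of the same type.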
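
To see why each `index_options` level costs more storage than the one before, here is a toy inverted index in Python. It is only an illustration: `build_postings` is a hypothetical helper, the tokenizer is a simple lowercase word split standing in for a real analyzer, and the data layout has nothing to do with Lucene's actual on-disk format.

```python
import re


def build_postings(docs, index_options="positions"):
    """Build a toy inverted index: {term: {doc_id: recorded info}}.

    docs: {doc_id: text}. Tokens are lowercase runs of word characters.
    """
    levels = ["docs", "freqs", "positions", "offsets"]
    level = levels.index(index_options)  # ValueError for an unknown option
    postings = {}
    for doc_id, text in docs.items():
        for position, match in enumerate(re.finditer(r"\w+", text)):
            term = match.group().lower()
            # "docs": the doc id alone is recorded (an empty entry).
            entry = postings.setdefault(term, {}).setdefault(doc_id, {})
            if level >= 1:  # "freqs": also count occurrences
                entry["freq"] = entry.get("freq", 0) + 1
            if level >= 2:  # "positions": also record token positions
                entry.setdefault("positions", []).append(position)
            if level >= 3:  # "offsets": also record character offsets
                entry.setdefault("offsets", []).append(
                    (match.start(), match.end())
                )
    return postings


sample_docs = {1: "Ruan Yiming", 2: "ruan ruan"}
for option in ["docs", "freqs", "positions", "offsets"]:
    print(option, build_postings(sample_docs, option)["ruan"])
```

Positions are what phrase queries depend on, and offsets are what some highlighters can use, which is one way to read the defaults listed above.
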
    
    

Setting index to false:

    DELETE users
    PUT users
    {
    "mappings" : {
    "properties" : {
    "firstName" : {
    "type" : "text"
    },
    "lastName" : {
    "type" : "text"
    },
    "mobile" : {
    "type" : "text",
    "index": false
    }
    }
    }
    }

Insert data:

    PUT users/_doc/1
    {
    "firstName":"Ruan",
    "lastName": "Yiming",
    "mobile": "12345678"
    }

Query:

    POST /users/_search
    {
      "query": {
        "match": {
          "mobile": "12345678" # this field cannot be searched
        }
      }
    }

Query response:

    {
      "error": {
        "root_cause": [
          {
            "type": "query_shard_exception",
            "reason": "failed to create query: { \"match\" : { \"mobile\" : { \"query\" : \"12345678\", \"operator\" : \"OR\", \"prefix_length\" : 0, \"max_expansions\" : 50, \"fuzzy_transpositions\" : true, \"lenient\" : false, \"zero_terms_query\" : \"NONE\", \"auto_generate_synonyms_phrase_query\" : true, \"boost\" : 1.0 } } }",
            "index_uuid": "1oB9dwY2TPq-9QjiaMaU7g",
            "index": "users"
          }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "phase": "query",
        "grouped": true,
        "failed_shards": [
          {
            "shard": 0,
            "index": "users",
            "node": "u-4S1mfbQiuA1Bqe-wfPJQ",
            "reason": {
              "type": "query_shard_exception",
              "reason": "failed to create query: { \"match\" : { \"mobile\" : { \"query\" : \"12345678\", \"operator\" : \"OR\", \"prefix_length\" : 0, \"max_expansions\" : 50, \"fuzzy_transpositions\" : true, \"lenient\" : false, \"zero_terms_query\" : \"NONE\", \"auto_generate_synonyms_phrase_query\" : true, \"boost\" : 1.0 } } }",
              "index_uuid": "1oB9dwY2TPq-9QjiaMaU7g",
              "index": "users",
              "caused_by": {
                "type": "illegal_argument_exception",
                "reason": "Cannot search on field [mobile] since it is not indexed." # the cause of the error
              }
            }
          }
        ]
      },
      "status": 400
    }

    
    

Setting null_value:

    DELETE users
    PUT users
    {
    "mappings" : {
    "properties" : {
    "firstName" : {
    "type" : "text"
    },
    "lastName" : {
    "type" : "text"
    },
    "mobile" : {
    "type" : "keyword",
    "null_value": "NULL"
    }
    }
    }
    }

Insert data:

    PUT users/_doc/1
    {
    "firstName":"Ruan",
    "lastName": "Yiming",
    "mobile": null
    }

Insert data (a document without the mobile field):

    PUT users/_doc/2
    {
    "firstName":"Ruan2",
    "lastName": "Yiming2"

    }

Query:

    GET users/_search
    {
    "query": {
    "match": {
    "mobile":"NULL"
    }
    }
    }

Query response:

    {
    "took" : 1,
    "timed_out" : false,
    "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
    },
    "hits" : {
    "total" : {
    "value" : 1,
    "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
    {
    "_index" : "users",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 0.2876821,
    "_source" : {
    "firstName" : "Ruan",
    "lastName" : "Yiming",
    "mobile" : null
    }
    }
    ]
    }
    }
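
Why only document 1 comes back can be modelled in a few lines. In this sketch (hypothetical helpers `indexed_value` and `search`, not Elasticsearch code) an explicit null is replaced by the configured `null_value` at index time only, a missing field is not indexed at all, and the stored source is never modified.

```python
def indexed_value(source_doc, field, null_value=None):
    """Return the value that would be indexed for `field`, or None if nothing is."""
    if field not in source_doc:
        return None  # a missing field is not indexed at all
    value = source_doc[field]
    if value is None:
        # Explicit null: index the substitute, if one is configured.
        return null_value
    return value


def search(docs, field, term, null_value=None):
    """Return the ids of documents whose indexed value for `field` equals `term`."""
    return [
        doc_id
        for doc_id, doc in docs.items()
        if indexed_value(doc, field, null_value) == term
    ]


users = {
    1: {"firstName": "Ruan", "lastName": "Yiming", "mobile": None},
    2: {"firstName": "Ruan2", "lastName": "Yiming2"},
}
hits = search(users, "mobile", "NULL", null_value="NULL")
print(hits)
print(users[1]["mobile"])  # the stored source is unchanged
```
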

Setting copy_to:

    DELETE users
    PUT users
    {
    "mappings": {
    "properties": {
    "firstName":{
    "type": "text",
    "copy_to": "fullName"
    },
    "lastName":{
    "type": "text",
    "copy_to": "fullName"
    }
    }
    }
    }

Insert data:

    PUT users/_doc/1
    {
    "firstName":"Ruan",
    "lastName": "Yiming"
    }

Query method 1:

    GET users/_search?q=fullName:(Ruan Yiming)

Query method 2:

    POST users/_search
    {
    "query": {
    "match": {
    "fullName":{
    "query": "Ruan Yiming",
    "operator": "and"
    }
    }
    }
    }

Query response:

    {
    "took" : 1,
    "timed_out" : false,
    "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
    },
    "hits" : {
    "total" : {
    "value" : 1,
    "relation" : "eq"
    },
    "max_score" : 0.5753642,
    "hits" : [
    {
    "_index" : "users",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 0.5753642,
    "_source" : {
    "firstName" : "Ruan",
    "lastName" : "Yiming"
    }
    }
    ]
    }
    }
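
The response shows the match on `fullName` even though `_source` has no such field. A toy model of that behaviour, with hypothetical helpers `index_with_copy_to` and `match_all_terms` and a lowercase whitespace split standing in for a real analyzer:

```python
def index_with_copy_to(doc, copy_to):
    """Return (source, indexed_terms) for one document.

    copy_to: {source field: target field}.
    indexed_terms: {field: [terms]} as the search side would see them.
    """
    indexed = {}
    for field, value in doc.items():
        terms = str(value).lower().split()
        indexed.setdefault(field, []).extend(terms)
        target = copy_to.get(field)
        if target is not None:
            # The value is additionally indexed under the target field.
            indexed.setdefault(target, []).extend(terms)
    # The source is returned untouched: the target field is never added to it.
    return dict(doc), indexed


def match_all_terms(indexed, field, query):
    """Mimic a match query with operator "and": every query term must be present."""
    terms = indexed.get(field, [])
    return all(term in terms for term in query.lower().split())


source, indexed = index_with_copy_to(
    {"firstName": "Ruan", "lastName": "Yiming"},
    {"firstName": "fullName", "lastName": "fullName"},
)
print(match_all_terms(indexed, "fullName", "Ruan Yiming"))
print("fullName" in source)
```
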

    
    

Array type:

    PUT users/_doc/1
    {
    "name":"twobirds",
    "interests":["reading","music"]
    }
    GET users/_mapping

Mapping response:

    {
    "users" : {
    "mappings" : {
    "properties" : {
    "firstName" : {
    "type" : "text",
    "copy_to" : [
    "fullName"
    ]
    },
    "fullName" : {
    "type" : "text",
    "fields" : {
    "keyword" : {
    "type" : "keyword",
    "ignore_above" : 256
    }
    }
    },
    "interests" : {
    "type" : "text", # array field: the type follows the type of the values in the array
    "fields" : {
    "keyword" : {
    "type" : "keyword",
    "ignore_above" : 256
    }
    }
    },
    "lastName" : {
    "type" : "text",
    "copy_to" : [
    "fullName"
    ]
    },
    "name" : {
    "type" : "text",
    "fields" : {
    "keyword" : {
    "type" : "keyword",
    "ignore_above" : 256
    }
    }
    }
    }
    }
    }
    }

    
    
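Author: 牧汜
Source: http://www.cnblogs.com/czbxdd/
Copyright of this article is shared by the author and 博客园 (cnblogs). Reposting is welcome, but unless the author agrees otherwise this notice must be retained and a link to the original article must be placed prominently on the page; otherwise the author reserves the right to pursue legal liability.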

Original article: https://www.cnblogs.com/czbxdd/p/11679230.html