  • elasticsearch-Mapping

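    What is a mapping

    A mapping is similar to a table schema definition in a database. Its main purposes are:

    • Define the field names under an index
    • Define the field types, such as numeric, string, and boolean
    • Define inverted-index settings, such as whether a field is indexed and whether positions are recorded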

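    Note that defining too many fields in an index can bloat the index, causing out-of-memory errors and situations that are hard to recover from. The following settings limit this:

    • index.mapping.total_fields.limit: the maximum number of fields that can be defined in an index; default 1000
    • index.mapping.depth.limit: the maximum depth of a field, measured as the number of inner objects; default 20
    • index.mapping.nested_fields.limit: the maximum number of nested fields in an index; default 50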

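    Mapping data types

    See: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

    Core data types

    Type        Description
    text        Used for full-text indexing; the value is split into tokens by an analyzer, and the tokens are used to build the index
    keyword     Not analyzed; the value is indexed as a single term
    long        Signed 64-bit integer, -2^63 ~ 2^63 - 1
    integer     Signed 32-bit integer, -2^31 ~ 2^31 - 1
    short       Signed 16-bit integer, -32768 ~ 32767
    byte        Signed 8-bit integer, -128 ~ 127
    double      64-bit IEEE 754 floating-point number
    float       32-bit IEEE 754 floating-point number
    half_float  16-bit IEEE 754 floating-point number
    boolean     true, false
    date        Date; for formats see https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html
    binary      The value is treated as a Base64-encoded string; not stored by default and not searchable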
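
    The integer ranges in the table follow directly from the bit width of each type. A minimal Python sketch that derives them (the helper name signed_range is hypothetical and not part of Elasticsearch):

```python
def signed_range(bits):
    """Return (min, max) of a two's-complement signed integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Bit widths of the integer types listed in the table above.
INTEGER_TYPES = {"byte": 8, "short": 16, "integer": 32, "long": 64}

for name, bits in INTEGER_TYPES.items():
    low, high = signed_range(bits)
    print(f"{name}: {low} ~ {high}")
```

    Picking the narrowest type that covers the expected values keeps the index smaller.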

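    Range data types

    A range field represents a range of values rather than a single value. For example, if age is stored as 10~20, a search for {"gte":5,"lte":20} matches the document.

    Supported type   Description
    integer_range
    float_range
    long_range
    double_range
    date_range       64-bit unsigned integer timestamp (unit: milliseconds)
    ip_range         A string in IPv4 or IPv6 format

    Optional parameter:

    relation sets the matching mode

    INTERSECTS  The default mode; matches as long as the search range and the field value overlap

    WITHIN      The field value must lie entirely inside the search range, i.e. the field value is a subset of the search range

    CONTAINS    The opposite of WITHIN; matches only documents whose field value contains the search range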
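
    The three relation modes can be sketched as plain interval checks. This is an illustrative Python model only, assuming closed intervals (gte/lte); the function name matches is hypothetical and this is not how Elasticsearch implements the query internally:

```python
def matches(field, query, relation="intersects"):
    """Decide whether a stored range matches a query range.

    Both ranges are (low, high) tuples treated as closed intervals (gte/lte).
    """
    f_low, f_high = field
    q_low, q_high = query
    if relation == "intersects":  # any overlap at all
        return f_low <= q_high and q_low <= f_high
    if relation == "within":      # field range is a subset of the query range
        return q_low <= f_low and f_high <= q_high
    if relation == "contains":    # field range is a superset of the query range
        return f_low <= q_low and q_high <= f_high
    raise ValueError(f"unknown relation: {relation}")


age = (10, 20)  # stored field value, as in the age 10~20 example
print(matches(age, (5, 20)))               # True
print(matches(age, (5, 20), "within"))     # True
print(matches(age, (5, 20), "contains"))   # False
print(matches(age, (12, 15), "contains"))  # True
```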

    Test

    1. Create the index

    PUT 127.0.0.1:9200/range_test

    {
      "mappings": {
        "_doc": {
          "properties": {
            "count": {
              "type": "integer_range"
            },
            "create_date": {
              "type": "date_range", 
              "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
            }
          }
        }
      }
    }

    2. Add test data

    POST 127.0.0.1:9200/range_test/_doc/1

    {
      "count" : { 
        "gte" : 1,
        "lte" : 100
      },
      "create_date" : {
        "gte" : "2019-02-01 12:00:00",
        "lte" : "2019-03-30"
      }
    }

    3. Test the search

    GET 127.0.0.1:9200/range_test/_doc/_search

    {
        "query":{
            "term":{
                "count":5
            }
        }
    }

    Because 5 lies within 1~100, the document is returned. A range query on the date range field works the same way:

    {
      "query" : {
        "range" : {
          "create_date" : { 
            "gte" : "2019-02-01",
            "lte" : "2019-03-30",
            "relation" : "within" 
          }
        }
      }
    }

    Complex data types

    Array type

    Arrays of strings, numbers, and objects are supported. All elements of an array must have the same data type.

    Object type

    {
        "name": "小明",
        "user_info": {
            "student_id": 111,
            "class_info": {
                "class_name": "1年级"
            }
        }
    }

    Indexed form

    {
        "name": "小明",
        "user_info.student_id": 111,
        "user_info.class_info.class_name": "1年级"
    }
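
    The flattening of inner objects into dotted field paths can be sketched in Python as follows. The helper flatten is hypothetical and only models the naming of the fields, not the actual Lucene index:

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))  # descend into the inner object
        else:
            flat[path] = value
    return flat


doc = {
    "name": "小明",
    "user_info": {
        "student_id": 111,
        "class_info": {"class_name": "1年级"},
    },
}
print(flatten(doc))
```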

    Nested type

    Allows each element of an object array to be indexed independently.

    Query API: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html

    Aggregation API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html

    Sort API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html

    Retrieval and highlighting: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html#nested-inner-hits

    Difference between Nested and Object

    PUT 127.0.0.1:9200/object_test/_doc/1 (by default, subjects is mapped as the object type)

    {
        "user_name":"小明",
        "subjects":[
            {"subject_name":"地理","id":1},
            {"subject_name":"英语","id":2}
        ]
    }

    Search for a subject whose name is 英语 and whose id is 1:

    {
        "query": {
            "bool": {
                "must": [
                    {"match": {"subjects.subject_name": "英语"}},
                    {"match": {"subjects.id": "1"}}
                ]
            }
        }
    }

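    Logically this should return nothing, because no single subject has both the name 英语 and the id 1. In testing, however, the document was returned.

    This is because an object array is indexed in the following flattened form, which loses the association between the fields of each element:

    {
        "user_name": "小明",
        "subjects.subject_name": ["英语", "地理"],
        "subjects.id": [1, 2]
    }

    After changing the field type to nested, this no longer happens.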
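
    The difference can be modeled in a few lines of Python. This is a simplified sketch that ignores text analysis and compares values exactly; the function names object_match and nested_match are hypothetical:

```python
doc = {
    "user_name": "小明",
    "subjects": [
        {"subject_name": "地理", "id": 1},
        {"subject_name": "英语", "id": 2},
    ],
}


def object_match(doc, name, subject_id):
    """Object type: values of all array elements are merged per field,
    so the two conditions may be satisfied by different elements."""
    names = [s["subject_name"] for s in doc["subjects"]]
    ids = [s["id"] for s in doc["subjects"]]
    return name in names and subject_id in ids


def nested_match(doc, name, subject_id):
    """Nested type: both conditions must hold for the same element."""
    return any(
        s["subject_name"] == name and s["id"] == subject_id
        for s in doc["subjects"]
    )


print(object_match(doc, "英语", 1))  # True: a false positive
print(nested_match(doc, "英语", 1))  # False
print(nested_match(doc, "英语", 2))  # True
```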

    Geo data types

    geo_point

    Accepted formats

    Object: "location": {"lat": 41.12, "lon": -71.34}

    String: "location": "41.12,-71.34" (lat,lon order)

    Geohash: "location": "drm3btev3e86"

    Array: "location": [ -71.34, 41.12 ] (lon, lat order)

    geo_shape

    Query API: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-bounding-box-query.html
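
    The string and array forms use opposite coordinate orders, which is easy to get wrong. A small Python sketch that normalizes three of the formats to a (lat, lon) tuple; the helper to_lat_lon is hypothetical, and geohash decoding is deliberately left out:

```python
def to_lat_lon(value):
    """Normalize a geo_point given as an object, a "lat,lon" string,
    or a [lon, lat] array into a (lat, lon) tuple."""
    if isinstance(value, dict):
        return float(value["lat"]), float(value["lon"])
    if isinstance(value, str):
        if "," not in value:
            # Probably a geohash; decoding it is outside this sketch.
            raise ValueError("geohash strings are not handled here")
        lat, lon = value.split(",")
        return float(lat), float(lon)
    if isinstance(value, (list, tuple)):
        lon, lat = value  # the array form is [lon, lat]
        return float(lat), float(lon)
    raise TypeError(f"unsupported geo_point value: {value!r}")


print(to_lat_lon({"lat": 41.12, "lon": -71.34}))
print(to_lat_lon("41.12,-71.34"))
print(to_lat_lon([-71.34, 41.12]))
```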

    Specialized data types

    • ip: stores IP addresses
    • completion: provides auto-completion
    • token_count: stores the number of tokens
    • murmur3: stores a hash of the string value
    • percolator

    Mapping settings

    A complete mapping configuration

    {
      "settings": {
        "analysis": {
          "analyzer": {
            "ik_pinyin_analyzer": {
              "type": "custom",
              "tokenizer": "ik_smart",
              "filter": ["my_pinyin"] # custom token filter, defined under "filter" below
            },
            "pinyin_analyzer": {
              "tokenizer": "shopmall_pinyin"
            },
            "first_py_letter_analyzer": {
              "tokenizer": "first_py_letter"
            },
            "full_pinyin_letter_analyzer": {
              "tokenizer": "full_pinyin_letter"
            },
            "onlyOne_analyzer": {
              "tokenizer": "onlyOne_pinyin"
            }
          },
          "tokenizer": { # custom tokenizers
            "onlyOne_pinyin": {
              "type":"pinyin",
              "keep_separate_first_letter": "false",
              "keep_first_letter":"false"
            },
            "shopmall_pinyin": {
              "keep_joined_full_pinyin": "true",
              "keep_first_letter": "true",
              "keep_separate_first_letter": "false",
              "lowercase": "true",
              "type": "pinyin",
              "limit_first_letter_length": "16",
              "keep_original": "true",
              "keep_full_pinyin": "true",
              "keep_none_chinese_in_joined_full_pinyin": "true"
            },
            "first_py_letter": {
              "type": "pinyin",
              "keep_first_letter": true,
              "keep_full_pinyin": false,
              "keep_original": false,
              "limit_first_letter_length": 16,
              "lowercase": true,
              "trim_whitespace": true,
              "keep_none_chinese_in_first_letter": false,
              "none_chinese_pinyin_tokenize": false,
              "keep_none_chinese": true,
              "keep_none_chinese_in_joined_full_pinyin": true
            },
            "full_pinyin_letter": {
              "type": "pinyin",
              "keep_separate_first_letter": false,
              "keep_full_pinyin": false,
              "keep_original": false,
              "limit_first_letter_length": 16,
              "lowercase": true,
              "keep_first_letter": false,
              "keep_none_chinese_in_first_letter": false,
              "none_chinese_pinyin_tokenize": false,
              "keep_none_chinese": true,
              "keep_joined_full_pinyin": true,
              "keep_none_chinese_in_joined_full_pinyin": true
            }
          },
          "filter": {
            "my_pinyin": {
              "type": "pinyin",
              "keep_joined_full_pinyin": true,
              "keep_separate_first_letter":true
            }
          }
        }
    
      },
      "mappings": {
        "doc": { # type name
          "properties": { # fields of the mapping
            "productName": { # field name
              "type": "text", # field type
              "analyzer": "ik_pinyin_analyzer", # analyzer
              "fields": { # multi-fields with their own analyzers; query via productName.keyword_once_pinyin
                "keyword_once_pinyin": {
                  "type": "text",
                  "analyzer": "onlyOne_analyzer" # the custom analyzer to use
                }
              }
            },
            "skuNames": {
              "type": "text",
              "analyzer": "ik_pinyin_analyzer",
              "fields": {
                "keyword_once_pinyin": {
                  "type": "text",
                  "analyzer": "onlyOne_analyzer"
                }
              }
            },
            "regionCode": {
              "type": "keyword"
            },
            "productNameSuggester": { # search suggestions (ES 6.x)
              "type": "completion",
              "fields": {
                "pinyin": {
                  "type": "completion",
                  "analyzer": "pinyin_analyzer"
                },
                "keyword_pinyin": {
                  "type": "completion",
                  "analyzer": "full_pinyin_letter_analyzer"
                },
                "keyword_first_py": {
                  "type": "completion",
                  "analyzer": "first_py_letter_analyzer"
                }
              }
            },
            "info": { # parent/child relation via a join field (ES 6.x)
              "type": "join",
              "relations": {
                "md_product":[ "sl_customer_character_order_list","ic_product_store_account","sl_customer_product_setting"]
              }
            }
          }
        }
      }
    }

    Creating a mapping

    PUT http://127.0.0.1:9200/db

    {
        "mappings": {
            "product": { // type name
                "properties": {
                    "productName": { // field name
                        "type": "text" // data type
                    }
                }
            }
        }
    }

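    Mapping parameters

    Parameter  Description
    analyzer   Analyzer; default: standard
    boost      Field weight, default 1; used to weight this field when querying through the _all field
    dynamic    Controls whether new fields can be added: true (default, new fields are added to the mapping), false (new fields are not added to the mapping or indexed, but remain in _source), strict (documents containing unmapped fields are rejected)
    index      Controls whether the field is indexed (searchable): true or false; false saves disk space

    Reference: https://www.jianshu.com/p/e8a9feea683c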
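
    The three dynamic modes can be modeled in Python as follows. This is a rough sketch: apply_dynamic is a hypothetical helper that only tracks which field names end up in the mapping, nothing more:

```python
def apply_dynamic(mapping_fields, doc, dynamic="true"):
    """Return the set of mapped field names after indexing `doc`.

    mapping_fields: field names already present in the mapping.
    dynamic: "true", "false", or "strict".
    """
    known = set(mapping_fields)
    new_fields = set(doc) - known
    if dynamic == "strict" and new_fields:
        # strict: the whole document is rejected
        raise ValueError(f"unmapped fields not allowed: {sorted(new_fields)}")
    if dynamic == "true":
        known |= new_fields  # new fields are added to the mapping
    # "false": new fields are ignored by the mapping (kept only in _source)
    return known


doc = {"productName": "phone", "price": 100}
print(apply_dynamic({"productName"}, doc, "true"))
print(apply_dynamic({"productName"}, doc, "false"))
```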

    Adding fields to a mapping

    Once a mapping has been created in Elasticsearch, fields can only be added; fields that are already mapped cannot be modified.

    PUT http://127.0.0.1:9200/{indexName}/_mapping/{typeName}
    
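    {
      "properties": {
        "productSortItemIds": { # field name
          "type": "text", # field type (string is no longer supported in ES 6.x; text is the analyzed equivalent)
          "store": true, # whether the field value is stored separately from _source
          "analyzer": "comma", # index-time analyzer
          "search_analyzer": "comma" # search-time analyzer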
        }
      }
    }

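    Changing a field type

    Elasticsearch does not support changing the type of an existing field. The inverted index is built by Lucene and cannot be modified once generated, so changing the type in place would leave the already indexed data unsearchable. To change a type, create a new index with the desired mapping and rebuild it with reindex. Adding new fields is not affected.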

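    Adding an analyzer

    POST /{index}/_close # close the index
    PUT /bbc_product/_settings # add an analyzer that splits on commas (bbc_product is the {index} here)
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "comma": {
              "type": "pattern",
              "pattern": ","
            }
          }
        }
      }
    }
    POST /{index}/_open # reopen the index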
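
    To get a feel for what the comma analyzer produces, here is a rough Python stand-in. It assumes the pattern analyzer's default lowercasing and drops empty tokens; pattern_analyze is a hypothetical helper, not an Elasticsearch API:

```python
import re


def pattern_analyze(text, pattern=",", lowercase=True):
    """Split text on a regular expression, drop empty tokens,
    and optionally lowercase the result."""
    tokens = [t for t in re.split(pattern, text) if t]
    return [t.lower() for t in tokens] if lowercase else tokens


print(pattern_analyze("12,34,AB"))  # ['12', '34', 'ab']
```

    With such an analyzer, a field holding a comma-separated id list like productSortItemIds can be searched by a single id.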

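    Viewing the current index mapping

    GET http://127.0.0.1:9200/blogs2/product/_mapping (requesting the index alone, http://127.0.0.1:9200/blogs2, returns the full index configuration)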

    {
        "blogs2": {
            "mappings": {
                "product": {
                    "properties": {
                        "price": {
                            "type": "long"
                        },
                        "productName": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "remark": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "tags": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        }
                    }
                }
            }
        }
    }

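    Custom mappings

    A custom mapping defines the data types explicitly. For example, if a number is mapped as text, greater-than and less-than range searches stop working as expected. It also makes explicit which full-text fields need to be analyzed and which do not.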
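
    Why range searches break when numbers are mapped as text can be seen with a small Python analogy: strings are compared character by character, not by numeric value. This only illustrates the ordering problem and does not reproduce Elasticsearch internals:

```python
prices = [5, 9, 10, 100]

# Numeric comparison: values greater than or equal to 10.
numeric_hits = [p for p in prices if p >= 10]

# The same values as strings are compared lexicographically,
# so "5" and "9" sort after "10".
text_hits = [s for s in map(str, prices) if s >= "10"]

print(numeric_hits)  # [10, 100]
print(text_hits)     # ['5', '9', '10', '100']
```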

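    Exact values and full text

    Elasticsearch supports many data types, but they fall into two broad categories.
    Exact values are definite values such as an id or a date; an equality (=) comparison is enough to find the data we want.

    Full text, by contrast, requires relevance matching and returns the best matches.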

    Template Mapping

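    An index template helps avoid forgetting core settings when creating a new index: when an index is created, the settings and mappings of all matching templates are applied to it automatically, according to priority.

    Priority: settings in the create-index request > matching templates (applied in ascending order, so a higher order overrides a lower one) > defaults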
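
    The priority rule can be sketched in Python. The template values mirror the template_default and template_test definitions used in this post; effective_settings is a hypothetical helper that only models the merge order, and it assumes simple glob patterns:

```python
from fnmatch import fnmatchcase

templates = [
    {"index_patterns": ["*"], "order": 0,
     "settings": {"number_of_shards": 1, "number_of_replicas": 1}},
    {"index_patterns": ["test*"], "order": 1,
     "settings": {"number_of_shards": 1, "number_of_replicas": 2}},
]


def effective_settings(index_name, request_settings=None):
    """Merge matching templates in ascending order, then apply the
    settings given in the create-index request on top."""
    settings = {}
    matching = [
        t for t in templates
        if any(fnmatchcase(index_name, p) for p in t["index_patterns"])
    ]
    for template in sorted(matching, key=lambda t: t["order"]):
        settings.update(template["settings"])  # higher order overrides lower
    settings.update(request_settings or {})    # explicit settings win
    return settings


print(effective_settings("ttemplate"))
print(effective_settings("testtemplate"))
print(effective_settings("testmy", {"number_of_replicas": 5}))
```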

    # Without a template, the numeric string is mapped as text and the date string is mapped as date
    PUT ttemplate/_doc/1
    {
        "someNumber":"1",
        "someDate":"2019/01/01"
    }
    GET ttemplate/_mapping
    
    
    #Create a default template
    PUT _template/template_default
    {
      "index_patterns": ["*"],
      "order" : 0,
      "version": 1,
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas":1
      }
    }
    
    
    PUT /_template/template_test
    {
        "index_patterns" : ["test*"],
        "order" : 1,
        "settings" : {
            "number_of_shards": 1,
            "number_of_replicas" : 2
        },
        "mappings" : {
            "date_detection": false,
            "numeric_detection": true
        }
    }
    
    # View template information
    GET /_template/template_default
    GET /_template/temp*
    
    
    # Index a new document into an index whose name starts with test
    PUT testtemplate/_doc/1
    {
        "someNumber":"1",
        "someDate":"2019/01/01"
    }
    GET testtemplate/_mapping
    GET testtemplate/_settings
    
    PUT testmy
    {
        "settings":{
            "number_of_replicas":5
        }
    }
    
    PUT testmy/_doc/1
    {
      "key":"value"
    }
    
    GET testmy/_settings
    DELETE testmy
    DELETE /_template/template_default
    DELETE /_template/template_test
    
    
    
    # Dynamic mapping: match by detected type and field name
    DELETE my_index
    
    PUT my_index/_doc/1
    {
      "firstName":"Ruan",
      "isVIP":"true"
    }
    
    GET my_index/_mapping
    DELETE my_index
    PUT my_index
    {
      "mappings": {
        "dynamic_templates": [
          {
            "strings_as_boolean": {
              "match_mapping_type":   "string",
              "match":"is*",
              "mapping": {
                "type": "boolean"
              }
            }
          },
          {
            "strings_as_keywords": {
              "match_mapping_type":   "string",
              "mapping": {
                "type": "keyword"
              }
            }
          }
        ]
      }
    }
    
    
    DELETE my_index
    # Match by field path
    PUT my_index
    {
      "mappings": {
        "dynamic_templates": [
          {
            "full_name": {
              "path_match":   "name.*",
              "path_unmatch": "*.middle",
              "mapping": {
                "type":       "text",
                "copy_to":    "full_name"
              }
            }
          }
        ]
      }
    }
    
    
    PUT my_index/_doc/1
    {
      "name": {
        "first":  "John",
        "middle": "Winston",
        "last":   "Lennon"
      }
    }
    
    GET my_index/_search?q=full_name:John
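
    How the strings_as_boolean and strings_as_keywords templates above pick a type can be approximated in Python. This sketch assumes that templates are checked in order and the first match wins, and that match is a simple glob; resolve_type is a hypothetical helper and path_match is not modeled:

```python
from fnmatch import fnmatchcase

dynamic_templates = [
    {"strings_as_boolean": {
        "match_mapping_type": "string",
        "match": "is*",
        "mapping": {"type": "boolean"}}},
    {"strings_as_keywords": {
        "match_mapping_type": "string",
        "mapping": {"type": "keyword"}}},
]


def resolve_type(field_name, detected_type, templates):
    """Return the mapping type of the first template whose conditions all hold."""
    for entry in templates:
        (rule,) = entry.values()  # each entry holds exactly one named rule
        if "match_mapping_type" in rule and rule["match_mapping_type"] != detected_type:
            continue
        if "match" in rule and not fnmatchcase(field_name, rule["match"]):
            continue
        return rule["mapping"]["type"]
    return None  # no template matched; the default dynamic mapping applies


print(resolve_type("isVIP", "string", dynamic_templates))      # boolean
print(resolve_type("firstName", "string", dynamic_templates))  # keyword
```

    Because the first matching template wins, the more specific is* rule has to come before the catch-all string rule.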
  • Original article: https://www.cnblogs.com/LQBlog/p/10648496.html