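If you are not familiar with the basics of Elasticsearch, see the earlier article: Basic Operations on Elasticsearch Indices and Documents.
Before running the queries, you can use the Bulk API to insert the sample documents in bulk (data source).
Querying data
match query
A match query is analyzed. The query text is first run through the field's analyzer, and the resulting terms are then looked up in the index.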
GET /student/_search
{
  "query": {
    "match": {
      "name": "山西"
    }
  }
}
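The same search can also be written as a URI search; note that in query string syntax the quotes turn 山西 into a phrase:
GET /student/_search?q=name:"山西"
The result shows three students whose names contain "山西":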
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.7133499,
"hits" : [
{
"_index" : "student",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.7133499,
"_source" : {
"name" : "山西太原-张三",
"age" : "23",
"address" : {
"city" : "太原",
"province" : "山西"
}
}
},
{
"_index" : "student",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.7133499,
"_source" : {
"name" : "山西长治-李四",
"age" : "24",
"address" : {
"city" : "长治",
"province" : "山西"
}
}
},
{
"_index" : "student",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.7133499,
"_source" : {
"name" : "山西吕梁-王五",
"age" : "25",
"address" : {
"city" : "吕梁",
"province" : "山西"
}
}
}
]
}
}
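Description
query: the search definition.
match: the match condition.
name: the field to search, with the text to look for as its value.
hits --> total
value: the number of matching documents, three here.
relation: the relation is eq, meaning the count is exact.
max_score: the highest score among the matching documents.
hits: the matching documents themselves, each with its index and document information.
The _score of each document shows how well it fits the query, so it can be used to judge which documents are closer to the expected result.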
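To make the fields above concrete, here is a minimal Python sketch that reads the total and the per-document scores out of a search response. It assumes the response has already been received as JSON text; the helper name summarize_hits and the trimmed sample response are illustrative and not part of any Elasticsearch client.

```python
import json

# Trimmed copy of the response shown above (only the fields read below).
RESPONSE_JSON = """
{
  "hits": {
    "total": {"value": 3, "relation": "eq"},
    "max_score": 0.7133499,
    "hits": [
      {"_id": "1", "_score": 0.7133499, "_source": {"name": "山西太原-张三"}},
      {"_id": "2", "_score": 0.7133499, "_source": {"name": "山西长治-李四"}},
      {"_id": "3", "_score": 0.7133499, "_source": {"name": "山西吕梁-王五"}}
    ]
  }
}
"""


def summarize_hits(response):
    """Return (total, relation, [(id, score, name), ...]) from a response dict."""
    hits = response["hits"]
    total = hits["total"]["value"]
    relation = hits["total"]["relation"]
    rows = [(h["_id"], h["_score"], h["_source"]["name"]) for h in hits["hits"]]
    return total, relation, rows


response = json.loads(RESPONSE_JSON)
total, relation, rows = summarize_hits(response)
print(total, relation)
for doc_id, score, name in rows:
    print(doc_id, score, name)
```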
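With a match query, the default operator is OR, so the following two queries return the same result:
GET student/_search
{
  "query": {
    "match": {
      "name": {
        "query": "山西长治",
        "operator": "or"
      }
    }
  }
}
GET student/_search
{
  "query": {
    "match": {
      "name": "山西长治"
    }
  }
}
Because operator defaults to OR for match, the query above returns every document that matches any single character of 山西长治 (the default standard analyzer splits Chinese text into one term per character).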
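The minimum_should_match parameter sets the minimum number of terms that must match, for example:
GET student/_search
{
  "query": {
    "match": {
      "name": {
        "query": "山西长治",
        "operator": "or",
        "minimum_should_match": 3
      }
    }
  }
}
Only documents that match at least three of the four characters in 山西长治 are returned.
After changing the operator to and, only one document is returned: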
GET student/_search
{
  "query": {
    "match": {
      "name": {
        "query": "山西长治",
        "operator": "and"
      }
    }
  }
}
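The effect of operator and minimum_should_match can be sketched in plain Python, without a cluster. This is only a rough model: analyze is a stand-in that assumes one term per Chinese character (roughly what the standard analyzer does for this data), and the function names are hypothetical. Only the three 山西 documents shown earlier are used.

```python
def analyze(text):
    """Rough stand-in for the standard analyzer on Chinese text:
    one term per character, punctuation such as '-' dropped."""
    return [ch for ch in text if ch.isalnum()]


# name field of the three sample documents shown earlier.
NAMES = {"1": "山西太原-张三", "2": "山西长治-李四", "3": "山西吕梁-王五"}


def match(field_values, query, operator="or", minimum_should_match=1):
    """Return the ids whose field value satisfies the match condition."""
    query_terms = analyze(query)
    result = []
    for doc_id, value in field_values.items():
        doc_terms = set(analyze(value))
        matched = sum(1 for term in query_terms if term in doc_terms)
        if operator == "and":
            ok = matched == len(query_terms)      # every term is required
        else:
            ok = matched >= minimum_should_match  # enough optional terms
        if ok:
            result.append(doc_id)
    return result


print(match(NAMES, "山西长治"))                          # default OR
print(match(NAMES, "山西长治", minimum_should_match=3))  # at least 3 terms
print(match(NAMES, "山西长治", operator="and"))          # all 4 terms
```

With OR, sharing a single character is enough; raising minimum_should_match or switching to and narrows the result to the document that contains 长 and 治 as well.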
Ids query
Fetch several documents at once by their IDs.
GET student/_search
{
  "query": {
    "ids": {
      "values": [1, 2, 3]
    }
  }
}
The query above returns the documents whose IDs are 1, 2, and 3.
multi_match
The multi_match query builds on the match query and allows searching several fields at once.
The searches above each name a single field. Often it is not known in advance which field contains the keyword; multi_match covers that case.
GET student/_search
{
  "query": {
    "multi_match": {
      "query": "山西长治",
      "fields": [
        "name",
        "address.city^3",
        "address.province"
      ],
      "type": "best_fields"
    }
  }
}
This searches the name, address.city, and address.province fields and triples the score of matches in address.city. The result is:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 7.223837,
"hits" : [
{
"_index" : "student",
"_type" : "_doc",
"_id" : "2",
"_score" : 7.223837,
"_source" : {
"name" : "山西长治-李四",
"age" : "24",
"address" : {
"city" : "长治",
"province" : "山西"
}
}
},
{
"_index" : "student",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.7133499,
"_source" : {
"name" : "山西太原-张三",
"age" : "23",
"address" : {
"city" : "太原",
"province" : "山西"
}
}
},
{
"_index" : "student",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.7133499,
"_source" : {
"name" : "山西吕梁-王五",
"age" : "25",
"address" : {
"city" : "吕梁",
"province" : "山西"
}
}
}
]
}
}
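How best_fields and the ^3 boost combine can be sketched as follows. best_fields uses the score of the best matching field, plus tie_breaker times the scores of the other matching fields (tie_breaker defaults to 0.0). The per-field scores below are made-up numbers for illustration, not values computed by Elasticsearch, and the function names are hypothetical.

```python
def parse_field(spec):
    """Split 'address.city^3' into ('address.city', 3.0); boost defaults to 1.0."""
    name, sep, boost = spec.partition("^")
    return name, (float(boost) if sep else 1.0)


def best_fields_score(field_scores, field_specs, tie_breaker=0.0):
    """Combine per-field scores the way best_fields does:
    best boosted score + tie_breaker * (sum of the other boosted scores)."""
    boosted = []
    for spec in field_specs:
        name, boost = parse_field(spec)
        if name in field_scores:  # the field matched the query
            boosted.append(field_scores[name] * boost)
    if not boosted:
        return None  # no field matched, so the document is not a hit
    best = max(boosted)
    return best + tie_breaker * (sum(boosted) - best)


FIELDS = ["name", "address.city^3", "address.province"]
# Hypothetical raw scores of one document for each field.
raw_scores = {"name": 2.0, "address.city": 1.5, "address.province": 0.5}

print(best_fields_score(raw_scores, FIELDS))                   # city wins after boosting
print(best_fields_score(raw_scores, FIELDS, tie_breaker=0.5))  # other fields add half
```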
Prefix query
Returns documents that contain a specific prefix in the given field.
GET student/_search
{
  "query": {
    "prefix": {
      "address.city": {
        "value": "吕"
      }
    }
  }
}
This finds the documents whose city starts with 吕:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "student",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "山西吕梁-王五",
"age" : "25",
"address" : {
"city" : "吕梁",
"province" : "山西"
}
}
}
]
}
}
Term query
A term query looks for an exact term in the given field, so the search value must be exact to get the right result.
GET /student/_search
{
  "query": {
    "term": {
      "name.keyword": "山西太原-张三"
    }
  }
}
Here name.keyword is used to match the value "山西太原-张三" exactly:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.2039728,
"hits" : [
{
"_index" : "student",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.2039728,
"_source" : {
"name" : "山西太原-张三",
"age" : "23",
"address" : {
"city" : "太原",
"province" : "山西"
}
}
}
]
}
}
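Why the query targets name.keyword rather than name can be illustrated with a small model. It assumes the default dynamic mapping, where name is an analyzed text field with an unanalyzed keyword sub-field, and it approximates the analyzer as one term per Chinese character. All helper names are hypothetical.

```python
def analyze(text):
    """Rough stand-in for the standard analyzer on Chinese text:
    one term per character, punctuation such as '-' dropped."""
    return [ch for ch in text if ch.isalnum()]


def index_document(name):
    """Model the two ways the value is indexed under the assumed mapping."""
    return {
        "name": set(analyze(name)),  # text field: many small terms
        "name.keyword": {name},      # keyword sub-field: the whole value
    }


NAMES = {"1": "山西太原-张三", "2": "山西长治-李四", "3": "山西吕梁-王五"}
index = {doc_id: index_document(name) for doc_id, name in NAMES.items()}


def term_query(index, field, value):
    """A term query compares the value with indexed terms, without analysis."""
    return [doc_id for doc_id, fields in index.items() if value in fields[field]]


print(term_query(index, "name.keyword", "山西太原-张三"))  # whole value is a term
print(term_query(index, "name", "山西太原-张三"))          # no such term in the text field
print(term_query(index, "name", "张"))                     # single characters are terms
```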
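Terms query
To match any of several exact values, use a terms query. It is similar to the IN syntax in SQL.
GET student/_search
{
  "query": {
    "terms": {
      "address.city.keyword": [
        "长治",
        "广州"
      ]
    }
  }
}
The query above returns every document whose address.city.keyword is either 长治 or 广州.
Compound queries
A compound query combines the individual queries above into a more complex query.
The general format is: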
POST _search
{
  "query": {
    "bool": {
      "must": {
        "term": { "user": "kimchy" }
      },
      "filter": {
        "term": { "tag": "tech" }
      },
      "must_not": {
        "range": {
          "age": { "gte": 10, "lte": 20 }
        }
      },
      "should": [
        { "term": { "tag": "wow" } },
        { "term": { "tag": "elasticsearch" } }
      ],
      "minimum_should_match": 1,
      "boost": 1.0
    }
  }
}
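A compound query is built from the must, filter, must_not, and should clauses under bool. minimum_should_match specifies the number or percentage of should clauses that a document must match. If the bool query contains at least one should clause and no must or filter clause, the default is 1; otherwise the default is 0.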
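The matching rules just described, including the default for minimum_should_match, can be sketched in Python. Clauses are modelled as plain predicates over a document dict; this is an illustration of the logic only, not how Elasticsearch executes the query. The flattened documents (with age as an integer) and the function name bool_matches are assumptions made for the example.

```python
# Flattened copies of the three sample documents (age as int for simplicity).
DOCS = [
    {"id": "1", "city": "太原", "province": "山西", "age": 23},
    {"id": "2", "city": "长治", "province": "山西", "age": 24},
    {"id": "3", "city": "吕梁", "province": "山西", "age": 25},
]


def bool_matches(doc, must=(), filter_=(), must_not=(), should=(),
                 minimum_should_match=None):
    """Decide whether a document matches a bool query made of predicates."""
    if minimum_should_match is None:
        # Default: 1 if there are should clauses and no must/filter, else 0.
        minimum_should_match = 1 if should and not must and not filter_ else 0
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


def search(**clauses):
    return [doc["id"] for doc in DOCS if bool_matches(doc, **clauses)]


in_shanxi = lambda d: d["province"] == "山西"
in_taiyuan = lambda d: d["city"] == "太原"
in_lvliang = lambda d: d["city"] == "吕梁"

print(search(should=[in_taiyuan, in_lvliang]))        # should only: one must match
print(search(must=[in_shanxi], should=[in_taiyuan]))  # should becomes optional
print(search(must_not=[in_shanxi]))                   # excludes every sample doc
```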
must
must is equivalent to AND in SQL.
Use a compound query to find the documents whose city is 长治 and whose age is 24:
GET student/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "address.city": "长治"
          }
        },
        {
          "match": {
            "age": "24"
          }
        }
      ]
    }
  }
}
must_not
Find all documents whose province is not 山西; only the 广州 document remains:
GET student/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "address.province": "山西"
          }
        }
      ]
    }
  }
}
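filter
Use filter to keep only the documents whose age is between 24 and 25. Clauses under filter must match but do not contribute to the score.
GET student/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "age": {
              "gte": 24,
              "lte": 25
            }
          }
        }
      ]
    }
  }
}
gt: greater than
gte: greater than or equal to
lt: less than
lte: less than or equal to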
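The four range operators map directly onto ordinary comparisons. A minimal sketch, assuming numeric ages; the helper name in_range is hypothetical:

```python
import operator

# Range keyword -> comparison function.
RANGE_OPS = {
    "gt": operator.gt,   # greater than
    "gte": operator.ge,  # greater than or equal to
    "lt": operator.lt,   # less than
    "lte": operator.le,  # less than or equal to
}


def in_range(value, bounds):
    """True if value satisfies every bound, e.g. {"gte": 24, "lte": 25}."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())


AGES = [23, 24, 25]
print([age for age in AGES if in_range(age, {"gte": 24, "lte": 25})])
print([age for age in AGES if in_range(age, {"gt": 24, "lte": 25})])
```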
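should
should expresses an optional match, similar to OR in SQL.
Find the documents whose province is 山西; if name also contains 李四, the relevance is higher and the document ranks nearer the top.
GET student/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "address.province": "山西"
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "name": "李四"
          }
        }
      ]
    }
  }
}
In the result, the document named 山西长治-李四 is ranked first: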
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 3.1212955,
"hits" : [
{
"_index" : "student",
"_type" : "_doc",
"_id" : "2",
"_score" : 3.1212955,
"_source" : {
"name" : "山西长治-李四",
"age" : "24",
"address" : {
"city" : "长治",
"province" : "山西"
}
}
},
{
"_index" : "student",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.7133499,
"_source" : {
"name" : "山西太原-张三",
"age" : "23",
"address" : {
"city" : "太原",
"province" : "山西"
}
}
},
{
"_index" : "student",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.7133499,
"_source" : {
"name" : "山西吕梁-王五",
"age" : "25",
"address" : {
"city" : "吕梁",
"province" : "山西"
}
}
}
]
}
}
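The ranking above follows from how a bool query scores: the scores of the matching must and should clauses are added together. The sketch below reproduces the numbers in the response. The must score 0.7133499 is read from documents 1 and 3; the should score for document 2 is inferred as the difference 3.1212955 - 0.7133499 and is an assumption, not a value reported by Elasticsearch.

```python
# Per-clause scores for each document; None means the clause did not match.
CLAUSE_SCORES = {
    "1": {"must": [0.7133499], "should": [None]},
    "2": {"must": [0.7133499], "should": [2.4079456]},  # inferred should score
    "3": {"must": [0.7133499], "should": [None]},
}


def bool_score(clauses):
    """Sum the scores of matching must and should clauses.
    Returns None when a must clause fails, since the document is then not a hit."""
    if any(score is None for score in clauses["must"]):
        return None
    matched_should = [s for s in clauses["should"] if s is not None]
    return sum(clauses["must"]) + sum(matched_should)


scores = {doc_id: bool_score(c) for doc_id, c in CLAUSE_SCORES.items()}
# Highest score first; ties keep their original order.
ranking = sorted(scores, key=lambda doc_id: -scores[doc_id])

for doc_id in ranking:
    print(doc_id, round(scores[doc_id], 7))
```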
Wildcard query
Use wildcard to match terms against a pattern, similar to LIKE in SQL.
GET student/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "*王"
      }
    }
  }
}
The result is:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "student",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "山西吕梁-王五",
"age" : "25",
"address" : {
"city" : "吕梁",
"province" : "山西"
}
}
}
]
}
}
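A wildcard pattern is compared with the indexed terms of the field, not with the original string, which is why "*王" matches here even though the name ends in 五. The sketch below models this with a small pattern-to-regex translation (* for any run of characters, ? for exactly one). The one-term-per-character analysis and all helper names are assumptions for illustration.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer on Chinese text:
    one term per character, punctuation such as '-' dropped."""
    return [ch for ch in text if ch.isalnum()]


def wildcard_to_regex(pattern):
    """Translate * and ? into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_query(index_terms, pattern):
    """Return ids with at least one indexed term fully matching the pattern."""
    regex = wildcard_to_regex(pattern)
    return [doc_id for doc_id, terms in index_terms.items()
            if any(regex.fullmatch(term) for term in terms)]


NAMES = {"1": "山西太原-张三", "2": "山西长治-李四", "3": "山西吕梁-王五"}
name_terms = {doc_id: set(analyze(v)) for doc_id, v in NAMES.items()}  # text field
keyword_terms = {doc_id: {v} for doc_id, v in NAMES.items()}           # keyword field

print(wildcard_query(name_terms, "*王"))      # the single-character term 王 matches
print(wildcard_query(keyword_terms, "*王"))   # whole value does not end with 王
print(wildcard_query(keyword_terms, "*王?"))  # 王 followed by exactly one character
```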
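Pagination and sorting
Find the documents whose province is 山西, sort them by age in descending order, and page through the result. Because the sort field is age.keyword, the values are compared as strings.
GET student/_search
{
  "query": {
    "match": {
      "address.province": "山西"
    }
  },
  "sort": [
    {
      "age.keyword": {
        "order": "desc"
      }
    }
  ],
  "from": 2,
  "size": 2
}
from: the offset of the first hit to return, counted from 0; it is a number of hits to skip, not a page number.
size: the number of hits to return per page.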
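Since from is an offset rather than a page number, a page number has to be converted before it goes into the request. A minimal sketch, assuming 1-based page numbers; the helper names are hypothetical, and paginate only imitates on a Python list what from and size do to the sorted hits.

```python
def page_to_from(page, size):
    """Convert a 1-based page number into the `from` offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return (page - 1) * size


def search_body(page, size):
    """Build the request body used above for a given page."""
    return {
        "query": {"match": {"address.province": "山西"}},
        "sort": [{"age.keyword": {"order": "desc"}}],
        "from": page_to_from(page, size),
        "size": size,
    }


def paginate(sorted_ids, from_, size):
    """Imitate from/size on an already sorted list of hits."""
    return sorted_ids[from_:from_ + size]


# The three 山西 documents ordered by age descending ("25", "24", "23").
SORTED_IDS = ["3", "2", "1"]

body = search_body(page=2, size=2)
print(body["from"], paginate(SORTED_IDS, body["from"], body["size"]))
```

With three matching documents, the second page of size 2 holds only the last document, which is what from 2 and size 2 return.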
Highlighting
Use highlight to highlight matches in the chosen fields, and use pre_tags and post_tags to change the text placed before and after each highlighted fragment.
GET student/_search
{
  "query": {
    "match": {
      "name": "张三"
    }
  },
  "highlight": {
    "pre_tags": "<br>",
    "post_tags": "</br>",
    "fields": {
      "name": {}
    }
  }
}
Result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.4079456,
"hits" : [
{
"_index" : "student",
"_type" : "_doc",
"_id" : "1",
"_score" : 2.4079456,
"_source" : {
"name" : "山西太原-张三",
"age" : 23,
"address" : {
"city" : "太原",
"province" : "山西"
}
},
"highlight" : {
"name" : [
"山西太原-<br>张</br><br>三</br>"
]
}
}
]
}
}
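The highlighted string in the response wraps 张 and 三 separately because each character is its own term. The following sketch imitates that behaviour in plain Python; it assumes one term per Chinese character and is not the highlighter Elasticsearch actually uses. The function names are hypothetical.

```python
def analyze(text):
    """Rough stand-in for the standard analyzer on Chinese text:
    one term per character, punctuation such as '-' dropped."""
    return [ch for ch in text if ch.isalnum()]


def highlight(text, query, pre_tag="<br>", post_tag="</br>"):
    """Wrap every character of text that is also a query term."""
    query_terms = set(analyze(query))
    return "".join(
        pre_tag + ch + post_tag if ch in query_terms else ch
        for ch in text
    )


print(highlight("山西太原-张三", "张三"))
print(highlight("山西太原-张三", "张三", pre_tag="<em>", post_tag="</em>"))
```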