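1. Scroll and a brief overview of how it works
Retrieving a very large result set in a single request, say 100,000 documents, performs poorly. The usual approach is a scroll query, which fetches the results batch by batch until everything has been retrieved.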
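1) On the first search, scroll saves a snapshot of the index view as it is at that moment. Every later batch is served from that old snapshot, so changes made to the data while the scroll is in progress are not visible to the user.
2) Sorting by _doc (rather than by _score) gives better performance. (By default, results are sorted by _score relevance, from highest to lowest.)
3) Every scroll request must also include a scroll parameter that specifies a time window (for example 1m). The window tells Elasticsearch how long to keep the search context alive, so it only needs to cover the time until the next batch is requested, not the whole scroll.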
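The snapshot behaviour in point 1) and the batch-by-batch fetching can be pictured with a toy in-memory model. This is only a sketch: `ToyScroll` is a hypothetical class invented for illustration, it is not Elasticsearch code, and it does not model the _doc sort or the time window.

```python
import copy


class ToyScroll:
    """Toy in-memory stand-in for a scroll context (not Elasticsearch code)."""

    def __init__(self, docs, size):
        # Freeze a private snapshot at the time of the first search.
        self._snapshot = copy.deepcopy(docs)
        self._size = size
        self._pos = 0

    def next_batch(self):
        """Return the next batch from the snapshot; an empty list means done."""
        batch = self._snapshot[self._pos:self._pos + self._size]
        self._pos += len(batch)
        return batch


# Five hypothetical documents, fetched three at a time.
docs = [{"id": i} for i in range(1, 6)]
scroll = ToyScroll(docs, size=3)

first = scroll.next_batch()
docs.append({"id": 6})   # a new document written after the snapshot
docs[4]["id"] = 500      # an update made after the snapshot
second = scroll.next_batch()
third = scroll.next_batch()

print([d["id"] for d in first])   # [1, 2, 3]
print([d["id"] for d in second])  # [4, 5]
print(third)                      # []
```

The write and the update made between batches never show up in the results, which is the behaviour point 1) describes.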
2. Demonstration
Prepare the data
PUT /lib
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0
  },
  "mappings": {
    "user": {
      "properties": {
        "name": {"type": "text"},
        "address": {"type": "text"},
        "age": {"type": "integer"},
        "interests": {"type": "text"},
        "birthday": {"type": "date"}
      }
    }
  }
}
PUT /lib/user/1
{
  "name": "zhaoliu",
  "address": "hei long jiang sheng tie ling shi",
  "age": 50,
  "birthday": "1970-12-12",
  "interests": "xi huang hejiu,duanlian,lvyou"
}

PUT /lib/user/2
{
  "name": "zhaoming",
  "address": "bei jing hai dian qu qing he zhen",
  "age": 20,
  "birthday": "1998-10-12",
  "interests": "xi huan hejiu,duanlian,changge"
}

PUT /lib/user/3
{
  "name": "lisi",
  "address": "bei jing hai dian qu qing he zhen",
  "age": 23,
  "birthday": "1998-10-12",
  "interests": "xi huan hejiu,duanlian,changge"
}

PUT /lib/user/4
{
  "name": "wangwu",
  "address": "bei jing hai dian qu qing he zhen",
  "age": 26,
  "birthday": "1998-10-12",
  "interests": "xi huan biancheng,tingyinyue,lvyou"
}

PUT /lib/user/5
{
  "name": "zhangsan",
  "address": "bei jing chao yang qu",
  "age": 29,
  "birthday": "1988-10-12",
  "interests": "xi huan tingyinyue,changge,tiaowu"
}
Run the following request: it fetches 3 documents per batch, with a time window of 1 minute.
GET /lib/user/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "sort": ["_doc"],
  "size": 3
}
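The query returns the documents with ids 1, 2, and 4 (they appear in the order 2, 1, 4 in the response), together with a _scroll_id.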
{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAAfFkFKM3g2eWM4VGZLajZfeng2VlJtMGcAAAAAAAAAIBZBSjN4NnljOFRmS2o2X3p4NlZSbTBnAAAAAAAAACEWQUozeDZ5YzhUZktqNl96eDZWUm0wZw==",
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": null,
    "hits": [
      {
        "_index": "lib",
        "_type": "user",
        "_id": "2",
        "_score": null,
        "_source": {
          "name": "zhaoming",
          "address": "bei jing hai dian qu qing he zhen",
          "age": 20,
          "birthday": "1998-10-12",
          "interests": "xi huan hejiu,duanlian,changge"
        },
        "sort": [0]
      },
      {
        "_index": "lib",
        "_type": "user",
        "_id": "1",
        "_score": null,
        "_source": {
          "name": "zhaoliu",
          "address": "hei long jiang sheng tie ling shi",
          "age": 50,
          "birthday": "1970-12-12",
          "interests": "xi huang hejiu,duanlian,lvyou"
        },
        "sort": [0]
      },
      {
        "_index": "lib",
        "_type": "user",
        "_id": "4",
        "_score": null,
        "_source": {
          "name": "wangwu",
          "address": "bei jing hai dian qu qing he zhen",
          "age": 26,
          "birthday": "1998-10-12",
          "interests": "xi huan biancheng,tingyinyue,lvyou"
        },
        "sort": [1]
      }
    ]
  }
}
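For a client reading this response, only a few fields matter: `_scroll_id` (needed for the next request), `hits.total`, and the `hits.hits` array. A minimal Python sketch of pulling them out, assuming the response shape shown above; `summarize_page` is a hypothetical helper name, and the dict is an abbreviated copy of the response with a placeholder scroll id.

```python
def summarize_page(response):
    """Return (scroll_id, ids, total) from one scroll response dict."""
    hits = response["hits"]
    ids = [hit["_id"] for hit in hits["hits"]]
    return response["_scroll_id"], ids, hits["total"]


# Abbreviated copy of the response above; "<scroll id>" is a placeholder.
page = {
    "_scroll_id": "<scroll id>",
    "hits": {
        "total": 5,
        "hits": [{"_id": "2"}, {"_id": "1"}, {"_id": "4"}],
    },
}

scroll_id, ids, total = summarize_page(page)
print(ids, "of", total)  # ['2', '1', '4'] of 5
```

Here `total` is the number of documents matching the query (5), not the number in this batch (3), so it can serve as a progress indicator while scrolling.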
Continue the query. The value of scroll_id is the _scroll_id returned by the previous response.
GET /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAAfFkFKM3g2eWM4VGZLajZfeng2VlJtMGcAAAAAAAAAIBZBSjN4NnljOFRmS2o2X3p4NlZSbTBnAAAAAAAAACEWQUozeDZ5YzhUZktqNl96eDZWUm0wZw=="
}
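The result contains the documents with ids 3 and 5. If more documents remained, you would simply repeat the same request, each time passing the _scroll_id from the most recent response.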
{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAAfFkFKM3g2eWM4VGZLajZfeng2VlJtMGcAAAAAAAAAIBZBSjN4NnljOFRmS2o2X3p4NlZSbTBnAAAAAAAAACEWQUozeDZ5YzhUZktqNl96eDZWUm0wZw==",
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": null,
    "hits": [
      {
        "_index": "lib",
        "_type": "user",
        "_id": "3",
        "_score": null,
        "_source": {
          "name": "lisi",
          "address": "bei jing hai dian qu qing he zhen",
          "age": 23,
          "birthday": "1998-10-12",
          "interests": "xi huan hejiu,duanlian,changge"
        },
        "sort": [1]
      },
      {
        "_index": "lib",
        "_type": "user",
        "_id": "5",
        "_score": null,
        "_source": {
          "name": "zhangsan",
          "address": "bei jing chao yang qu",
          "age": 29,
          "birthday": "1988-10-12",
          "interests": "xi huan tingyinyue,changge,tiaowu"
        },
        "sort": [2]
      }
    ]
  }
}
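The two requests above can be wrapped in a client-side loop that keeps asking for the next batch until one comes back empty. The following is a rough Python sketch using only the standard library, not a definitive client. The names `http_send`, `scroll_all`, and `make_fake_send` are hypothetical, and `http://localhost:9200` is an assumed cluster address. It sends POST instead of GET because that is the simpler way to attach a JSON body with urllib; Elasticsearch accepts both methods on these two endpoints. The demo at the bottom runs offline against a fake that replays the two batches shown in this section.

```python
import json
import urllib.request


def http_send(method, path, body, base_url="http://localhost:9200"):
    """Send one JSON request; base_url is an assumed local cluster address."""
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_all(send, search_path, size=3, keep_alive="1m"):
    """Yield every hit, batch by batch, until a batch comes back empty."""
    page = send(
        "POST",
        f"{search_path}?scroll={keep_alive}",
        {"query": {"match_all": {}}, "sort": ["_doc"], "size": size},
    )
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Always pass the _scroll_id from the most recent response.
        page = send(
            "POST",
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )


def make_fake_send(batches):
    """Offline stand-in for a cluster: replays the given batches of ids."""
    remaining = list(batches)

    def send(method, path, body):
        ids = remaining.pop(0) if remaining else []
        return {
            "_scroll_id": "fake-scroll-id",
            "hits": {"hits": [{"_id": doc_id} for doc_id in ids]},
        }

    return send


# Replay the two batches from the demonstration without touching a cluster.
send = make_fake_send([["2", "1", "4"], ["3", "5"]])
ids = [hit["_id"] for hit in scroll_all(send, "/lib/user/_search")]
print(ids)  # ['2', '1', '4', '3', '5']
```

Against a real cluster, passing `http_send` in place of the fake (`scroll_all(http_send, "/lib/user/_search")`) would issue the same requests as the console examples.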