Elasticsearch Shards, Replicas, and Routing (shard replica routing)

This article explains how to understand shards, replicas, and the routing strategy in Elasticsearch.

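1. Background

1) Shards (shard)

An Elasticsearch cluster lets a system store more data than a single machine can hold. It does this through sharding: within an index, the data (documents) is split across multiple shards. Elasticsearch hides the complexity of managing the shards, so that together they appear as one large index.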

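2) Replicas (replica)

To handle load that is too heavy for a single machine to serve every request, Elasticsearch clusters introduce replicas. The replica strategy creates redundant copies of each shard in an index, and when serving queries a replica can be treated just like a primary shard. Replicas also provide high availability and data safety: when the machine hosting a shard goes down, Elasticsearch can recover from that shard's replica and avoid losing data.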

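3) Routing (routing)

When a document is stored in Elasticsearch, it is assigned to one of the shards based on its identifier _id; the balancing algorithm only needs to spread documents evenly. When data is read, Elasticsearch queries all shards and merges the results, so the client does not need to know which shard holds a given document. The drawback is that querying every shard and merging the results costs performance, and in a bad case the query may fail on some shards (failed), which makes the result incomplete. Routing was introduced to avoid this: at index time a routing key sends the document to a specific shard, and at query time the same routing key tells Elasticsearch which shard to read from.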

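By default, the shard for a document is chosen as follows:
shard_num = hash(_routing) % num_primary_shards
The routing value defaults to the document's _id field (or its _parent field, for child documents). The value is hashed and then taken modulo the number of primary shards, which gives the shard the document is placed on. In other words, sharding is hash-based by default, which keeps the amount of data on each shard roughly equal and avoids load imbalance. At search time, Elasticsearch by default searches all shards, and the node that received the request (the coordinating node, which is not necessarily the master) gathers and processes the partial results before returning the final result.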
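
The formula above can be sketched in a few lines of Python. This is only an illustration: `zlib.crc32` is a stand-in chosen for this example, Elasticsearch uses its own internal hash function, and the function name `shard_for` is hypothetical. The shard numbers printed here will therefore not match a real cluster.

```python
import zlib
from collections import Counter

def shard_for(routing: str, num_primary_shards: int) -> int:
    """Return the shard number for a routing value.

    zlib.crc32 is a stand-in hash for this sketch; a real cluster
    uses its own hash function and will give different numbers.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards

# Default case: the routing value is the document _id, so documents
# spread across all shards.
spread = Counter(shard_for(str(doc_id), 3) for doc_id in range(3000))
print(dict(spread))

# Custom routing: every document indexed with routing "A" maps to one shard.
print({shard_for("A", 3) for _ in range(3)})
```

The modulo is also why the number of primary shards cannot change after the index is created: a different divisor would send existing routing values to different shards.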

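Suppose you have an index with 100 shards. What happens when a search request is executed on the cluster?

1. The search request is sent to one node.

2. The node that receives the request broadcasts the query to every shard of the index (for each shard, either the primary or one of its replicas).

3. Each shard executes the search query and returns its results.

4. The results are merged and sorted on the coordinating node and returned to the user.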
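
The four steps above can be sketched as a small in-memory simulation. The data, the scoring, and the function names `search_shard` and `coordinate_search` are all hypothetical; the real implementation is distributed and considerably more involved.

```python
import heapq

def search_shard(shard_docs, predicate, size):
    """Step 3: one shard runs the query locally and returns its own top hits."""
    hits = [d for d in shard_docs if predicate(d)]
    return heapq.nlargest(size, hits, key=lambda d: d["score"])

def coordinate_search(shards, predicate, size):
    """Steps 2 and 4: fan the query out to every shard, then merge and sort."""
    per_shard = [search_shard(docs, predicate, size) for docs in shards]
    merged = [hit for hits in per_shard for hit in hits]
    return heapq.nlargest(size, merged, key=lambda d: d["score"])

# Hypothetical data: 3 shards, each holding a few scored documents.
shards = [
    [{"id": 1, "score": 0.9}, {"id": 4, "score": 0.2}],
    [{"id": 2, "score": 0.7}],
    [{"id": 3, "score": 0.8}, {"id": 5, "score": 0.1}],
]
top = coordinate_search(shards, lambda d: d["score"] > 0.15, size=3)
print([d["id"] for d in top])  # [1, 3, 2]
```

Every shard does work for every query, which is the cost that routing is meant to reduce.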

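2. Number of shards (shard) and replicas (replica)

When creating an index in Elasticsearch, it is best to specify the number of shards and replicas explicitly; otherwise the server's default settings are used, which are shards=5 and replicas=1 in the version used in this article (2.1.1).
index.number_of_shards: 5
index.number_of_replicas: 1
For a given index, number_of_shards can only be set once, when the index is created, whereas number_of_replicas can be increased or decreased at any time with the index update settings API.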
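
As a sketch of what changing the replica count might look like from Python's standard library, the snippet below builds (but does not send) a request to the index update settings API. The host `http://localhost:9200` and the helper name `replica_update_request` are assumptions made for this example.

```python
import json
import urllib.request

def replica_update_request(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build a PUT /<index>/_settings request that changes number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Assumed host and index name, for illustration only.
req = replica_update_request("http://localhost:9200", "documents", 2)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
# Sending it would be urllib.request.urlopen(req), which needs a running cluster.
```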

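So how do you decide the number of shards and replicas?

As a rule of thumb, the ideal number of shards depends on the number of nodes. Suppose an index is configured with 10 shards and 1 replica; the total number of shard copies is then 10 * (1 + 1) = 20, so the largest number of Elasticsearch nodes that can each hold one copy is 20.
max nodes = shards * (replicas + 1)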
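
The same arithmetic as a tiny Python helper (the function name `max_useful_nodes` is made up for this sketch):

```python
def max_useful_nodes(number_of_shards: int, number_of_replicas: int) -> int:
    """Largest node count at which every node can still hold one shard copy."""
    return number_of_shards * (number_of_replicas + 1)

print(max_useful_nodes(10, 1))  # 20, the example from the text
print(max_useful_nodes(3, 1))   # 6
```
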
3. Routing in practice

1) Installing the Paramedic plugin

Elasticsearch supports many plugins; Paramedic gives a visual view of how Elasticsearch distributes data across shards and replicas.
[bigdata-dw@bigdata-arch-client10 es2.1.1]$ ./bin/plugin install karmi/elasticsearch-paramedic
-> Installing karmi/elasticsearch-paramedic...
Trying https://github.com/karmi/elasticsearch-paramedic/archive/master.zip ...
Downloading ....................DONE
Verifying https://github.com/karmi/elasticsearch-paramedic/archive/master.zip checksums if available ...
NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify)
Installed paramedic into /home/bigdata-dw/es2.1.1/plugins/paramedic
2) Creating the documents index

Create the documents index with 3 shards and 1 replica.
[bigdata-dw@bigdata-arch-client10 es2.1.1]$ curl -XPUT http://10.93.21.21:8049/documents -d '{
> "settings": {
> "number_of_replicas": 1,
> "number_of_shards": 3
> }
> }'
{"acknowledged":true}
3) Using routing while indexing

We create 3 documents, all with the routing key A.

id=1
curl -XPUT 'http://10.93.21.21:8049/documents/doc/1?routing=A' -d '{"title": "Document"}'
{"_index":"documents","_type":"doc","_id":"1","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}
id=2
curl -XPUT 'http://10.93.21.21:8049/documents/doc/2?routing=A' -d '{"title": "Document"}'
{"_index":"documents","_type":"doc","_id":"2","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}
id=3
curl -XPUT 'http://10.93.21.21:8049/documents/doc/3?routing=A' -d '{ "title": "Document"}'
{"_index":"documents","_type":"doc","_id":"3","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}
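
When these URLs are built in a program rather than typed by hand, the routing key should be URL-encoded. A minimal sketch using Python's standard library follows; the helper name `document_url` is hypothetical, and the address is the one used in this article.

```python
from urllib.parse import quote, urlencode

def document_url(base_url, index, doc_type, doc_id, routing=None):
    """Build /<index>/<type>/<id>, adding ?routing=... when a key is given."""
    url = f"{base_url}/{quote(index)}/{quote(doc_type)}/{quote(str(doc_id))}"
    if routing is not None:
        # urlencode escapes characters that are not safe in a query string.
        url += "?" + urlencode({"routing": routing})
    return url

for doc_id in (1, 2, 3):
    print(document_url("http://10.93.21.21:8049", "documents", "doc", doc_id, routing="A"))
```
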
A search shows that the documents carry a _routing key.
curl -XGET 'http://10.93.21.21:8049/documents/_search?pretty'
{
  "took" : 51,
  "timed_out" : false,
  "_shards" : { "total" : 3, "successful" : 3, "failed" : 0 },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      { "_index" : "documents", "_type" : "doc", "_id" : "1", "_score" : 1.0, "_routing" : "A", "_source":{ "title": "Document"} },
      { "_index" : "documents", "_type" : "doc", "_id" : "2", "_score" : 1.0, "_routing" : "A", "_source":{ "title": "Document"} },
      { "_index" : "documents", "_type" : "doc", "_id" : "3", "_score" : 1.0, "_routing" : "A", "_source":{ "title": "Document"} }
    ]
  }
}
The shard allocation can also be inspected in Paramedic.

4) Using routing in queries

Searching with routing key "A" gives _shards.total=1, which shows that only one shard was queried. That shard is the one computed from routing key "A", and it returns the data we stored with routing key "A".
curl -XGET 'http://10.93.21.21:8049/documents/_search?pretty&q=*:*&routing=A'
{
  "took" : 17,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "failed" : 0 },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      { "_index" : "documents", "_type" : "doc", "_id" : "1", "_score" : 1.0, "_routing" : "A", "_source":{ "title": "Document"} },
      { "_index" : "documents", "_type" : "doc", "_id" : "2", "_score" : 1.0, "_routing" : "A", "_source":{ "title": "Document"} },
      { "_index" : "documents", "_type" : "doc", "_id" : "3", "_score" : 1.0, "_routing" : "A", "_source":{ "title": "Document"} }
    ]
  }
}
Searching with routing key "B" also gives _shards.total=1: only the shard selected by routing key "B" is queried, and the data we stored with routing key "A" is not found in it.
curl -XGET 'http://10.93.21.21:8049/documents/_search?pretty&q=*:*&routing=B'
{
  "took" : 14,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "failed" : 0 },
  "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] }
}
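
The behaviour in these two searches can be mimicked with a small in-memory model. This is a sketch under stated assumptions: `zlib.crc32` stands in for Elasticsearch's real hash, and `pick_shard`, `index_doc`, and `search_all` are made-up names, so which shard "A" or "B" maps to here says nothing about a real cluster.

```python
import zlib

NUM_SHARDS = 3

def pick_shard(routing: str) -> int:
    # Stand-in hash for this sketch; not the function Elasticsearch uses.
    return zlib.crc32(routing.encode("utf-8")) % NUM_SHARDS

shards = [[] for _ in range(NUM_SHARDS)]

def index_doc(doc_id: str, source: dict, routing: str) -> None:
    """Store the document on the single shard chosen by its routing key."""
    doc = {"_id": doc_id, "_routing": routing, "_source": source}
    shards[pick_shard(routing)].append(doc)

def search_all(routing=None) -> dict:
    """Match-all search; with a routing key only one shard is visited."""
    targets = range(NUM_SHARDS) if routing is None else [pick_shard(routing)]
    hits = [doc for s in targets for doc in shards[s]]
    return {"_shards": {"total": len(targets)},
            "hits": {"total": len(hits), "hits": hits}}

for doc_id in ("1", "2", "3"):
    index_doc(doc_id, {"title": "Document"}, routing="A")

for key in (None, "A", "B"):
    result = search_all(key)
    print(key, result["_shards"]["total"], result["hits"]["total"])
```

Note that routing only narrows which shards are visited; it does not filter documents by their routing key. If "B" happened to hash to the same shard as "A", the routed match-all search for "B" would return the three "A" documents, whereas in the cluster shown above the two keys evidently landed on different shards.
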
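To summarize the advantages of routing:

1) Only one shard is queried, which avoids useless queries on the other shards and the merge step on the coordinating node, and so improves query efficiency.

2) In a large cluster with many nodes and shards, a query across many shards is more likely to fail on some of them, and after the merge on the coordinating node the completeness of the data cannot be established with confidence. Routing effectively avoids this. For example, if a search on an index with total=64 shards reports successful=60 and failed=4, we cannot guarantee that the merged data is complete.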
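
A client can detect this situation by inspecting the _shards section of the response. A minimal sketch, assuming the response has already been parsed into a Python dict (the helper name `is_complete` is hypothetical):

```python
def is_complete(response: dict) -> bool:
    """True only if every targeted shard answered successfully."""
    shards = response["_shards"]
    return shards["failed"] == 0 and shards["successful"] == shards["total"]

# The 64-shard example from the text, and a routed single-shard search.
partial = {"_shards": {"total": 64, "successful": 60, "failed": 4}}
routed = {"_shards": {"total": 1, "successful": 1, "failed": 0}}
print(is_complete(partial), is_complete(routed))  # False True
```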

Original article: https://www.cnblogs.com/kangoroo/p/7622957.html