Elasticsearch Backup and Restore

curl:

http://keenwon.com/1393.html

During snapshot initialization, information about all previous snapshots is loaded into the memory, which means that in large repositories it may take several seconds (or even minutes) for this command to return even if the wait_for_completion parameter is set to false.

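This means that creating a snapshot can use a lot of memory (and, because of the computation involved, a lot of CPU as well). The reason is given in the next paragraph: when creating a snapshot, Elasticsearch has to analyze the index files already stored in the repository.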
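As a rough illustration of the point about wait_for_completion, here is a minimal Python sketch that builds the snapshot-creation request. The node address (localhost:9200) and the generous client timeout are my assumptions; the repository and snapshot names are the ones used in the experiment later in this post. The script only prints the request; create_snapshot would actually send it.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of a local test node


def build_snapshot_request(repo, snapshot, wait_for_completion=False):
    """Build (but do not send) the PUT request that creates a snapshot."""
    flag = "true" if wait_for_completion else "false"
    url = f"{ES_URL}/_snapshot/{repo}/{snapshot}?wait_for_completion={flag}"
    return urllib.request.Request(url, method="PUT")


def create_snapshot(repo, snapshot, wait_for_completion=False, timeout=300):
    """Send the request. The long timeout is an assumption: the call may be
    slow on large repositories even when wait_for_completion is false."""
    req = build_snapshot_request(repo, snapshot, wait_for_completion)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_snapshot_request("testrepo", "snapshot_test5")
print(req.get_method(), req.full_url)
```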

The index snapshot process is incremental. In the process of making the index snapshot Elasticsearch analyses the list of the index files that are already stored in the repository and copies only files that were created or changed since the last snapshot. That allows multiple snapshots to be preserved in the repository in a compact form. Snapshotting process is executed in non-blocking fashion.

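A snapshot is essentially a copy of the indices (plus some cluster information). "Incremental" means that only the index files created or changed since the last snapshot are copied.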
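To make "incremental" concrete, here is a toy model in Python. It is not how Elasticsearch is actually implemented; the file names and checksums are made up, and comparing by checksum is my simplification. It only shows the idea of copying what is new or changed.

```python
def files_to_copy(repository_files, index_files):
    """Return the names of index files that are new or changed.

    Both arguments map file name -> checksum (hypothetical values).
    """
    return sorted(
        name
        for name, checksum in index_files.items()
        if repository_files.get(name) != checksum
    )


# Files already in the repository after the previous snapshot.
repository_files = {"_0.cfs": "a1", "_1.cfs": "b2"}
# Files currently in the index: one unchanged, one changed, one new.
index_files = {"_0.cfs": "a1", "_1.cfs": "c3", "_2.cfs": "d4"}

print(files_to_copy(repository_files, index_files))  # ['_1.cfs', '_2.cfs']
```

The unchanged file is skipped, which is why many snapshots can share one repository without each being a full copy.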

All indexing and searching operation can continue to be executed against the index that is being snapshotted. However, a snapshot represents the point-in-time view of the index at the moment when snapshot was created, so no records that were added to the index after the snapshot process was started will be present in the snapshot.

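Creating a snapshot does not block indexing or search operations. A snapshot is a point-in-time view of the index, so records added after the snapshot process has started will not appear in the snapshot.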

The snapshot process starts immediately for the primary shards that has been started and are not relocating at the moment. Elasticsearch waits for relocation or initialization of shards to complete before snapshotting them.

Besides creating a copy of each index the snapshot process can also store global cluster metadata, which includes persistent cluster settings and templates. The transient settings and registered snapshot repositories are not stored as part of the snapshot.

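Besides copying the indices, the snapshot process can also store the cluster's global metadata, including persistent cluster settings and templates. Transient settings and registered snapshot repositories are not stored as part of the snapshot.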
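Whether the global cluster metadata goes into a snapshot can be controlled from the request body. The sketch below only builds that body, using the include_global_state option of the snapshot API; the helper name snapshot_body is mine, and test5 is the index from the experiment further down.

```python
import json


def snapshot_body(indices, include_global_state=True):
    """Build the JSON body for a snapshot-creation request."""
    return json.dumps(
        {
            "indices": ",".join(indices),
            "include_global_state": include_global_state,
        },
        sort_keys=True,
    )


# Snapshot only test5 and leave persistent settings and templates out.
print(snapshot_body(["test5"], include_global_state=False))
```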

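The part marked in blue in the original post (the claim that registered snapshot repositories are not stored in the snapshot) is something I have doubts about: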
[root@datanode3 elasticsearch]# cd test/
[root@datanode3 test]# ll
总计 174092
-rw-r--r-- 1 root root    183796 03-04 21:48 123.tgz
-rw-r--r-- 1 root root 177894546 03-04 17:15 1.tgz
drwxr-xr-x 3 root root      4096 03-04 21:45 repo
[root@datanode3 test]# cd repo/
[root@datanode3 repo]# ll
总计 16
-rw-r--r-- 1 root root   32 03-04 21:45 index
drwxr-xr-x 3 root root 4096 03-04 21:45 indices
-rw-r--r-- 1 root root  252 03-04 21:45 metadata-snapshot_test5
-rw-r--r-- 1 root root  202 03-04 21:45 snapshot-snapshot_test5
[root@datanode3 repo]# cat metadata-snapshot_test5
{"meta-data":{"version":435,"uuid":"IQXbkDFASIu_BlyerMYJcQ","templates":{},"repositories":{"my_backup":{"type":"fs","settings":{"compress":"true","location":"./mount/backups/my_backup"}},"testrepo":{"type":"fs","settings":{"location":"./test/repo"}}}}}
As shown above, the metadata-snapshot_test5 file does contain snapshot repository information.
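The same check can be done programmatically. This small Python sketch parses the exact content of metadata-snapshot_test5 printed above and lists the repositories it mentions. It assumes the file is plain JSON, as it is on this installation; other Elasticsearch versions may store it differently.

```python
import json

# Content of metadata-snapshot_test5, copied from the listing above.
raw = (
    '{"meta-data":{"version":435,"uuid":"IQXbkDFASIu_BlyerMYJcQ",'
    '"templates":{},"repositories":{"my_backup":{"type":"fs","settings":'
    '{"compress":"true","location":"./mount/backups/my_backup"}},'
    '"testrepo":{"type":"fs","settings":{"location":"./test/repo"}}}}}'
)

metadata = json.loads(raw)["meta-data"]
for name, repo in sorted(metadata["repositories"].items()):
    print(name, repo["type"], repo["settings"]["location"])
```

It prints one line for my_backup and one for testrepo, each with type fs and its location.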

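However, I ran the following experiment:

1. Deploy two Elasticsearch clusters (two single-node machines) with identical configuration. Machine A has an index (test5), a registered repository (testrepo), and a snapshot (snapshot_test5). Machine B has no data at all.

2. Copy all files under A's snapshot repository to the corresponding directory on B, then try to restore directly through the API. The restore fails with a missing-repository error, which suggests that the snapshot really does not store the repository registration (or at least not all of it).

3. On B, register a snapshot repository with the same name and path, copy all files under A's repository path (repo) into the same path (repo) on B, then restore through the API. The restore succeeds, so this approach can be used for data migration.

I don't know whether this procedure is safe and reliable, but at least it worked.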
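Step 3 of the experiment can be sketched in Python as two HTTP requests against cluster B: register the repository, then restore the snapshot. The address of B (localhost:9200) is an assumption; the repository name, path, and snapshot name come from the experiment. The files must already have been copied into ./test/repo on B. The script only prints the requests; pass each one to send to execute it.

```python
import json
import urllib.request

ES_URL_B = "http://localhost:9200"  # assumed address of cluster B


def register_repo_request(name, location):
    """PUT /_snapshot/<name>: register a shared-filesystem repository."""
    body = json.dumps({"type": "fs", "settings": {"location": location}})
    return urllib.request.Request(
        f"{ES_URL_B}/_snapshot/{name}",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def restore_request(repo, snapshot):
    """POST /_snapshot/<repo>/<snapshot>/_restore: restore a snapshot."""
    url = f"{ES_URL_B}/_snapshot/{repo}/{snapshot}/_restore"
    return urllib.request.Request(url, method="POST")


def send(req, timeout=60):
    """Execute a request and decode the JSON response."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


steps = [
    register_repo_request("testrepo", "./test/repo"),
    restore_request("testrepo", "snapshot_test5"),
]
for step in steps:
    print(step.get_method(), step.full_url)
```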

Only one snapshot process can be executed in the cluster at any time. While snapshot of a particular shard is being created this shard cannot be moved to another node, which can interfere with rebalancing process and allocation filtering. Elasticsearch will only be able to move a shard to another node (according to the current allocation filtering settings and rebalancing algorithm) once the snapshot is finished.

Only one snapshot process can run in a cluster at any given time.
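Since only one snapshot can run at a time, it may be worth checking for a running one before starting another. A minimal sketch, assuming the snapshot status endpoint (GET /_snapshot/_status) returns a JSON object with a "snapshots" list of running snapshots; the node address and the sample responses below are hypothetical.

```python
import json
import urllib.request


def snapshot_in_progress(status_response):
    """True if the status response lists at least one running snapshot.

    Assumes the response has the shape {"snapshots": [...]}.
    """
    return len(status_response.get("snapshots", [])) > 0


def fetch_status(es_url="http://localhost:9200", timeout=30):
    """Fetch the status of currently running snapshots (address assumed)."""
    with urllib.request.urlopen(f"{es_url}/_snapshot/_status", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical responses, used here instead of a live cluster.
idle = {"snapshots": []}
busy = {"snapshots": [{"snapshot": "snapshot_test5", "state": "STARTED"}]}
print(snapshot_in_progress(idle), snapshot_in_progress(busy))  # False True
```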

http://www.elasticsearch.org/blog/introducing-snapshot-restore/

However, while replication can protect a cluster from hardware failures, it doesn’t help when someone accidentally deletes an index. Anyone that relies on an Elasticsearch cluster needs to perform regular backups.

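Replicas and backups serve different purposes: replication protects against hardware failures, while backups protect against accidentally deleting an index.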

The snapshot/restore mechanism can be also used to synchronize data between a “hot” cluster and a remote, “cold” backup cluster in a different geographic region for fast disaster recovery.

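The snapshot/restore mechanism can also be used to synchronize data between a "hot" cluster and a remote "cold" backup cluster.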

Java API

http://amsterdam.luminis.eu/2014/12/15/creating-elasticsearch-backups-with-snapshotrestore/
Original article: https://www.cnblogs.com/SamuelSun/p/4309061.html
