  • Elasticsearch snapshot backup and restore

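    1. Installing repository-hdfs

     1) Download the repository-hdfs package from the Elasticsearch website.

    The plugin version must match the Elasticsearch version (elasticsearch-5.4.0 requires repository-hdfs-5.4.0).

    Download page:

    https://www.elastic.co/guide/en/elasticsearch/plugins/5.4/repository-hdfs.html

    2) Copy the zip file to each node of the cluster, change into the Elasticsearch directory, and run the installer:

    sudo bin/elasticsearch-plugin install file:///home/huangyan/repository-hdfs-5.4.0.zip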
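
    After installation, each node should list the plugin, and a node has to be restarted before it loads a newly installed plugin. The following is a minimal sketch of a check, assuming that bin/elasticsearch-plugin list prints one plugin name per line; has_plugin is a hypothetical helper name, not part of Elasticsearch.

```shell
#!/bin/sh
# has_plugin NAME: succeed if NAME appears as a whole line on stdin.
# -x matches the entire line, -F treats NAME as a fixed string.
has_plugin() {
    grep -qxF -- "$1"
}

# Typical use on each node, from the Elasticsearch home directory:
#   bin/elasticsearch-plugin list | has_plugin repository-hdfs && echo "installed"
```

    The helper only reads standard input, so it can be tried against saved output before being run on a node.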

    2. Creating the repository on the source cluster

    Register the repository:

    curl -XPUT 'http://host:9200/_snapshot/my_hdfs_repository?pretty' -d '{
        "type": "hdfs",
        "settings": {
            "uri": "hdfs://host:8020",
            "path": "elasticsearch/repositories/my_hdfs_repository",
            "conf.dfs.client.read.shortcircuit": "false"        
        }
    }'

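    If conf.dfs.client.read.shortcircuit is set to true, HDFS needs additional configuration. Short-circuit reads reduce the number of network round trips and speed things up, but unless you want to do that extra setup, false is the simpler choice.

    View the repository you just created: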

    curl -XGET 'http://10.45.*:9200/_snapshot/my_hdfs_repository?pretty'

     

    Delete the repository:
    curl -XDELETE 'http://10.45.*:9200/_snapshot/my_hdfs_repository?pretty'

     

    3. Backing up an index

    Back up the index history_data_index-00002:

    curl -XPUT 'http://10.45.157.*:9200/_snapshot/my_hdfs_repository/snapshot_2?wait_for_completion=false&pretty' -d '{
      "indices": "history_data_index-00002",
      "ignore_unavailable": true,
      "include_global_state": false
    }'

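    Parameters:

    wait_for_completion=true blocks until the backup has finished.

    wait_for_completion=false returns immediately and the backup runs in the background. Check its progress with:

    curl -XGET 'http://10.45.*:9200/_snapshot/my_hdfs_repository/snapshot_2/_status?pretty'

    "ignore_unavailable": true skips indices that do not exist instead of failing the request.

    "include_global_state": false leaves the cluster's global state out of the snapshot.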

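    Note:

    If the command above fails with a "could not read repository data from index blob" exception, the cause is a Java security permission problem.

    Change the configuration as follows:

    1) Edit plugin-security.policy and add the following permissions inside its grant { ... } block: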

      permission javax.security.auth.AuthPermission "getSubject";
      permission javax.security.auth.AuthPermission "doAs";
      permission javax.security.auth.AuthPermission "modifyPrivateCredentials";
      permission java.lang.RuntimePermission "accessDeclaredMembers";
      permission java.lang.RuntimePermission "getClassLoader";
      permission java.lang.RuntimePermission "shutdownHooks";
      permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
      permission java.security.AllPermission;
      permission java.util.PropertyPermission "*", "read,write";
      permission javax.security.auth.PrivateCredentialPermission "org.apache.hadoop.security.Credentials * \"*\"", "read";

    Note that java.security.AllPermission grants the plugin every permission, which makes the other entries redundant and removes the sandboxing the policy file normally provides.

     

    2) Also edit /usr/elk/elasticsearch/config/jvm.options by hand and add this line:

    -Djava.security.policy=/usr/elk/elasticsearch/plugins/repository-hdfs/plugin-security.policy

     

    3) Restart ES and run the index backup again; it should now succeed.

    View a single snapshot:
    curl -XGET 'http://10.45.*:9200/_snapshot/my_hdfs_repository/snapshot_3?pretty'
    View all snapshots:
    curl -XGET 'http://10.45.*:9200/_snapshot/my_hdfs_repository/_all?pretty'
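
    The snapshot-info response contains a "state" field (for example IN_PROGRESS or SUCCESS), which makes it possible to wait for a background backup started with wait_for_completion=false. The following is a rough sketch, not an official tool: snapshot_state and wait_for_snapshot are hypothetical helper names, the 10-second interval is arbitrary, and the JSON is parsed with grep and sed only to avoid extra dependencies.

```shell
#!/bin/sh
# snapshot_state: read a snapshot-info JSON response on stdin and print
# the value of the first "state" field (e.g. IN_PROGRESS, SUCCESS).
snapshot_state() {
    grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# wait_for_snapshot BASE_URL REPO SNAPSHOT: poll every 10 seconds until the
# snapshot is no longer IN_PROGRESS, then print the last state seen.
# An empty result means the response could not be read or parsed.
wait_for_snapshot() {
    while :; do
        state=$(curl -s "$1/_snapshot/$2/$3" | snapshot_state)
        [ "$state" != "IN_PROGRESS" ] && break
        sleep 10
    done
    echo "$state"
}

# Example call (host is a placeholder):
#   wait_for_snapshot http://host:9200 my_hdfs_repository snapshot_2
```

    Anything other than SUCCESS in the output is worth checking in the full snapshot-info response.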

    Delete a snapshot:
    curl -XDELETE 'http://10.45.*:9200/_snapshot/my_hdfs_repository/snapshot_1_restore?pretty'

    4. Restoring a snapshot

    curl -XPOST 'http://10.45.*:9200/_snapshot/my_hdfs_repository/snapshot_2/_restore?pretty' -d '{
      "indices": "history_data_index-00002",
      "index_settings": {
        "index.number_of_replicas": 1
      },
      "ignore_index_settings": [
        "index.refresh_interval"
      ]
    }'

    The number of shards cannot be changed when restoring a snapshot (changing it requires a reindex), but the number of replicas can be set again through index.number_of_replicas.

    If the cluster already has an index with the same name as the one being restored, rename it during the restore with the "rename_pattern" and "rename_replacement" parameters. The following command restores person_list_data_index_yinchuan under the name restored_index_yinchuan:

    curl -XPOST 'http://10.45.*:9200/_snapshot/my_hdfs_repository/snapshot_3/_restore?pretty' -d '{
      "indices": "person_list_data_index_yinchuan",
      "ignore_unavailable": true,
      "include_global_state": false,
      "rename_pattern": "person_list_data_index_(.+)",
      "rename_replacement": "restored_index_$1"
    }'
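
    Before running a restore, it can help to preview what name a pattern and replacement pair produces. The sketch below approximates the rename locally with sed; preview_rename is a hypothetical helper. Elasticsearch evaluates the pattern as a Java regular expression, so this only matches its behavior for simple patterns like the one above, and the pattern must not contain a slash.

```shell
#!/bin/sh
# preview_rename INDEX PATTERN REPLACEMENT: print the name INDEX would get.
# Elasticsearch writes capture groups as $1; sed writes them as \1,
# so the replacement is converted before it is handed to sed.
preview_rename() {
    repl=$(printf '%s' "$3" | sed 's/\$\([0-9]\)/\\\1/g')
    printf '%s\n' "$1" | sed -E "s/$2/$repl/"
}

preview_rename person_list_data_index_yinchuan \
    'person_list_data_index_(.+)' 'restored_index_$1'
```

    An index name that does not match the pattern is printed unchanged, which mirrors the fact that non-matching indices keep their names on restore.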

    Check recovery status:
    curl -XGET 'http://10.45.*:9200/_recovery/'

    To restore onto a different cluster, first register the repository on the target cluster, pointing it at the HDFS location that holds the source cluster's snapshots:

    curl -XPUT 'http://target-host:9200/_snapshot/my_backup?pretty' -d '{
        "type": "hdfs",
        "settings": {
            "uri": "hdfs://source-host:8020",
            "path": "/user/master/elasticsearch/repositories/my_hdfs_repository",
            "conf.dfs.client.read.shortcircuit": "false"
        }
    }'

    Then restore, using the repository name registered on the target cluster (my_backup):

    curl -XPOST 'http://target-host:9200/_snapshot/my_backup/snapshot_2/_restore?pretty' -d '{
      "indices": "history_data_index-00002",
      "index_settings": {
        "index.number_of_replicas": 1
      },
      "ignore_index_settings": [
        "index.refresh_interval"
      ]
    }'

    If the snapshot was created by index alias, restore everything in it directly, without an indices filter:
    curl -XPOST 'http://10.45.*:9200/_snapshot/my_hdfs_repository/snapshot_4/_restore?pretty'

    5. Additional notes

    Replacing jars:
    Some jars under /usr/elk/elasticsearch/plugins/repository-hdfs must be swapped for the versions that match the HDFS installation. In this setup the plugin shipped 2.7.1 jars, which had to be replaced with 2.6.0.
    The 2.6.0 versions are available under /usr/cdh/phoenix/lib. The jars to replace are hadoop-annotations-2.7.1.jar, hadoop-auth-2.7.1.jar, hadoop-client-2.7.1.jar,
    hadoop-common-2.7.1.jar and hadoop-hdfs-2.7.1.jar.
    htrace-core-3.1.0-incubating.jar also has to be replaced with htrace-core4-4.0.1-incubating.jar before ES will restart successfully.

    List all available jars:
    cd /opt/cloudera/parcels/CDH/jars/
    ls
    Copy htrace-core4-4.0.1-incubating.jar into /usr/elk/elasticsearch/plugins/repository-hdfs/:
    cp htrace-core4-4.0.1-incubating.jar /usr/elk/elasticsearch/plugins/repository-hdfs/

    Inspecting paths in HDFS:
    List the subdirectories of the root directory: sudo -u hdfs hadoop fs -ls /
    List the subdirectories of /user: sudo -u hdfs hadoop fs -ls /user
    If the repository is created with the relative path "path": "elasticsearch/repositories/my_hdfs_repository",
    the data is stored under /user/elasticsearch/elasticsearch/repositories/my_hdfs_repository
    List the snapshots in the repository: sudo -u hdfs hadoop fs -ls /user/elasticsearch/elasticsearch/repositories/my_hdfs_repository
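
    The path above follows from how HDFS treats relative paths: they are resolved against the home directory of the user running the client, here the elasticsearch user. The sketch below reproduces that rule, assuming the default home directory prefix /user; resolve_repo_path is a hypothetical helper for illustration only.

```shell
#!/bin/sh
# resolve_repo_path HDFS_USER REPO_PATH: print the absolute HDFS path that
# a repository "path" setting refers to. Absolute paths are kept as they
# are; relative paths are placed under /user/HDFS_USER.
resolve_repo_path() {
    case "$2" in
        /*) printf '%s\n' "$2" ;;
        *)  printf '/user/%s/%s\n' "$1" "$2" ;;
    esac
}

resolve_repo_path elasticsearch elasticsearch/repositories/my_hdfs_repository
```

    Using an absolute path in the repository settings avoids the dependency on which user the Elasticsearch process runs as.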

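     6. Tests

    1. Backing up 532,391 documents, 1.52 GB (3.03 GB), took 208541 ms, about three and a half minutes.

        Restoring the 532,391 documents took about 6.5 s.

    2. Backing up 1,578,227 documents, 9.09 GB (18.1 GB), took 1510737 ms, about 25 minutes.

         Restoring the 1,578,227 documents took about 105 s.

    Overall, snapshot backup is not very fast, so reindex may be the better way to migrate an index. Be aware that cross-cluster reindex does not work out of the box on ES 5.4.0: reindexing from a remote cluster only works once the remote host is listed in reindex.remote.whitelist in elasticsearch.yml on the destination cluster.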
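
    To put the timings above in perspective, they can be converted to an average throughput. This is simple arithmetic on the reported figures, assuming the first size in each test is the amount of data written and that 1 GB means 1024 MB; throughput_mb_s is a hypothetical helper.

```shell
#!/bin/sh
# throughput_mb_s SIZE_GB ELAPSED_MS: print the average rate in MB/s,
# rounded to one decimal place, assuming 1 GB = 1024 MB.
throughput_mb_s() {
    awk -v gb="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", gb * 1024 / (ms / 1000) }'
}

throughput_mb_s 1.52 208541    # backup in test 1
throughput_mb_s 9.09 1510737   # backup in test 2
```

    Both backups come out at roughly 6 to 7.5 MB/s, which supports the observation that writing snapshots to HDFS was slow in this environment.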

     

  • Original article: https://www.cnblogs.com/zling/p/10404999.html