  • Solr 4.8.0 Source Code Analysis (21): SolrCloud Recovery Strategy (Part 2)

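    Preface: The previous article, "Solr 4.8.0 Source Code Analysis (20): SolrCloud Recovery Strategy (Part 1)", noted that Recovery has two strategies: PeerSync and Replication. This part walks through the PeerSync strategy in detail.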

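         PeerSync is Solr's preferred strategy. Whenever recovery is needed, Solr first checks whether PeerSync can be used; it falls back to Replication only when PeerSync is configured to be skipped or when PeerSync finds that its preconditions are not met. This follows from what PeerSync is designed for: short outages in which only a small number of documents need to be recovered. It therefore recovers quickly and has little impact on Solr. Replication, by contrast, handles long outages with many documents to recover, and it takes longer.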

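         The previous article covered the overall Recovery flow, so this article goes straight to the PeerSync flow, shown in the figure below and walked through step by step: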

    • First, Solr sends a getVersions request to every replica to fetch that replica's most recent nUpdates versions (100 by default).
      // Fire off the requests before getting our own recent updates (for better concurrency)
      // This also allows us to avoid getting updates we don't need... if we got our updates and then got their updates, they would
      // have newer stuff that we also had (assuming updates are going on and are being forwarded).
      for (String replica : replicas) {
        requestVersions(replica);
      }

      private void requestVersions(String replica) {
        SyncShardRequest sreq = new SyncShardRequest();
        sreq.purpose = 1;
        sreq.shards = new String[]{replica};
        sreq.actualShards = sreq.shards;
        sreq.params = new ModifiableSolrParams();
        sreq.params.set("qt","/get");
        sreq.params.set("distrib",false);
        sreq.params.set("getVersions",nUpdates);
        shardHandler.submit(sreq, replica, sreq.params);
      }
    • Next, it fetches this replica's own most recent nUpdates versions (100 by default) and sorts them with absComparator.
      recentUpdates = ulog.getRecentUpdates();
      try {
        ourUpdates = recentUpdates.getVersions(nUpdates);
      } finally {
        recentUpdates.close();
      }

      Collections.sort(ourUpdates, absComparator);
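To make the sort order concrete, here is a minimal sketch rather than the Solr source. It assumes the comparator orders versions by descending absolute value, which is what the later `ourUpdates.get(ourUpdates.size()-1)` lookup for the smallest update implies. The names `AbsSortSketch`, `ABS_DESC`, and `sortedByAbsDesc` are hypothetical. Absolute values are compared because Solr records deletes with negated version numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class AbsSortSketch {
    // Hypothetical stand-in for absComparator: largest absolute value first.
    static final Comparator<Long> ABS_DESC =
        (a, b) -> Long.compare(Math.abs(b), Math.abs(a));

    // Returns a sorted copy so the caller's list is left untouched.
    static List<Long> sortedByAbsDesc(List<Long> versions) {
        List<Long> copy = new ArrayList<>(versions);
        copy.sort(ABS_DESC);
        return copy;
    }

    public static void main(String[] args) {
        // -105 stands for a delete; it still sorts as the newest entry.
        System.out.println(sortedByAbsDesc(Arrays.asList(101L, -105L, 103L, 102L)));
        // prints [-105, 103, 102, 101]
    }
}
```

With this ordering, index 0 is always the newest version and the last index is the oldest, regardless of sign.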
    • It then obtains startingVersions, the version information recorded before recovery started. Comparing startingVersions with ourUpdates shows whether any index updates arrived during recovery.
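    • It checks whether ourUpdates and startingVersions overlap. Because both lists are capped at nUpdates versions, this amounts to checking whether more than nUpdates index updates have arrived. If too many updates are pending, that is, ourUpdates and startingVersions do not overlap, PeerSync gives up and recovery falls through to Replication.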
      // now make sure that the starting updates overlap our updates
      // there shouldn't be reorders, so any overlap will do.

      long smallestNewUpdate = Math.abs(ourUpdates.get(ourUpdates.size()-1));

      if (Math.abs(startingVersions.get(0)) < smallestNewUpdate) {
        log.warn(msg() + "too many updates received since start - startingUpdates no longer overlaps with our currentUpdates");
        return false;
      }
    • If ourUpdates and startingVersions do overlap, the two lists are merged, that is, their union is taken.
      // let's merge the lists
      List<Long> newList = new ArrayList<>(ourUpdates);
      for (Long ver : startingVersions) {
        if (Math.abs(ver) < smallestNewUpdate) {
          newList.add(ver);
        }
      }

      ourUpdates = newList;
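The overlap check and the merge can be tried on small lists with the sketch below. It is an illustration under stated assumptions, not the Solr implementation: both lists are taken to be non-empty and already sorted by descending absolute value, and the names `StartingVersionsMergeSketch`, `overlaps`, and `merge` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class StartingVersionsMergeSketch {
    // Assumes both lists are non-empty and sorted by descending absolute value.
    static boolean overlaps(List<Long> ourUpdates, List<Long> startingVersions) {
        long smallestNewUpdate = Math.abs(ourUpdates.get(ourUpdates.size() - 1));
        long largestStarting = Math.abs(startingVersions.get(0));
        // No overlap when even the newest starting version is older than
        // the oldest version in our current window.
        return largestStarting >= smallestNewUpdate;
    }

    // Appends only the starting versions that are older than our window,
    // so versions already present in ourUpdates are not duplicated.
    static List<Long> merge(List<Long> ourUpdates, List<Long> startingVersions) {
        long smallestNewUpdate = Math.abs(ourUpdates.get(ourUpdates.size() - 1));
        List<Long> merged = new ArrayList<>(ourUpdates);
        for (Long ver : startingVersions) {
            if (Math.abs(ver) < smallestNewUpdate) {
                merged.add(ver);
            }
        }
        return merged;
    }
}
```

For example, with ourUpdates = [110, 109, 108, 107] and startingVersions = [108, 107, 106, 105], the windows overlap and the merged list is [110, 109, 108, 107, 106, 105]. With startingVersions = [104, 103], the check fails and PeerSync would give up.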
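    • If this replica's versions are older than another replica's, recovery falls through to Replication. The comparison does not use the maximum or minimum version; it compares the versions found at the 0.2 and 0.8 fractional positions of each sorted list.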
      long otherHigh = percentile(otherVersions, .2f);
      long otherLow = percentile(otherVersions, .8f);

      if (ourHighThreshold < otherLow) {
        // Small overlap between version windows and ours is older
        // This means that we might miss updates if we attempted to use this method.
        // Since there exists just one replica that is so much newer, we must
        // fail the sync.
        log.info(msg() + " Our versions are too old. ourHighThreshold="+ourHighThreshold + " otherLowThreshold="+otherLow);
        return false;
      }
    • If this replica's versions are newer than the other replica's, no recovery is needed and PeerSync exits immediately, reporting success.
      if (ourLowThreshold > otherHigh) {
        // Small overlap between windows and ours is newer.
        // Using this list to sync would result in requesting/replaying results we don't need
        // and possibly bringing deleted docs back to life.
        log.info(msg() + " Our versions are newer. ourLowThreshold="+ourLowThreshold + " otherHigh="+otherHigh);
        return true;
      }
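The two threshold checks above can be sketched together as follows. This is a simplified illustration, not the Solr code. It assumes `percentile` picks the element at index `size * frac` from a list sorted by descending absolute value, and it computes our thresholds from a single list, whereas the excerpts use precomputed `ourLowThreshold` and `ourHighThreshold` fields. The names `VersionWindowSketch`, `Decision`, and `compare` are hypothetical.

```java
import java.util.List;

final class VersionWindowSketch {
    enum Decision { TOO_OLD, ALREADY_NEWER, NEED_DIFF }

    // Assumed behaviour: element at fractional position frac of a non-empty
    // list sorted by descending absolute value (0.2 = near the newest end).
    static long percentile(List<Long> sortedDesc, float frac) {
        int elem = (int) (sortedDesc.size() * frac);
        return Math.abs(sortedDesc.get(elem));
    }

    static Decision compare(List<Long> ours, List<Long> others) {
        long ourHigh = percentile(ours, 0.2f);
        long ourLow = percentile(ours, 0.8f);
        long otherHigh = percentile(others, 0.2f);
        long otherLow = percentile(others, 0.8f);

        if (ourHigh < otherLow) {
            return Decision.TOO_OLD;        // PeerSync fails, Replication takes over
        }
        if (ourLow > otherHigh) {
            return Decision.ALREADY_NEWER;  // nothing to fetch from this replica
        }
        return Decision.NEED_DIFF;          // windows overlap, diff the versions
    }
}
```

Comparing at the 0.2 and 0.8 positions rather than at the extremes means a few stray very new or very old versions do not decide the outcome on their own.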
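    • Otherwise, it takes the difference between the other replica's versions and this replica's versions to find the versions this replica is missing.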
      for (Long otherVersion : otherVersions) {
        // stop when the entries get old enough that reorders may lead us to see updates we don't need
        if (!completeList && Math.abs(otherVersion) < ourLowThreshold) break;

        if (ourUpdateSet.contains(otherVersion) || requestedUpdateSet.contains(otherVersion)) {
          // we either have this update, or already requested it
          // TODO: what if the shard we previously requested this from returns failure (because it goes
          // down)
          continue;
        }

        toRequest.add(otherVersion);
        requestedUpdateSet.add(otherVersion);
      }
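The difference step is easier to follow with concrete data. The sketch below wraps the same loop in a standalone method; the class and method names are hypothetical. It also assumes that `completeList` means the other replica returned its entire list of versions (fewer than nUpdates entries), which is how the flag appears to be used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class MissingVersionsSketch {
    // otherVersions is assumed to be sorted by descending absolute value.
    // requestedUpdateSet is mutated so later replicas skip versions already requested.
    static List<Long> versionsToRequest(List<Long> otherVersions,
                                        Set<Long> ourUpdateSet,
                                        Set<Long> requestedUpdateSet,
                                        long ourLowThreshold,
                                        boolean completeList) {
        List<Long> toRequest = new ArrayList<>();
        for (Long otherVersion : otherVersions) {
            // With a partial list, stop once entries are older than our low threshold.
            if (!completeList && Math.abs(otherVersion) < ourLowThreshold) {
                break;
            }
            // Skip versions we already have or have already asked for.
            if (ourUpdateSet.contains(otherVersion) || requestedUpdateSet.contains(otherVersion)) {
                continue;
            }
            toRequest.add(otherVersion);
            requestedUpdateSet.add(otherVersion);
        }
        return toRequest;
    }
}
```

Given otherVersions = [110, 109, -108, 107, 106, 105, 103], our set {107, 106, 105, 104}, and a low threshold of 105, a partial list yields [110, 109, -108] and stops at 103. With `completeList` set to true, 103 is requested as well.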
    • Finally, it sends a getUpdates request to the other replica to fetch the documents for the missing versions computed above. This completes the PeerSync process.
      private boolean requestUpdates(ShardResponse srsp, List<Long> toRequest) {
        String replica = srsp.getShardRequest().shards[0];

        log.info(msg() + "Requesting updates from " + replica + "n=" + toRequest.size() + " versions=" + toRequest);

        // reuse our original request object
        ShardRequest sreq = srsp.getShardRequest();

        sreq.purpose = 0;
        sreq.params = new ModifiableSolrParams();
        sreq.params.set("qt", "/get");
        sreq.params.set("distrib", false);
        sreq.params.set("getUpdates", StrUtils.join(toRequest, ','));
        sreq.params.set("onlyIfActive", onlyIfActive);
        sreq.responses.clear();  // needs to be zeroed for correct correlation to occur

        shardHandler.submit(sreq, sreq.shards[0], sreq.params);

        return true;
      }
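To show what the getUpdates request carries, the following sketch builds the same four parameters with plain JDK collections instead of `ModifiableSolrParams`. It is only an illustration: the class and method names are hypothetical, and the parameter names and values are taken from the excerpt above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GetUpdatesParamsSketch {
    // Mirrors the parameters set in requestUpdates; a LinkedHashMap keeps insertion order.
    static Map<String, String> getUpdatesParams(List<Long> toRequest, boolean onlyIfActive) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("qt", "/get");
        params.put("distrib", "false");
        // The missing versions travel as one comma-separated value.
        params.put("getUpdates",
            toRequest.stream().map(String::valueOf).collect(Collectors.joining(",")));
        params.put("onlyIfActive", String.valueOf(onlyIfActive));
        return params;
    }
}
```

For toRequest = [110, 109, -108], the `getUpdates` value is `110,109,-108`; the negative entry stands for a delete.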

    Summary:

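          This article walked through the PeerSync process. As shown, recovery under the PeerSync strategy is fairly simple. The next article covers the Replication strategy, which is more complex than PeerSync.
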
  • Original article: https://www.cnblogs.com/rcfeng/p/4147711.html