  • Solr Initialization Source Code Analysis: Solr Initialization and Startup

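      I have been using Solr in projects for over a year, but only at the usage level: relying on Solr's existing mechanisms, adjusting parameters, then monitoring and tuning. I had never studied Solr at the source-code level. A recent project, however, convinced me that I have to understand Solr's internals and extension mechanisms thoroughly to do it well. Today's topic is how Solr starts up and initializes itself. As you know, a Solr deployment has two parts: (1) Solr's configuration files, and (2) Solr's own programs and plugins, the Lucene jars it depends on, and the logging jars. A study of Solr can follow the same line: loading the configuration files, initializing each core, initializing the request handlers in each core, and so on.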

      To study Solr's startup, begin with the web.xml of the Solr WAR. Here is a fragment of Solr's web.xml:

    <web-app xmlns="http://java.sun.com/xml/ns/javaee"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
             version="2.5"
             metadata-complete="true"
    >
    
    
      <!-- Uncomment if you are trying to use a Resin version before 3.0.19.
        Their XML implementation isn't entirely compatible with Xerces.
        Below are the implementations to use with Sun's JVM.
      <system-property javax.xml.xpath.XPathFactory=
                 "com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl"/>
      <system-property javax.xml.parsers.DocumentBuilderFactory=
                 "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl"/>
      <system-property javax.xml.parsers.SAXParserFactory=
                 "com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl"/>
       -->
    
      <!-- People who want to hardcode their "Solr Home" directly into the
           WAR File can set the JNDI property here...
       -->
        <!--  The Solr home (configuration directory) parameter, used during Solr initialization  -->
        <env-entry>
           <env-entry-name>solr/home</env-entry-name>
           <env-entry-value>R:/solrhome1/solr</env-entry-value>
           <env-entry-type>java.lang.String</env-entry-type>
        </env-entry>
    
       
      
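      <!-- org.apache.solr.servlet.SolrDispatchFilter is the most important piece of Solr startup, so source-code analysis of Solr should begin with this Filter. Its main job: load the Solr configuration files, initialize each core, and initialize each requestHandler and component -->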
      <filter>
        <filter-name>SolrRequestFilter</filter-name>
        <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
        <!-- If you are wiring Solr into a larger web application which controls
             the web context root, you will probably want to mount Solr under
             a path prefix (app.war with /app/solr mounted into it, for example).
             You will need to put this prefix in front of the SolrDispatchFilter
             url-pattern mapping too (/solr/*), and also on any paths for
             legacy Solr servlet mappings you may be using.
             For the Admin UI to work properly in a path-prefixed configuration,
             the admin folder containing the resources needs to be under the app context root
             named to match the path-prefix.  For example:
    
                .war
                   xxx
                     js
                       main.js
        -->
        <!--
        <init-param>
          <param-name>path-prefix</param-name>
          <param-value>/xxx</param-value>
        </init-param>
        -->
      </filter>
    

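      SolrDispatchFilter is a Filter that extends BaseSolrFilter. (You probably know what a Filter does; source analysis of framework-level web products generally starts from a filter or a servlet.) Before covering SolrDispatchFilter, a word about BaseSolrFilter, since programmers like to get to the bottom of things. BaseSolrFilter is an abstract class that implements the Filter interface. Its function is simple: it checks whether the logging jars have been loaded by the application. The code is as follows: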

      

    /**
     * All Solr filters available to the user's webapp should
     * extend this class and not just implement {@link Filter}.
     * This class ensures that the logging configuration is correct
     * before any Solr specific code is executed.
     */
    abstract class BaseSolrFilter implements Filter {
      
      static {
        CheckLoggingConfiguration.check();
      }
      
    }
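
The static-initializer pattern used by BaseSolrFilter can be tried out in isolation. The sketch below is only an illustration: `LoggingCheck`, `BaseFilterSketch`, `DispatchFilterSketch` and `StaticInitDemo` are hypothetical stand-ins, not Solr classes. It shows that the check in the base class runs once, when the class is initialized, no matter how many subclass instances are created afterwards.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for CheckLoggingConfiguration: it only counts calls.
final class LoggingCheck {
  static final AtomicInteger CALLS = new AtomicInteger();

  static void check() {
    CALLS.incrementAndGet();
  }
}

// Same shape as BaseSolrFilter: the static block runs once, when the JVM
// initializes this class, before any subclass constructor executes.
abstract class BaseFilterSketch {
  static {
    LoggingCheck.check();
  }
}

// Hypothetical stand-in for SolrDispatchFilter.
final class DispatchFilterSketch extends BaseFilterSketch {
}

class StaticInitDemo {
  public static void main(String[] args) {
    new DispatchFilterSketch();
    new DispatchFilterSketch();
    // The counter stays at 1 because class initialization happens only once.
    System.out.println("check() calls: " + LoggingCheck.CALLS.get());
  }
}
```

Putting the check in a static block of the base class means every Solr filter gets it for free, without each subclass having to remember to call it.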
    

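      For reasons of space I won't go into CheckLoggingConfiguration.check() here. Back to SolrDispatchFilter. Because BaseSolrFilter is abstract and leaves the Filter methods unimplemented, the concrete class SolrDispatchFilter must implement every method of the Filter interface. The Filter interface looks like this: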

      

    public interface Filter {
    
        // performs initialization
        public void init(FilterConfig filterConfig) throws ServletException;

        // intercepts every request mapped to this filter
        public void doFilter(ServletRequest request, ServletResponse response,
                             FilterChain chain)
                throws IOException, ServletException;

        // performs cleanup when the filter is taken out of service
        public void destroy();
    }
    

      As the comments above indicate, initialization happens in the init method, so any study of how SolrDispatchFilter initializes has to start there. Here is SolrDispatchFilter's init method:

      

      @Override
      public void init(FilterConfig config) throws ServletException
      {
        log.info("SolrDispatchFilter.init()");
    
        try {
          // web.xml configuration
          this.pathPrefix = config.getInitParameter( "path-prefix" );
          // readers, take note: all the real work happens in this method
          this.cores = createCoreContainer();
          log.info("user.dir=" + System.getProperty("user.dir"));
        }
        catch( Throwable t ) {
          // catch this so our filter still works
          log.error( "Could not start Solr. Check solr/home property and the logs");
          SolrCore.log( t );
          if (t instanceof Error) {
            throw (Error) t;
          }
        }
    
        log.info("SolrDispatchFilter.init() done");
      }
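
One consequence of this init method is easy to miss: a failure in createCoreContainer is logged but not rethrown (unless it is an Error), so the filter stays installed with `cores` left as null, and doFilter later answers with a 503. The following sketch models only that behaviour. `InitGuardSketch` and its methods are hypothetical names, and a plain `Object` stands in for CoreContainer.

```java
import java.util.function.Supplier;

// Minimal model of the "survive a failed init, answer 503 later" behaviour.
class InitGuardSketch {
  private Object cores; // stands in for CoreContainer; null means "not initialized"

  void init(Supplier<Object> createCoreContainer) {
    try {
      cores = createCoreContainer.get();
    } catch (Throwable t) {
      // Like SolrDispatchFilter.init(): keep the filter alive for ordinary
      // exceptions, but never swallow an Error.
      if (t instanceof Error) {
        throw (Error) t;
      }
    }
  }

  // Status the filter would send for an incoming request in this model.
  int statusForRequest() {
    return cores == null ? 503 : 200;
  }

  public static void main(String[] args) {
    InitGuardSketch filter = new InitGuardSketch();
    filter.init(() -> { throw new IllegalStateException("bad solr/home"); });
    System.out.println("status after failed init: " + filter.statusForRequest());
  }
}
```

In practice this means a wrong solr/home does not stop the web application from deploying; the problem only shows up in the logs and in the 503 responses.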
    

      Following the trail, let's see what createCoreContainer actually does.

      

      protected CoreContainer createCoreContainer() {
        // SolrResourceLoader is what loads the configuration files found in solr home
        SolrResourceLoader loader = new SolrResourceLoader(SolrResourceLoader.locateSolrHome());
        // load the configuration
        ConfigSolr config = loadConfigSolr(loader);
        CoreContainer cores = new CoreContainer(loader, config);
        // initialize the cores
        cores.load();
        return cores;
      }
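
The first thing createCoreContainer needs is the solr home directory, which is where the solr/home env-entry from web.xml comes in. As far as I understand the Solr 4.x code, locateSolrHome tries the JNDI entry `java:comp/env/solr/home` first, then the `solr.solr.home` system property, and finally falls back to `solr/`. The sketch below models only that order. It is an assumption-laden simplification rather than Solr's code: `SolrHomeSketch` is a hypothetical class, the two lookups are passed in as functions so that it runs without a servlet container, and path normalization is left out.

```java
import java.util.function.Function;

class SolrHomeSketch {
  // Returns the first non-null value among: JNDI entry, system property, default.
  static String locate(Function<String, String> jndiLookup,
                       Function<String, String> systemProperty) {
    String home = jndiLookup.apply("java:comp/env/solr/home");
    if (home == null) {
      home = systemProperty.apply("solr.solr.home");
    }
    if (home == null) {
      home = "solr/"; // relative to the working directory
    }
    return home;
  }

  public static void main(String[] args) {
    // No JNDI available here, so only the system property and the default apply.
    System.out.println(locate(name -> null, System::getProperty));
  }
}
```

With the web.xml fragment shown earlier, the JNDI lookup would return R:/solrhome1/solr, and the other two steps would never be reached.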

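      createCoreContainer is the key to understanding Solr initialization and startup. A brief look at the classes and methods it uses:

      SolrResourceLoader, as its name suggests, is Solr's resource loader.

      ConfigSolr holds the information read from Solr's configuration file through the SolrResourceLoader.

      loadConfigSolr is the method that loads the configuration: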

      private ConfigSolr loadConfigSolr(SolrResourceLoader loader) {
        // The solr.solrxml.location system property is consulted first. When it is set to
        // "zookeeper", the configuration is read from ZooKeeper. When it is not set, it
        // defaults to "solrhome" and solr.xml is read from the solr home directory
        // (remember the first entry in web.xml? that is the one).
        String solrxmlLocation = System.getProperty("solr.solrxml.location", "solrhome");

        if (solrxmlLocation == null || "solrhome".equalsIgnoreCase(solrxmlLocation))
          return ConfigSolr.fromSolrHome(loader, loader.getInstanceDir());

        // read the configuration from ZooKeeper; this is how Solr initializes in a SolrCloud cluster
        if ("zookeeper".equalsIgnoreCase(solrxmlLocation)) {
          String zkHost = System.getProperty("zkHost");
          log.info("Trying to read solr.xml from " + zkHost);
          if (StringUtils.isEmpty(zkHost))
            throw new SolrException(ErrorCode.SERVER_ERROR,
                "Could not load solr.xml from zookeeper: zkHost system property not set");
          SolrZkClient zkClient = new SolrZkClient(zkHost, 30000);
          try {
            // solr.xml carries the ZooKeeper-related configuration
            if (!zkClient.exists("/solr.xml", true))
              throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper: node not found");
            byte[] data = zkClient.getData("/solr.xml", null, null, true);
            // load the configuration
            return ConfigSolr.fromInputStream(loader, new ByteArrayInputStream(data));
          } catch (Exception e) {
            throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper", e);
          } finally {
            zkClient.close(); // close the ZooKeeper connection
          }
        }

        throw new SolrException(ErrorCode.SERVER_ERROR,
            "Bad solr.solrxml.location set: " + solrxmlLocation + " - should be 'solrhome' or 'zookeeper'");
      }
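
The decision logic of loadConfigSolr is easier to see without the ZooKeeper client around it. Here is a minimal sketch of just the branching, assuming nothing beyond what the method above shows. `SolrXmlLocationSketch`, its `Source` enum and the plain Java exceptions are hypothetical substitutes for ConfigSolr and SolrException.

```java
class SolrXmlLocationSketch {
  enum Source { SOLR_HOME, ZOOKEEPER }

  // Mirrors the branches of loadConfigSolr: "solrhome" (or unset), "zookeeper", anything else.
  static Source resolve(String solrxmlLocation, String zkHost) {
    if (solrxmlLocation == null || "solrhome".equalsIgnoreCase(solrxmlLocation)) {
      return Source.SOLR_HOME;
    }
    if ("zookeeper".equalsIgnoreCase(solrxmlLocation)) {
      if (zkHost == null || zkHost.isEmpty()) {
        throw new IllegalStateException("zkHost system property not set");
      }
      return Source.ZOOKEEPER;
    }
    throw new IllegalArgumentException("Bad solr.solrxml.location set: " + solrxmlLocation
        + " - should be 'solrhome' or 'zookeeper'");
  }

  public static void main(String[] args) {
    // Same defaulting as the original: an unset property behaves like "solrhome".
    String location = System.getProperty("solr.solrxml.location", "solrhome");
    System.out.println(resolve(location, System.getProperty("zkHost")));
  }
}
```

To try the ZooKeeper branch of this sketch, start the JVM with -Dsolr.solrxml.location=zookeeper and a -DzkHost value of your own; without zkHost it fails fast, just as the original does.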

      CoreContainer is what performs core initialization. Let's look at its load method. It is fairly long:

      

    public void load()  {

        log.info("Loading cores into CoreContainer [instanceDir={}]", loader.getInstanceDir());
        // load Solr's shared jar library
        // add the sharedLib to the shared resource loader before initializing cfg based plugins
        String libDir = cfg.getSharedLibDirectory();
        if (libDir != null) {
          File f = FileUtils.resolvePath(new File(solrHome), libDir);
          log.info("loading shared library: " + f.getAbsolutePath());
          // if class loaders are unfamiliar territory, this call is worth stepping into
          loader.addToClassLoader(libDir, null, false);
          loader.reloadLuceneSPI();
        }

        // load and initialize the shard-related handlers
        shardHandlerFactory = ShardHandlerFactory.newInstance(cfg.getShardHandlerFactoryPluginInfo(), loader);

        updateShardHandler = new UpdateShardHandler(cfg);

        solrCores.allocateLazyCores(cfg.getTransientCacheSize(), loader);

        logging = LogWatcher.newRegisteredLogWatcher(cfg.getLogWatcherConfig(), loader);

        hostName = cfg.getHost();
        log.info("Host Name: " + hostName);

        zkSys.initZooKeeper(this, solrHome, cfg);

        collectionsHandler = createHandler(cfg.getCollectionsHandlerClass(), CollectionsHandler.class);
        infoHandler = createHandler(cfg.getInfoHandlerClass(), InfoHandler.class);
        coreAdminHandler = createHandler(cfg.getCoreAdminHandlerClass(), CoreAdminHandler.class);

        // in ZooKeeper mode, the Solr cores are initialized from the configuration held in ZooKeeper
        coreConfigService = cfg.createCoreConfigService(loader, zkSys.getZkController());

        containerProperties = cfg.getSolrProperties("solr");

        // setup executor to load cores in parallel
        // do not limit the size of the executor in zk mode since cores may try and wait for each other.
        // cores are initialized on multiple threads; worth a pause if multithreading is unfamiliar
        ExecutorService coreLoadExecutor = Executors.newFixedThreadPool(
            ( zkSys.getZkController() == null ? cfg.getCoreLoadThreadCount() : Integer.MAX_VALUE ),
            new DefaultSolrThreadFactory("coreLoadExecutor") );

        try {
          CompletionService<SolrCore> completionService = new ExecutorCompletionService<>(
              coreLoadExecutor);

          Set<Future<SolrCore>> pending = new HashSet<>();

          List<CoreDescriptor> cds = coresLocator.discover(this);
          checkForDuplicateCoreNames(cds);

          for (final CoreDescriptor cd : cds) {

            final String name = cd.getName();
            try {

              if (cd.isTransient() || ! cd.isLoadOnStartup()) {
                // Store it away for later use. includes non-transient but not
                // loaded at startup cores.
                solrCores.putDynamicDescriptor(name, cd);
              }
              if (cd.isLoadOnStartup()) { // The normal case

                Callable<SolrCore> task = new Callable<SolrCore>() {
                  @Override
                  public SolrCore call() {
                    SolrCore c = null;
                    try {
                      if (zkSys.getZkController() != null) { // ZooKeeper mode
                        preRegisterInZk(cd);
                      }
                      c = create(cd); // regular core creation
                      registerCore(cd.isTransient(), name, c, false, false);
                    } catch (Exception e) {
                      SolrException.log(log, null, e);
                      try {
                  /*    if (isZooKeeperAware()) {
                          try {
                            zkSys.zkController.unregister(name, cd);
                          } catch (InterruptedException e2) {
                            Thread.currentThread().interrupt();
                            SolrException.log(log, null, e2);
                          } catch (KeeperException e3) {
                            SolrException.log(log, null, e3);
                          }
                        }*/
                      } finally {
                        if (c != null) {
                          c.close();
                        }
                      }
                    }
                    return c;
                  }
                };
                pending.add(completionService.submit(task));

              }
            } catch (Exception e) {
              SolrException.log(log, null, e);
            }
          }

          while (pending != null && pending.size() > 0) {
            try {
              // take the next core whose creation has finished
              Future<SolrCore> future = completionService.take();
              if (future == null) return;
              pending.remove(future);

              try {
                SolrCore c = future.get();
                // track original names
                if (c != null) {
                  solrCores.putCoreToOrigName(c, c.getName());
                }
              } catch (ExecutionException e) {
                SolrException.log(SolrCore.log, "Error loading core", e);
              }

            } catch (InterruptedException e) {
              throw new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE,
                  "interrupted while loading core", e);
            }
          }

          // background closer thread for the Solr cores: it closes cores that are
          // waiting to be closed and runs until the container shuts down
          // Start the background thread
          backgroundCloser = new CloserThread(this, solrCores, cfg);
          backgroundCloser.start();

        } finally {
          if (coreLoadExecutor != null) {
            // initialization is finished, shut down the thread pool
            ExecutorUtil.shutdownNowAndAwaitTermination(coreLoadExecutor);
          }
        }

        if (isZooKeeperAware()) { // ZooKeeper is in use, i.e. SolrCloud mode
          // register in zk in background threads
          Collection<SolrCore> cores = getCores();
          if (cores != null) {
            for (SolrCore core : cores) {
              try {
                // register the core's state in ZooKeeper
                zkSys.registerInZk(core, true);
              } catch (Throwable t) {
                SolrException.log(log, "Error registering SolrCore", t);
              }
            }
          }
          // zkSys.getZkController().checkOverseerDesignate();
        }
      }
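
Stripped of the Solr specifics, the core-loading part of load() is a standard java.util.concurrent pattern: submit one Callable per core to an ExecutorCompletionService, keep the returned futures in a pending set, and call take() until the set is empty. The sketch below shows that pattern on plain strings. `ParallelLoadSketch` and `loadAll` are hypothetical names, and "loading" a core is faked by returning its name, with an empty name standing in for a core that fails to load.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelLoadSketch {
  // Loads every name on a fixed-size pool and returns those that loaded successfully.
  static List<String> loadAll(List<String> names, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<String> loaded = new ArrayList<>();
    try {
      CompletionService<String> completion = new ExecutorCompletionService<>(pool);
      Set<Future<String>> pending = new HashSet<>();
      for (final String name : names) {
        pending.add(completion.submit(() -> {
          if (name.isEmpty()) {
            throw new IllegalArgumentException("core has no name");
          }
          return name; // a real task would build and return the core here
        }));
      }
      while (!pending.isEmpty()) {
        Future<String> done = completion.take(); // blocks until some task finishes
        pending.remove(done);
        try {
          loaded.add(done.get());
        } catch (ExecutionException e) {
          // One failed load is recorded and skipped; the others keep going.
          System.err.println("Error loading core: " + e.getCause());
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore the flag and give up waiting
    } finally {
      pool.shutdownNow();
    }
    return loaded;
  }

  public static void main(String[] args) {
    System.out.println(loadAll(Arrays.asList("core0", "", "core1"), 2));
  }
}
```

Results arrive in completion order, not submission order, so the returned list may be ordered differently from run to run. The pool size plays the role of cfg.getCoreLoadThreadCount() in the non-ZooKeeper case, which is the first setting to look at when startup is slow with many cores.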

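      I have commented the key parts of load(). If you ever need to speed up Solr startup, this is the code to come back to. Next we turn to how Solr filters and handles requests, which means looking at the doFilter method (again the key parts are commented, so I won't go through it in detail):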

         

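      // signature inferred from the recursive doFilter(request, response, chain, true) call further down
      public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain, boolean retry) throws IOException, ServletException {
        if( abortErrorMessage != null ) { // respond with a 500 error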
          ((HttpServletResponse)response).sendError( 500, abortErrorMessage );
          return;
        }
        
        if (this.cores == null) { // core container failed to initialize or has already been shut down
          ((HttpServletResponse)response).sendError( 503, "Server is shutting down or failed to initialize" );
          return;
        }
        CoreContainer cores = this.cores;
        SolrCore core = null;
        SolrQueryRequest solrReq = null;
        Aliases aliases = null;
        
        if( request instanceof HttpServletRequest) { // only HTTP requests are handled here
          HttpServletRequest req = (HttpServletRequest)request;
          HttpServletResponse resp = (HttpServletResponse)response;
          SolrRequestHandler handler = null;
          String corename = "";
          String origCorename = null;
          try {
            // put the core container in request attribute
            req.setAttribute("org.apache.solr.CoreContainer", cores);
            String path = req.getServletPath();
            if( req.getPathInfo() != null ) {
              // this lets you handle /update/commit when /update is a servlet
              path += req.getPathInfo();
            }
            if( pathPrefix != null && path.startsWith( pathPrefix ) ) {
              path = path.substring( pathPrefix.length() );
            }
            // check for management path
            String alternate = cores.getManagementPath();
            if (alternate != null && path.startsWith(alternate)) {
              path = path.substring(0, alternate.length());
            }
            // unused feature ?
            int idx = path.indexOf( ':' );
            if( idx > 0 ) {
              // save the portion after the ':' for a 'handler' path parameter
              path = path.substring( 0, idx );
            }
    
            // Check for the core admin page
            if( path.equals( cores.getAdminPath() ) ) { // request for the core admin handler
              handler = cores.getMultiCoreHandler();
              solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
              handleAdminRequest(req, response, handler, solrReq);
              return;
            }
            boolean usingAliases = false;
            List<String> collectionsList = null;
            // Check for the core admin collections url
            if( path.equals( "/admin/collections" ) ) { // collections management
              handler = cores.getCollectionsHandler();
              solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
              handleAdminRequest(req, response, handler, solrReq);
              return;
            }
            // Check for the core admin info url
            if( path.startsWith( "/admin/info" ) ) { // admin info request
              handler = cores.getInfoHandler();
              solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
              handleAdminRequest(req, response, handler, solrReq);
              return;
            }
            else {
              //otherwise, we should find a core from the path
              idx = path.indexOf( "/", 1 );
              if( idx > 1 ) {
                // try to get the corename as a request parameter first
                corename = path.substring( 1, idx );
                
                // look at aliases
                if (cores.isZooKeeperAware()) { // SolrCloud mode
                  origCorename = corename;
                  ZkStateReader reader = cores.getZkController().getZkStateReader();
                  aliases = reader.getAliases();
                  if (aliases != null && aliases.collectionAliasSize() > 0) {
                    usingAliases = true;
                    String alias = aliases.getCollectionAlias(corename);
                    if (alias != null) {
                      collectionsList = StrUtils.splitSmart(alias, ",", true);
                      corename = collectionsList.get(0);
                    }
                  }
                }
                
                core = cores.getCore(corename);
    
                if (core != null) {
                  path = path.substring( idx );
                }
              }
              if (core == null) {
                if (!cores.isZooKeeperAware() ) {
                  core = cores.getCore("");
                }
              }
            }
            
            if (core == null && cores.isZooKeeperAware()) {
              // we couldn't find the core - lets make sure a collection was not specified instead
              core = getCoreByCollection(cores, corename, path);
              
              if (core != null) {
                // we found a core, update the path
                path = path.substring( idx );
              }
              
              // if we couldn't find it locally, look on other nodes
              if (core == null && idx > 0) {
                String coreUrl = getRemotCoreUrl(cores, corename, origCorename);
                // don't proxy for internal update requests
                SolrParams queryParams = SolrRequestParsers.parseQueryString(req.getQueryString());
                if (coreUrl != null
                    && queryParams
                        .get(DistributingUpdateProcessorFactory.DISTRIB_UPDATE_PARAM) == null) {
                  path = path.substring(idx);
                  remoteQuery(coreUrl + path, req, solrReq, resp);
                  return;
                } else {
                  if (!retry) {
                    // we couldn't find a core to work with, try reloading aliases
                    // TODO: it would be nice if admin ui elements skipped this...
                    ZkStateReader reader = cores.getZkController()
                        .getZkStateReader();
                    reader.updateAliases();
                    doFilter(request, response, chain, true);
                    return;
                  }
                }
              }
              
              // try the default core
              if (core == null) {
                core = cores.getCore("");
              }
            }
    
            // With a valid core...
            if( core != null ) { // a core was found
              final SolrConfig config = core.getSolrConfig();
              // get or create/cache the parser for the core
              SolrRequestParsers parser = config.getRequestParsers();
    
              // Handle /schema/* and /config/* paths via Restlet
              if( path.equals("/schema") || path.startsWith("/schema/")
                  || path.equals("/config") || path.startsWith("/config/")) { // entry point for the Solr REST API
                solrReq = parser.parse(core, path, req);
                SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, new SolrQueryResponse()));
                if( path.equals(req.getServletPath()) ) {
                  // avoid endless loop - pass through to Restlet via webapp
                  chain.doFilter(request, response);
                } else {
                  // forward rewritten URI (without path prefix and core/collection name) to Restlet
                  req.getRequestDispatcher(path).forward(request, response);
                }
                return;
              }
    
              // Determine the handler from the url path if not set
              // (we might already have selected the cores handler)
              if( handler == null && path.length() > 1 ) { // don't match "" or "/" as valid path
                handler = core.getRequestHandler( path );
                // no handler yet but allowed to handle select; let's check
                if( handler == null && parser.isHandleSelect() ) {
                  if( "/select".equals( path ) || "/select/".equals( path ) ) { // entry point for Solr queries
                    solrReq = parser.parse( core, path, req );
                    String qt = solrReq.getParams().get( CommonParams.QT );
                    handler = core.getRequestHandler( qt );
                    if( handler == null ) {
                      throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "unknown handler: "+qt);
                    }
                    if( qt != null && qt.startsWith("/") && (handler instanceof ContentStreamHandlerBase)) {
                      //For security reasons it's a bad idea to allow a leading '/', ex: /select?qt=/update see SOLR-3161
                      //There was no restriction from Solr 1.4 thru 3.5 and it's not supported for update handlers.
                      throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "Invalid Request Handler ('qt').  Do not use /select to access: "+qt);
                    }
                  }
                }
              }
    
              // With a valid handler and a valid core...
              if( handler != null ) {
                // if not a /select, create the request
                if( solrReq == null ) {
                  solrReq = parser.parse( core, path, req );
                }
    
                if (usingAliases) {
                  processAliases(solrReq, aliases, collectionsList);
                }
                
                final Method reqMethod = Method.getMethod(req.getMethod());
                HttpCacheHeaderUtil.setCacheControlHeader(config, resp, reqMethod);
                // unless we have been explicitly told not to, do cache validation
                // if we fail cache validation, execute the query
                if (config.getHttpCachingConfig().isNever304() ||
                    !HttpCacheHeaderUtil.doCacheHeaderValidation(solrReq, req, reqMethod, resp)) { // Solr HTTP caching, where headers control expiry
                    SolrQueryResponse solrRsp = new SolrQueryResponse();
                    /* even for HEAD requests, we need to execute the handler to
                     * ensure we don't get an error (and to make sure the correct
                     * QueryResponseWriter is selected and we get the correct
                     * Content-Type)
                     */
                    SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, solrRsp));
                    this.execute( req, handler, solrReq, solrRsp );
                    HttpCacheHeaderUtil.checkHttpCachingVeto(solrRsp, resp, reqMethod);
                  // add info to http headers
                  //TODO: See SOLR-232 and SOLR-267.  
                    /*try {
                      NamedList solrRspHeader = solrRsp.getResponseHeader();
                     for (int i=0; i<solrRspHeader.size(); i++) {
                       ((javax.servlet.http.HttpServletResponse) response).addHeader(("Solr-" + solrRspHeader.getName(i)), String.valueOf(solrRspHeader.getVal(i)));
                     }
                    } catch (ClassCastException cce) {
                      log.log(Level.WARNING, "exception adding response header log information", cce);
                    }*/
                   QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
                   writeResponse(solrRsp, response, responseWriter, solrReq, reqMethod);
                }
                return; // we are done with a valid handler
              }
            }
            log.debug("no handler or core retrieved for " + path + ", follow through...");
          } 
          catch (Throwable ex) {
            sendError( core, solrReq, request, (HttpServletResponse)response, ex );
            if (ex instanceof Error) {
              throw (Error) ex;
            }
            return;
          } finally {
            try {
              if (solrReq != null) {
                log.debug("Closing out SolrRequest: {}", solrReq);
                solrReq.close();
              }
            } finally {
              try {
                if (core != null) {
                  core.close();
                }
              } finally {
                SolrRequestInfo.clearRequestInfo();
              }
            }
          }
        }
    
        // Otherwise let the webapp handle the request
        chain.doFilter(request, response);
      }
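
The part of doFilter that matters most in day-to-day use is how a URL path is split into a core name and a handler path. The sketch below reproduces only that string handling, under the simplifying assumption that the named core exists. `CorePathSketch` and `split` are hypothetical names, and admin paths, aliases and SolrCloud lookups are all left out.

```java
class CorePathSketch {
  // Returns {coreName, handlerPath}. An empty core name means "use the default core".
  static String[] split(String path, String pathPrefix) {
    if (pathPrefix != null && path.startsWith(pathPrefix)) {
      path = path.substring(pathPrefix.length()); // drop the configured path-prefix
    }
    int colon = path.indexOf(':');
    if (colon > 0) {
      path = path.substring(0, colon); // ignore anything after ':' as doFilter does
    }
    int idx = path.indexOf('/', 1);
    if (idx > 1) {
      // "/core0/select" -> core "core0", handler path "/select"
      return new String[] { path.substring(1, idx), path.substring(idx) };
    }
    // "/select" has no core segment, so the whole path goes to the default core
    return new String[] { "", path };
  }

  public static void main(String[] args) {
    String[] parts = split("/collection1/select", null);
    System.out.println("core=" + parts[0] + ", handler=" + parts[1]);
  }
}
```

In the real method the first path segment is only stripped when cores.getCore(corename) finds a core. Otherwise the full path is kept and the default core (the one registered under the empty name) is tried.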
    

      

    Please credit the source when reposting this article: http://www.cnblogs.com/likehua/p/4353608.html

      

  • Original article: https://www.cnblogs.com/likehua/p/4353608.html