  • Solr Initialization Source Code Analysis: Solr Initialization and Startup

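      I have been using Solr in projects for more than a year, but only at the usage level: relying on Solr's existing mechanisms, adjusting parameters, then monitoring and tuning. I had never studied Solr at the source-code level. A recent project, however, convinced me that I need a solid grasp of Solr's internals and extension mechanisms to do it well. Today's topic is how Solr starts up and initializes itself. As you know, a Solr deployment has two parts: (1) Solr's configuration files, and (2) the Solr program itself together with its plugins, the Lucene JARs it depends on, and the logging JARs. A study of Solr can follow the same line: load the configuration files, initialize each core, initialize the request handlers in each core, and so on.

      To study Solr's startup, begin with the web.xml of the Solr WAR. Here is a fragment of Solr's web.xml: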

    <web-app xmlns="http://java.sun.com/xml/ns/javaee"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
             version="2.5"
             metadata-complete="true"
    >
    
    
      <!-- Uncomment if you are trying to use a Resin version before 3.0.19.
        Their XML implementation isn't entirely compatible with Xerces.
        Below are the implementations to use with Sun's JVM.
      <system-property javax.xml.xpath.XPathFactory=
                 "com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl"/>
      <system-property javax.xml.parsers.DocumentBuilderFactory=
                 "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl"/>
      <system-property javax.xml.parsers.SAXParserFactory=
                 "com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl"/>
       -->
    
      <!-- People who want to hardcode their "Solr Home" directly into the
           WAR File can set the JNDI property here...
       -->
        <!--  Solr home (the location of Solr's configuration files), used during Solr initialization  -->
        <env-entry>
           <env-entry-name>solr/home</env-entry-name>
           <env-entry-value>R:/solrhome1/solr</env-entry-value>
           <env-entry-type>java.lang.String</env-entry-type>
        </env-entry>
    
       
      
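      <!-- org.apache.solr.servlet.SolrDispatchFilter is the most important piece of Solr startup, so source analysis should begin with this Filter. Its main job: load the Solr configuration files, initialize each core, and initialize each requestHandler and component -->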
      <filter>
        <filter-name>SolrRequestFilter</filter-name>
        <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
        <!-- If you are wiring Solr into a larger web application which controls
             the web context root, you will probably want to mount Solr under
             a path prefix (app.war with /app/solr mounted into it, for example).
             You will need to put this prefix in front of the SolrDispatchFilter
             url-pattern mapping too (/solr/*), and also on any paths for
             legacy Solr servlet mappings you may be using.
             For the Admin UI to work properly in a path-prefixed configuration,
             the admin folder containing the resources needs to be under the app context root
             named to match the path-prefix.  For example:
    
                .war
                   xxx
                     js
                       main.js
        -->
        <!--
        <init-param>
          <param-name>path-prefix</param-name>
          <param-value>/xxx</param-value>
        </init-param>
        -->
      </filter>
    

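      SolrDispatchFilter is a Filter that extends BaseSolrFilter. (You probably know what a Filter does; source analysis of web-framework-level products usually starts from a filter or a servlet.) Before turning to SolrDispatchFilter, a quick look at BaseSolrFilter (programmers do like to get to the bottom of things). BaseSolrFilter is an abstract class that implements the Filter interface. Its job is simple: check that the logging JARs are available before any Solr-specific code runs. The code: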

      

    /**
     * All Solr filters available to the user's webapp should
     * extend this class and not just implement {@link Filter}.
     * This class ensures that the logging configuration is correct
     * before any Solr specific code is executed.
     */
    abstract class BaseSolrFilter implements Filter {
      
      static {//
        CheckLoggingConfiguration.check();
      }
      
    }
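
      The static block is what guarantees the check runs first: Java initializes a superclass before any static or instance code of its subclasses. Here is a minimal sketch of that ordering. BaseFilter and DispatchFilter are hypothetical stand-ins of mine, not Solr classes, and the static block only records an event where the real one calls CheckLoggingConfiguration.check().

```java
import java.util.ArrayList;
import java.util.List;

final class StaticInitDemo {
    // Records the order in which initialization steps run.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for BaseSolrFilter: the static block plays the
    // role of CheckLoggingConfiguration.check().
    static abstract class BaseFilter {
        static {
            EVENTS.add("base static check");
        }
    }

    // Hypothetical stand-in for SolrDispatchFilter.
    static class DispatchFilter extends BaseFilter {
        static {
            EVENTS.add("subclass static init");
        }

        DispatchFilter() {
            EVENTS.add("subclass constructor");
        }
    }

    public static void main(String[] args) {
        // Creating the subclass forces the superclass to initialize first.
        new DispatchFilter();
        System.out.println(EVENTS);
    }
}
```

      The base-class entry always comes first in the recorded list, which is why every Solr filter is asked to extend BaseSolrFilter rather than implement Filter directly.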
    

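      For brevity I will not go into CheckLoggingConfiguration.check(). Back to SolrDispatchFilter. Because BaseSolrFilter is abstract and implements none of the Filter methods, the concrete class SolrDispatchFilter has to implement them itself. The Filter interface looks like this: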

      

    public interface Filter {
    
        // Called once by the container to initialize the filter
        public void init(FilterConfig filterConfig) throws ServletException;
    	
        // Intercepts every request mapped to this filter
        public void doFilter(ServletRequest request, ServletResponse response,
                             FilterChain chain)
                throws IOException, ServletException;
    
        // Called once to release resources when the filter is taken out of service
        public void destroy();
    }
    

      As the comments above indicate, initialization happens in init. So to understand how SolrDispatchFilter initializes, we have to start from that method. Here is SolrDispatchFilter's init method:

      

      @Override
      public void init(FilterConfig config) throws ServletException
      {
        log.info("SolrDispatchFilter.init()");
    
        try {
          // web.xml configuration
          this.pathPrefix = config.getInitParameter( "path-prefix" );
          // Everything important happens inside this one call
          this.cores = createCoreContainer();
          log.info("user.dir=" + System.getProperty("user.dir"));
        }
        catch( Throwable t ) {
          // catch this so our filter still works
          log.error( "Could not start Solr. Check solr/home property and the logs");
          SolrCore.log( t );
          if (t instanceof Error) {
            throw (Error) t;
          }
        }
    
        log.info("SolrDispatchFilter.init() done");
      }
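
      Note that init catches the failure instead of letting the filter die: exceptions are logged and swallowed, only Errors are rethrown, and this.cores simply stays null. The doFilter method later checks for that null and answers with a 503. The sketch below mimics this fail-soft pattern. Container, init(Supplier) and handle() are hypothetical names of mine, and the status codes are plain ints rather than a real servlet response.

```java
import java.util.function.Supplier;

final class FailSoftInitDemo {
    // Hypothetical stand-in for CoreContainer.
    static final class Container {}

    // Stays null when startup fails, just like this.cores in the filter.
    private volatile Container cores;

    // Same shape as init(): log and swallow exceptions, rethrow Errors.
    void init(Supplier<Container> factory) {
        try {
            cores = factory.get();
        } catch (Throwable t) {
            System.err.println("Could not start: " + t);
            if (t instanceof Error) {
                throw (Error) t;
            }
        }
    }

    // Same shape as the guard at the top of doFilter(): returns an HTTP status.
    int handle() {
        if (cores == null) {
            return 503; // failed to initialize or shutting down
        }
        return 200;
    }

    public static void main(String[] args) {
        FailSoftInitDemo broken = new FailSoftInitDemo();
        broken.init(() -> { throw new IllegalStateException("solr home not found"); });
        System.out.println(broken.handle());

        FailSoftInitDemo healthy = new FailSoftInitDemo();
        healthy.init(Container::new);
        System.out.println(healthy.handle());
    }
}
```

      The practical consequence: a broken solr/home does not stop the web application from deploying. The symptom is a 503 on every request, and the real cause is in the startup log.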
    

      Following the trail, let's see what createCoreContainer actually does.

      

      protected CoreContainer createCoreContainer() {
        // SolrResourceLoader loads the configuration files found under Solr home
        SolrResourceLoader loader = new SolrResourceLoader(SolrResourceLoader.locateSolrHome());
        // Load the configuration file
        ConfigSolr config = loadConfigSolr(loader);
        CoreContainer cores = new CoreContainer(loader, config);
        // Initialize the cores
        cores.load();
        return cores;
      }
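
      The solr/home env-entry from web.xml comes into play inside SolrResourceLoader.locateSolrHome(). As far as I can tell from the Solr 4.x source, it tries the JNDI name java:comp/env/solr/home first, then the solr.solr.home system property, and finally falls back to solr/ relative to the working directory. The sketch below reproduces that lookup order with JDK classes only. Treat the order and the names as my reading of the source, to be checked against your Solr version. The real method also normalizes the directory string, which is left out here.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

final class SolrHomeLookupSketch {
    // Resolution order assumed here: JNDI, then system property, then default.
    static String locateHome() {
        String home = null;
        try {
            // Set by <env-entry> in web.xml when running inside a servlet container.
            Context ctx = new InitialContext();
            home = (String) ctx.lookup("java:comp/env/solr/home");
        } catch (NamingException | RuntimeException e) {
            // No JNDI available (e.g. a plain JVM) or no such entry: fall through.
        }
        if (home == null) {
            home = System.getProperty("solr.solr.home");
        }
        if (home == null) {
            home = "solr/";
        }
        return home;
    }

    public static void main(String[] args) {
        System.out.println("Solr home resolved to: " + locateHome());
    }
}
```

      Outside a servlet container the JNDI lookup fails and the system property decides, which is why starting the JVM with -Dsolr.solr.home=... is the usual alternative to editing web.xml.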

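      createCoreContainer is the key to understanding Solr initialization and startup. A brief look at the classes and methods it uses:

      SolrResourceLoader, as its name suggests, is Solr's resource loader.

      ConfigSolr reads the information in Solr's configuration file through SolrResourceLoader.

      loadConfigSolr is the method that loads the configuration: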

      private ConfigSolr loadConfigSolr(SolrResourceLoader loader) {
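        // Check the solr.solrxml.location system property first (it usually points at ZooKeeper). If it is not set, the value defaults to "solrhome" and the configuration is read from Solr home (remember the first entry in web.xml? that is the one)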
        String solrxmlLocation = System.getProperty("solr.solrxml.location", "solrhome");
        
        if (solrxmlLocation == null || "solrhome".equalsIgnoreCase(solrxmlLocation))
          return ConfigSolr.fromSolrHome(loader, loader.getInstanceDir());
        // Otherwise read the configuration from ZooKeeper; this is how Solr initializes in a SolrCloud cluster
        if ("zookeeper".equalsIgnoreCase(solrxmlLocation)) {
          String zkHost = System.getProperty("zkHost");
          log.info("Trying to read solr.xml from " + zkHost);
          if (StringUtils.isEmpty(zkHost))
            throw new SolrException(ErrorCode.SERVER_ERROR,
                "Could not load solr.xml from zookeeper: zkHost system property not set");
          SolrZkClient zkClient = new SolrZkClient(zkHost, 30000);
          try {
            if (!zkClient.exists("/solr.xml", true)) // solr.xml carries the ZooKeeper-related configuration
              throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper: node not found");
            byte[] data = zkClient.getData("/solr.xml", null, null, true);
            // Load the configuration
            return ConfigSolr.fromInputStream(loader, new ByteArrayInputStream(data));
          } catch (Exception e) {
            throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper", e);
          } finally {
            zkClient.close(); // close the ZooKeeper connection
          }
        }
        throw new SolrException(ErrorCode.SERVER_ERROR,
            "Bad solr.solrxml.location set: " + solrxmlLocation + " - should be 'solrhome' or 'zookeeper'");
      }
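
      Stripped of the ZooKeeper client calls, loadConfigSolr is a three-way decision on one system property. The sketch below isolates just that decision so it can be run without Solr on the classpath. Source and resolve are hypothetical names of mine, and plain JDK exceptions stand in for SolrException.

```java
final class SolrXmlLocationSketch {
    // Where solr.xml would be read from.
    enum Source { SOLR_HOME, ZOOKEEPER }

    // Mirrors the branching in loadConfigSolr, without touching ZooKeeper.
    static Source resolve(String location, String zkHost) {
        if (location == null || "solrhome".equalsIgnoreCase(location)) {
            return Source.SOLR_HOME;
        }
        if ("zookeeper".equalsIgnoreCase(location)) {
            if (zkHost == null || zkHost.isEmpty()) {
                throw new IllegalStateException(
                    "Could not load solr.xml from zookeeper: zkHost system property not set");
            }
            return Source.ZOOKEEPER;
        }
        throw new IllegalArgumentException(
            "Bad solr.solrxml.location set: " + location + " - should be 'solrhome' or 'zookeeper'");
    }

    public static void main(String[] args) {
        // Same property names as the Solr code above; "solrhome" is the default.
        String location = System.getProperty("solr.solrxml.location", "solrhome");
        String zkHost = System.getProperty("zkHost");
        System.out.println(resolve(location, zkHost));
    }
}
```

      The comparison is case-insensitive, and any value other than solrhome or zookeeper is rejected outright rather than silently falling back.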

      CoreContainer is what initializes the cores. Let's focus on its load method. It is fairly long:

      

    public void load()  {
    
        log.info("Loading cores into CoreContainer [instanceDir={}]", loader.getInstanceDir());
        // Load Solr's shared JAR library
        // add the sharedLib to the shared resource loader before initializing cfg based plugins
        String libDir = cfg.getSharedLibDirectory();
        if (libDir != null) {
          File f = FileUtils.resolvePath(new File(solrHome), libDir);
          log.info("loading shared library: " + f.getAbsolutePath());
          // If class loaders are unfamiliar, it is worth stepping into this call
          loader.addToClassLoader(libDir, null, false);
          loader.reloadLuceneSPI();
        }

        // Load and initialize the shard-related handlers
        shardHandlerFactory = ShardHandlerFactory.newInstance(cfg.getShardHandlerFactoryPluginInfo(), loader);

        updateShardHandler = new UpdateShardHandler(cfg);

        solrCores.allocateLazyCores(cfg.getTransientCacheSize(), loader);

        logging = LogWatcher.newRegisteredLogWatcher(cfg.getLogWatcherConfig(), loader);

        hostName = cfg.getHost();
        log.info("Host Name: " + hostName);

        zkSys.initZooKeeper(this, solrHome, cfg);

        collectionsHandler = createHandler(cfg.getCollectionsHandlerClass(), CollectionsHandler.class);
        infoHandler = createHandler(cfg.getInfoHandlerClass(), InfoHandler.class);
        coreAdminHandler = createHandler(cfg.getCoreAdminHandlerClass(), CoreAdminHandler.class);

        // Core configuration service; with ZooKeeper, the cores are initialized from the configuration stored there
        coreConfigService = cfg.createCoreConfigService(loader, zkSys.getZkController());

        containerProperties = cfg.getSolrProperties("solr");

        // setup executor to load cores in parallel
        // do not limit the size of the executor in zk mode since cores may try and wait for each other.
        // Cores are initialized on multiple threads; worth a closer look if concurrency is new to you
        ExecutorService coreLoadExecutor = Executors.newFixedThreadPool(
            ( zkSys.getZkController() == null ? cfg.getCoreLoadThreadCount() : Integer.MAX_VALUE ),
            new DefaultSolrThreadFactory("coreLoadExecutor") );

        try {
          CompletionService<SolrCore> completionService = new ExecutorCompletionService<>(
              coreLoadExecutor);

          Set<Future<SolrCore>> pending = new HashSet<>();

          List<CoreDescriptor> cds = coresLocator.discover(this);
          checkForDuplicateCoreNames(cds);

          for (final CoreDescriptor cd : cds) {

            final String name = cd.getName();
            try {

              if (cd.isTransient() || ! cd.isLoadOnStartup()) {
                // Store it away for later use. includes non-transient but not
                // loaded at startup cores.
                solrCores.putDynamicDescriptor(name, cd);
              }
              if (cd.isLoadOnStartup()) { // The normal case

                Callable<SolrCore> task = new Callable<SolrCore>() {
                  @Override
                  public SolrCore call() {
                    SolrCore c = null;
                    try {
                      if (zkSys.getZkController() != null) { // ZooKeeper mode
                        preRegisterInZk(cd);
                      }
                      c = create(cd); // normal creation path
                      registerCore(cd.isTransient(), name, c, false, false);
                    } catch (Exception e) {
                      SolrException.log(log, null, e);
                      try {
                        /* if (isZooKeeperAware()) {
                             try {
                               zkSys.zkController.unregister(name, cd);
                             } catch (InterruptedException e2) {
                               Thread.currentThread().interrupt();
                               SolrException.log(log, null, e2);
                             } catch (KeeperException e3) {
                               SolrException.log(log, null, e3);
                             }
                           }*/
                      } finally {
                        if (c != null) {
                          c.close();
                        }
                      }
                    }
                    return c;
                  }
                };
                pending.add(completionService.submit(task));

              }
            } catch (Exception e) {
              SolrException.log(log, null, e);
            }
          }

          while (pending != null && pending.size() > 0) {
            try {
              // Take the next core that has finished loading
              Future<SolrCore> future = completionService.take();
              if (future == null) return;
              pending.remove(future);

              try {
                SolrCore c = future.get();
                // track original names
                if (c != null) {
                  solrCores.putCoreToOrigName(c, c.getName());
                }
              } catch (ExecutionException e) {
                SolrException.log(SolrCore.log, "Error loading core", e);
              }

            } catch (InterruptedException e) {
              throw new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE,
                  "interrupted while loading core", e);
            }
          }

          // Background thread for the cores: it closes cores that are queued for closing and releases their resources
          // Start the background thread
          backgroundCloser = new CloserThread(this, solrCores, cfg);
          backgroundCloser.start();

        } finally {
          if (coreLoadExecutor != null) {
            // Initialization finished: shut down the thread pool
            ExecutorUtil.shutdownNowAndAwaitTermination(coreLoadExecutor);
          }
        }

        if (isZooKeeperAware()) { // ZooKeeper is available, i.e. SolrCloud mode
          // register in zk in background threads
          Collection<SolrCore> cores = getCores();
          if (cores != null) {
            for (SolrCore core : cores) {
              try {
                // Register the core's state in ZooKeeper
                zkSys.registerInZk(core, true);
              } catch (Throwable t) {
                SolrException.log(log, "Error registering SolrCore", t);
              }
            }
          }
          // zkSys.getZkController().checkOverseerDesignate();
        }
    }
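
      The concurrency pattern in load() is worth isolating: submit one task per core to an ExecutorCompletionService, then call take() once per submitted task, so results are handled in completion order and one failing core does not block the others. A minimal JDK-only sketch of that pattern follows. loadCore is a hypothetical stand-in for create(cd), with core names as plain strings, and names starting with "bad" simulate a core that fails to load.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelLoadSketch {
    // Hypothetical stand-in for creating one core; "bad*" names fail.
    static String loadCore(String name) {
        if (name.startsWith("bad")) {
            throw new IllegalStateException("cannot load " + name);
        }
        return name;
    }

    // Loads all cores in parallel and returns the names that loaded successfully.
    static List<String> loadAll(List<String> names, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            CompletionService<String> completion = new ExecutorCompletionService<>(pool);
            int pending = 0;
            for (final String name : names) {
                completion.submit(() -> loadCore(name));
                pending++;
            }
            List<String> loaded = new ArrayList<>();
            while (pending > 0) {
                // take() blocks until the next task finishes, whichever it is.
                Future<String> done = completion.take();
                pending--;
                try {
                    loaded.add(done.get());
                } catch (ExecutionException e) {
                    // One failed core is logged and skipped; the rest continue.
                    System.err.println("Error loading core: " + e.getCause());
                }
            }
            return loaded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading cores", e);
        } finally {
            // Always release the worker threads, as load() does in its finally block.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAll(Arrays.asList("core1", "bad", "core2"), 2));
    }
}
```

      The order of the returned names depends on which task finishes first, so it can vary between runs. In non-ZooKeeper mode the pool size is the one knob here that affects startup time.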

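      I have commented the key parts of this code. When you need to speed up Solr startup, this is the code you will come back to. Next comes request filtering, where the method to look at is doFilter (again, the key parts are commented, so I will not walk through it in detail):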

         

     if( abortErrorMessage != null ) {// respond with HTTP 500
          ((HttpServletResponse)response).sendError( 500, abortErrorMessage );
          return;
        }
        
        if (this.cores == null) {// core initialization failed or the container has been shut down
          ((HttpServletResponse)response).sendError( 503, "Server is shutting down or failed to initialize" );
          return;
        }
        CoreContainer cores = this.cores;
        SolrCore core = null;
        SolrQueryRequest solrReq = null;
        Aliases aliases = null;
        
        if( request instanceof HttpServletRequest) {// only HTTP requests are handled here
          HttpServletRequest req = (HttpServletRequest)request;
          HttpServletResponse resp = (HttpServletResponse)response;
          SolrRequestHandler handler = null;
          String corename = "";
          String origCorename = null;
          try {
            // put the core container in request attribute
            req.setAttribute("org.apache.solr.CoreContainer", cores);
            String path = req.getServletPath();
            if( req.getPathInfo() != null ) {
              // this lets you handle /update/commit when /update is a servlet
              path += req.getPathInfo();
            }
            if( pathPrefix != null && path.startsWith( pathPrefix ) ) {
              path = path.substring( pathPrefix.length() );
            }
            // check for management path
            String alternate = cores.getManagementPath();
            if (alternate != null && path.startsWith(alternate)) {
              path = path.substring(0, alternate.length());
            }
            // unused feature ?
            int idx = path.indexOf( ':' );
            if( idx > 0 ) {
              // save the portion after the ':' for a 'handler' path parameter
              path = path.substring( 0, idx );
            }
    
            // Check for the core admin page
            if( path.equals( cores.getAdminPath() ) ) {// Solr admin request
              handler = cores.getMultiCoreHandler();
              solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
              handleAdminRequest(req, response, handler, solrReq);
              return;
            }
            boolean usingAliases = false;
            List<String> collectionsList = null;
            // Check for the core admin collections url
            if( path.equals( "/admin/collections" ) ) {// collections management
              handler = cores.getCollectionsHandler();
              solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
              handleAdminRequest(req, response, handler, solrReq);
              return;
            }
            // Check for the core admin info url
            if( path.startsWith( "/admin/info" ) ) {// admin info
              handler = cores.getInfoHandler();
              solrReq =  SolrRequestParsers.DEFAULT.parse(null,path, req);
              handleAdminRequest(req, response, handler, solrReq);
              return;
            }
            else {
              //otherwise, we should find a core from the path
              idx = path.indexOf( "/", 1 );
              if( idx > 1 ) {
                // try to get the corename as a request parameter first
                corename = path.substring( 1, idx );
                
                // look at aliases
                if (cores.isZooKeeperAware()) {// SolrCloud mode
                  origCorename = corename;
                  ZkStateReader reader = cores.getZkController().getZkStateReader();
                  aliases = reader.getAliases();
                  if (aliases != null && aliases.collectionAliasSize() > 0) {
                    usingAliases = true;
                    String alias = aliases.getCollectionAlias(corename);
                    if (alias != null) {
                      collectionsList = StrUtils.splitSmart(alias, ",", true);
                      corename = collectionsList.get(0);
                    }
                  }
                }
                
                core = cores.getCore(corename);
    
                if (core != null) {
                  path = path.substring( idx );
                }
              }
              if (core == null) {
                if (!cores.isZooKeeperAware() ) {
                  core = cores.getCore("");
                }
              }
            }
            
            if (core == null && cores.isZooKeeperAware()) {
              // we couldn't find the core - lets make sure a collection was not specified instead
              core = getCoreByCollection(cores, corename, path);
              
              if (core != null) {
                // we found a core, update the path
                path = path.substring( idx );
              }
              
              // if we couldn't find it locally, look on other nodes
              if (core == null && idx > 0) {
                String coreUrl = getRemotCoreUrl(cores, corename, origCorename);
                // don't proxy for internal update requests
                SolrParams queryParams = SolrRequestParsers.parseQueryString(req.getQueryString());
                if (coreUrl != null
                    && queryParams
                        .get(DistributingUpdateProcessorFactory.DISTRIB_UPDATE_PARAM) == null) {
                  path = path.substring(idx);
                  remoteQuery(coreUrl + path, req, solrReq, resp);
                  return;
                } else {
                  if (!retry) {
                    // we couldn't find a core to work with, try reloading aliases
                    // TODO: it would be nice if admin ui elements skipped this...
                    ZkStateReader reader = cores.getZkController()
                        .getZkStateReader();
                    reader.updateAliases();
                    doFilter(request, response, chain, true);
                    return;
                  }
                }
              }
              
              // try the default core
              if (core == null) {
                core = cores.getCore("");
              }
            }
    
            // With a valid core...
            if( core != null ) {// a core was resolved
              final SolrConfig config = core.getSolrConfig();
              // get or create/cache the parser for the core
              SolrRequestParsers parser = config.getRequestParsers();
    
              // Handle /schema/* and /config/* paths via Restlet
              if( path.equals("/schema") || path.startsWith("/schema/")
                  || path.equals("/config") || path.startsWith("/config/")) {// entry point for the Solr REST API
                solrReq = parser.parse(core, path, req);
                SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, new SolrQueryResponse()));
                if( path.equals(req.getServletPath()) ) {
                  // avoid endless loop - pass through to Restlet via webapp
                  chain.doFilter(request, response);
                } else {
                  // forward rewritten URI (without path prefix and core/collection name) to Restlet
                  req.getRequestDispatcher(path).forward(request, response);
                }
                return;
              }
    
              // Determine the handler from the url path if not set
              // (we might already have selected the cores handler)
              if( handler == null && path.length() > 1 ) { // don't match "" or "/" as valid path
                handler = core.getRequestHandler( path );
                // no handler yet but allowed to handle select; let's check
                if( handler == null && parser.isHandleSelect() ) {
                  if( "/select".equals( path ) || "/select/".equals( path ) ) {// entry point for /select queries
                    solrReq = parser.parse( core, path, req );
                    String qt = solrReq.getParams().get( CommonParams.QT );
                    handler = core.getRequestHandler( qt );
                    if( handler == null ) {
                      throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "unknown handler: "+qt);
                    }
                    if( qt != null && qt.startsWith("/") && (handler instanceof ContentStreamHandlerBase)) {
                      //For security reasons it's a bad idea to allow a leading '/', ex: /select?qt=/update see SOLR-3161
                      //There was no restriction from Solr 1.4 thru 3.5 and it's not supported for update handlers.
                      throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "Invalid Request Handler ('qt').  Do not use /select to access: "+qt);
                    }
                  }
                }
              }
    
              // With a valid handler and a valid core...
              if( handler != null ) {
                // if not a /select, create the request
                if( solrReq == null ) {
                  solrReq = parser.parse( core, path, req );
                }
    
                if (usingAliases) {
                  processAliases(solrReq, aliases, collectionsList);
                }
                
                final Method reqMethod = Method.getMethod(req.getMethod());
                HttpCacheHeaderUtil.setCacheControlHeader(config, resp, reqMethod);
                // unless we have been explicitly told not to, do cache validation
                // if we fail cache validation, execute the query
                if (config.getHttpCachingConfig().isNever304() ||
                    !HttpCacheHeaderUtil.doCacheHeaderValidation(solrReq, req, reqMethod, resp)) {// Solr HTTP caching: expiry is controlled through HTTP headers
                    SolrQueryResponse solrRsp = new SolrQueryResponse();
                    /* even for HEAD requests, we need to execute the handler to
                     * ensure we don't get an error (and to make sure the correct
                     * QueryResponseWriter is selected and we get the correct
                     * Content-Type)
                     */
                    SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, solrRsp));
                    this.execute( req, handler, solrReq, solrRsp );
                    HttpCacheHeaderUtil.checkHttpCachingVeto(solrRsp, resp, reqMethod);
                  // add info to http headers
                  //TODO: See SOLR-232 and SOLR-267.  
                    /*try {
                      NamedList solrRspHeader = solrRsp.getResponseHeader();
                     for (int i=0; i<solrRspHeader.size(); i++) {
                       ((javax.servlet.http.HttpServletResponse) response).addHeader(("Solr-" + solrRspHeader.getName(i)), String.valueOf(solrRspHeader.getVal(i)));
                     }
                    } catch (ClassCastException cce) {
                      log.log(Level.WARNING, "exception adding response header log information", cce);
                    }*/
                   QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
                   writeResponse(solrRsp, response, responseWriter, solrReq, reqMethod);
                }
                return; // we are done with a valid handler
              }
            }
            log.debug("no handler or core retrieved for " + path + ", follow through...");
          } 
          catch (Throwable ex) {
            sendError( core, solrReq, request, (HttpServletResponse)response, ex );
            if (ex instanceof Error) {
              throw (Error) ex;
            }
            return;
          } finally {
            try {
              if (solrReq != null) {
                log.debug("Closing out SolrRequest: {}", solrReq);
                solrReq.close();
              }
            } finally {
              try {
                if (core != null) {
                  core.close();
                }
              } finally {
                SolrRequestInfo.clearRequestInfo();
              }
            }
          }
        }
    
        // Otherwise let the webapp handle the request
        chain.doFilter(request, response);
      }
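
      Most of doFilter is about turning the request path into a core plus a handler path. The sketch below extracts only the common case: strip the optional path-prefix, drop anything after a ':', then take the first path segment as the core name and the rest as the handler path. Route and parse are hypothetical names of mine. The sketch assumes the named core exists, whereas the real method falls back to the default core (or to collection and alias lookups in SolrCloud) when it does not.

```java
final class CorePathSketch {
    // Result of splitting a request path; coreName is "" when no core segment is present.
    static final class Route {
        final String coreName;
        final String handlerPath;

        Route(String coreName, String handlerPath) {
            this.coreName = coreName;
            this.handlerPath = handlerPath;
        }
    }

    // Mirrors the string handling in doFilter for the non-admin, core-found case.
    static Route parse(String path, String pathPrefix) {
        if (pathPrefix != null && path.startsWith(pathPrefix)) {
            path = path.substring(pathPrefix.length());
        }
        int colon = path.indexOf(':');
        if (colon > 0) {
            path = path.substring(0, colon);
        }
        String coreName = "";
        // Look for the second '/', which ends the core-name segment.
        int idx = path.indexOf('/', 1);
        if (idx > 1) {
            coreName = path.substring(1, idx);
            path = path.substring(idx);
        }
        return new Route(coreName, path);
    }

    public static void main(String[] args) {
        Route r = parse("/collection1/select", null);
        System.out.println(r.coreName + " -> " + r.handlerPath);
    }
}
```

      With a path such as /select there is no second slash, so the core name stays empty. That matches the branch in doFilter that asks for cores.getCore("") to get the default core.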
    

      

    Please credit the source when reposting: http://www.cnblogs.com/likehua/p/4353608.html

      

  • Original article: https://www.cnblogs.com/likehua/p/4353608.html