  • A Close Look at the Lucene Source (1): The Index File Lock Mechanism

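    As everyone knows, in a multi-threaded or multi-process environment, access to a shared resource needs special care, especially writes: without a lock, the consequences can be severe. Lucene's index is no exception. Lucene splits index access between IndexReader and IndexWriter; as the names suggest, one reads and one writes. You can open many IndexReader objects on the same index, but only one IndexWriter. How is that enforced? With a lock, which guarantees that only one IndexWriter can be open on an index at a time. The rest of this post looks closely at Lucene's index file lock mechanism: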

    What happens if we open several different IndexWriter objects on the same index?

    // analyzer and dir are set up earlier (not shown)
    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
    IndexWriter indexWriter = new IndexWriter(dir, indexWriterConfig);

    // A second writer on the same directory while the first is still open
    IndexWriterConfig indexWriterConfig2 = new IndexWriterConfig(analyzer);
    IndexWriter indexWriter2 = new IndexWriter(dir, indexWriterConfig2);

    Running this prints the following to the console:

    Exception in thread "main" org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\new\Desktop\Lucene\write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:89)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
        at test.Index.index(Index.java:51)
        at test.Index.main(Index.java:78)

    Clearly, you cannot open more than one IndexWriter on the same index.

     

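    The original post shows a simplified class diagram at this point. It shows that Lucene uses the factory method pattern, which makes it easy to plug in other implementations. This post uses only SimpleFSLock to explain Lucene's lock mechanism (for the others, see the Lucene source).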

    Lock is the base class for locks, an abstract class. Its source is:

    public abstract class Lock implements Closeable {
    
      /** How long {@link #obtain(long)} waits, in milliseconds,
       *  in between attempts to acquire the lock. */
      public static long LOCK_POLL_INTERVAL = 1000;
    
      /** Pass this value to {@link #obtain(long)} to try
       *  forever to obtain the lock. */
      public static final long LOCK_OBTAIN_WAIT_FOREVER = -1;
    
      /** Attempts to obtain exclusive access and immediately return
       *  upon success or failure.  Use {@link #close} to
       *  release the lock.
       * @return true iff exclusive access is obtained
       */
      public abstract boolean obtain() throws IOException;
    
      /**
       * If a lock obtain called, this failureReason may be set
       * with the "root cause" Exception as to why the lock was
       * not obtained.
       */
      protected Throwable failureReason;
    
      /** Attempts to obtain an exclusive lock within amount of
       *  time given. Polls once per {@link #LOCK_POLL_INTERVAL}
       *  (currently 1000) milliseconds until lockWaitTimeout is
       *  passed.
       * @param lockWaitTimeout length of time to wait in
       *        milliseconds or {@link
       *        #LOCK_OBTAIN_WAIT_FOREVER} to retry forever
       * @return true if lock was obtained
       * @throws LockObtainFailedException if lock wait times out
       * @throws IllegalArgumentException if lockWaitTimeout is
       *         out of bounds
       * @throws IOException if obtain() throws IOException
       */
      public final boolean obtain(long lockWaitTimeout) throws IOException {
        failureReason = null;
        boolean locked = obtain();
        if (lockWaitTimeout < 0 && lockWaitTimeout != LOCK_OBTAIN_WAIT_FOREVER)
          throw new IllegalArgumentException("lockWaitTimeout should be LOCK_OBTAIN_WAIT_FOREVER or a non-negative number (got " + lockWaitTimeout + ")");
    
        long maxSleepCount = lockWaitTimeout / LOCK_POLL_INTERVAL;
        long sleepCount = 0;
        while (!locked) {
          if (lockWaitTimeout != LOCK_OBTAIN_WAIT_FOREVER && sleepCount++ >= maxSleepCount) {
            String reason = "Lock obtain timed out: " + this.toString();
            if (failureReason != null) {
              reason += ": " + failureReason;
            }
            throw new LockObtainFailedException(reason, failureReason);
          }
          try {
            Thread.sleep(LOCK_POLL_INTERVAL);
          } catch (InterruptedException ie) {
            throw new ThreadInterruptedException(ie);
          }
          locked = obtain();
        }
        return locked;
      }
    
      /** Releases exclusive access. */
      public abstract void close() throws IOException;
    
      /** Returns true if the resource is currently locked.  Note that one must
       * still call {@link #obtain()} before using the resource. */
      public abstract boolean isLocked() throws IOException;
    
    
      /** Utility class for executing code with exclusive access. */
      public abstract static class With {
        private Lock lock;
        private long lockWaitTimeout;
    
    
        /** Constructs an executor that will grab the named lock. */
        public With(Lock lock, long lockWaitTimeout) {
          this.lock = lock;
          this.lockWaitTimeout = lockWaitTimeout;
        }
    
        /** Code to execute with exclusive access. */
        protected abstract Object doBody() throws IOException;
    
        /** Calls {@link #doBody} while <i>lock</i> is obtained.  Blocks if lock
         * cannot be obtained immediately.  Retries to obtain lock once per second
         * until it is obtained, or until it has tried ten times. Lock is released when
         * {@link #doBody} exits.
         * @throws LockObtainFailedException if lock could not
         * be obtained
         * @throws IOException if {@link Lock#obtain} throws IOException
         */
        public Object run() throws IOException {
          boolean locked = false;
          try {
             locked = lock.obtain(lockWaitTimeout);
             return doBody();
          } finally {
            if (locked) {
              lock.close();
            }
          }
        }
      }
    
    }

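    The most important method here is obtain(), which tries to acquire the lock. The no-argument obtain() makes a single attempt and returns immediately. obtain(long lockWaitTimeout) retries: if the first attempt fails, it sleeps for LOCK_POLL_INTERVAL milliseconds (1000 by default) and tries again, until it either succeeds or lockWaitTimeout has elapsed, at which point it throws LockObtainFailedException (the exception in the stack trace above). Passing -1 (LOCK_OBTAIN_WAIT_FOREVER) as lockWaitTimeout makes it retry forever. Once obtained, the lock is held until close() is called; it does not expire and does not need to be renewed.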
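    The retry loop in obtain(long) can be tried out without Lucene. The following is a minimal JDK-only sketch, not Lucene code: the class PollingObtainSketch, its WAIT_FOREVER constant and the failingFirst helper are names made up for illustration, and a BooleanSupplier stands in for the single-attempt obtain().

```java
import java.util.function.BooleanSupplier;

/** JDK-only sketch of the "try, sleep, retry" loop in Lock.obtain(long). */
class PollingObtainSketch {

  /** Hypothetical stand-in for Lock.LOCK_OBTAIN_WAIT_FOREVER. */
  static final long WAIT_FOREVER = -1;

  /**
   * Calls tryOnce until it returns true or the timeout is used up.
   * Returns the number of attempts made; throws IllegalStateException on timeout.
   */
  static int obtain(BooleanSupplier tryOnce, long timeoutMs, long pollMs) {
    if (pollMs <= 0 || (timeoutMs < 0 && timeoutMs != WAIT_FOREVER)) {
      throw new IllegalArgumentException("pollMs must be > 0; timeoutMs must be WAIT_FOREVER or >= 0");
    }
    long maxSleeps = timeoutMs / pollMs; // same integer division as maxSleepCount
    long sleeps = 0;
    int attempts = 1;
    boolean locked = tryOnce.getAsBoolean();
    while (!locked) {
      if (timeoutMs != WAIT_FOREVER && sleeps++ >= maxSleeps) {
        throw new IllegalStateException("timed out after " + attempts + " attempts");
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", ie);
      }
      attempts++;
      locked = tryOnce.getAsBoolean();
    }
    return attempts;
  }

  /** Returns a fake lock attempt that fails a fixed number of times, then succeeds. */
  static BooleanSupplier failingFirst(int failures) {
    int[] remaining = {failures};
    return () -> remaining[0]-- <= 0;
  }

  public static void main(String[] args) {
    // Fails twice, succeeds on the third attempt: prints 3
    System.out.println(obtain(failingFirst(2), 100, 10));
    try {
      obtain(() -> false, 30, 10); // never succeeds
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage()); // timed out after 4 attempts
    }
  }
}
```

    With a 30 ms timeout and a 10 ms poll interval the loop sleeps three times, so it makes four attempts before giving up. By the same arithmetic, a timeout equal to the poll interval allows one retry after the initial attempt.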

    The abstract base class LockFactory defines a single abstract method, makeLock, which returns a Lock instance.

    public abstract class LockFactory {
    
      /**
       * Return a new Lock instance identified by lockName.
       * @param lockName name of the lock to be created.
       */
      public abstract Lock makeLock(Directory dir, String lockName);
    
    }
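    To see why a factory method makes other lock implementations easy to add, here is a small JDK-only sketch. Everything in it (LockFactorySketch, SimpleLock, Factory, InMemoryFactory) is a hypothetical name invented for this illustration and not part of Lucene. It keeps lock names in an in-memory set, so it only coordinates threads inside one JVM.

```java
import java.util.HashSet;
import java.util.Set;

class LockFactorySketch {

  /** Minimal lock contract for this sketch (hypothetical, not Lucene's Lock class). */
  interface SimpleLock {
    boolean obtain();
    void close();
  }

  /** The factory method: each subclass decides which kind of lock to build. */
  abstract static class Factory {
    abstract SimpleLock makeLock(String lockName);
  }

  /** One possible implementation: a lock is held while its name is in the set. */
  static final class InMemoryFactory extends Factory {
    private final Set<String> held = new HashSet<>();

    @Override
    SimpleLock makeLock(String lockName) {
      return new SimpleLock() {
        private boolean mine;

        @Override
        public boolean obtain() {
          synchronized (held) {
            if (mine) {
              return false; // this instance already holds the lock
            }
            mine = held.add(lockName); // add() returns false if the name is taken
            return mine;
          }
        }

        @Override
        public void close() {
          synchronized (held) {
            if (mine) {
              held.remove(lockName);
              mine = false;
            }
          }
        }
      };
    }
  }

  /** Runs two locks with the same name against each other and reports each step. */
  static boolean[] demo() {
    Factory factory = new InMemoryFactory();
    SimpleLock a = factory.makeLock("write.lock");
    SimpleLock b = factory.makeLock("write.lock");
    boolean aFirst = a.obtain();  // true: name was free
    boolean bFirst = b.obtain();  // false: a holds it
    a.close();
    boolean bSecond = b.obtain(); // true: released by a
    b.close();
    return new boolean[] {aFirst, bFirst, bSecond};
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(demo())); // [true, false, true]
  }
}
```

    Code that only calls makeLock does not need to know which factory it was given, so a file-based factory could be swapped in without touching the caller.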

    The abstract class FSLockFactory extends LockFactory:

    public abstract class FSLockFactory extends LockFactory {
      
      /** Returns the default locking implementation for this platform.
       * This method currently returns always {@link NativeFSLockFactory}.
       */
      public static final FSLockFactory getDefault() {
        return NativeFSLockFactory.INSTANCE;
      }
    
      @Override
      public final Lock makeLock(Directory dir, String lockName) {
        if (!(dir instanceof FSDirectory)) {
          throw new UnsupportedOperationException(getClass().getSimpleName() + " can only be used with FSDirectory subclasses, got: " + dir);
        }
        return makeFSLock((FSDirectory) dir, lockName);
      }
      
      /** Implement this method to create a lock for a FSDirectory instance. */
      protected abstract Lock makeFSLock(FSDirectory dir, String lockName);
    
    }

    Note this method:

    public static final FSLockFactory getDefault() {
      return NativeFSLockFactory.INSTANCE;
    }

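    By default it returns NativeFSLockFactory, which, like SimpleFSLockFactory, is a concrete implementation. NativeFSLockFactory relies on the FileChannel.tryLock method from NIO. We won't go into it here; interested readers can read the JDK NIO source (it seems Oracle no longer ships the source of the FileChannel implementation class, so you may have to look in the JVM sources).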
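    For readers who want a feel for FileChannel.tryLock without reading the JDK internals, here is a rough JDK-only sketch. It is not Lucene's NativeFSLock: the class NativeLockSketch and its HELD set are assumptions made for this example. The set is there because the JDK documents that tryLock throws OverlappingFileLockException when the same JVM already holds an overlapping lock, and that on some systems closing any channel on a file releases all of the JVM's locks on it, so the sketch never opens a second channel on a file it already holds.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class NativeLockSketch {
  /** Paths locked by this JVM; checked before any channel is opened. */
  private static final Set<Path> HELD = ConcurrentHashMap.newKeySet();

  private final Path lockFile;
  private FileChannel channel;
  private FileLock lock;

  NativeLockSketch(Path lockFile) {
    this.lockFile = lockFile.toAbsolutePath().normalize();
  }

  /** Tries once; returns true only if this object now holds the OS-level lock. */
  synchronized boolean obtain() throws IOException {
    if (lock != null || !HELD.add(lockFile)) {
      return false; // held by this object or by another one in this JVM
    }
    FileChannel ch = null;
    try {
      ch = FileChannel.open(lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
      lock = ch.tryLock(); // null if another process holds the lock
      channel = ch;
    } finally {
      if (lock == null) { // not acquired: undo the bookkeeping
        HELD.remove(lockFile);
        channel = null;
        if (ch != null) {
          ch.close();
        }
      }
    }
    return lock != null;
  }

  /** Releases the OS lock and closes the channel; the lock file itself is left in place. */
  synchronized void close() throws IOException {
    if (lock == null) {
      return;
    }
    try (FileChannel ch = channel) {
      lock.release();
    } finally {
      lock = null;
      channel = null;
      HELD.remove(lockFile);
    }
  }

  /** Two lock objects on one file: {first obtains, second obtains, file remains, second retries}. */
  static boolean[] demo() {
    try {
      Path dir = Files.createTempDirectory("native-lock");
      Path file = dir.resolve("write.lock");
      NativeLockSketch first = new NativeLockSketch(file);
      NativeLockSketch second = new NativeLockSketch(file);
      boolean firstGotIt = first.obtain();
      boolean secondGotIt = second.obtain();
      first.close();
      boolean fileRemains = Files.exists(file);
      boolean secondRetry = second.obtain();
      second.close();
      Files.delete(file);
      Files.delete(dir);
      return new boolean[] {firstGotIt, secondGotIt, fileRemains, secondRetry};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(demo()));
  }
}
```

    Inside a single JVM the second obtain() is refused by the in-memory set; between separate processes it is tryLock returning null that signals the lock is taken. Note that close() releases the lock but does not delete the file.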

    Now for the main subject of this post, SimpleFSLockFactory:

    public final class SimpleFSLockFactory extends FSLockFactory {
    
      /**
       * Singleton instance
       */
      public static final SimpleFSLockFactory INSTANCE = new SimpleFSLockFactory();
      
      private SimpleFSLockFactory() {}
    
      @Override
      protected Lock makeFSLock(FSDirectory dir, String lockName) {
        return new SimpleFSLock(dir.getDirectory(), lockName);
      }
      
      static class SimpleFSLock extends Lock {
    
        Path lockFile;
        Path lockDir;
    
        public SimpleFSLock(Path lockDir, String lockFileName) {
          this.lockDir = lockDir;
          lockFile = lockDir.resolve(lockFileName);
        }
    
        @Override
        public boolean obtain() throws IOException {
          try {
            Files.createDirectories(lockDir);
            Files.createFile(lockFile);
            return true;
          } catch (IOException ioe) {
            // On Windows, on concurrent createNewFile, the 2nd process gets "access denied".
            // In that case, the lock was not acquired successfully, so return false.
            // We record the failure reason here; the obtain with timeout (usually the
            // one calling us) will use this as "root cause" if it fails to get the lock.
            failureReason = ioe;
            return false;
          }
        }
    
        @Override
        public void close() throws LockReleaseFailedException {
          // TODO: weird that clearLock() throws the raw IOException...
          try {
            Files.deleteIfExists(lockFile);
          } catch (Throwable cause) {
            throw new LockReleaseFailedException("failed to delete " + lockFile, cause);
          }
        }
    
        @Override
        public boolean isLocked() {
          return Files.exists(lockFile);
        }
    
        @Override
        public String toString() {
          return "SimpleFSLock@" + lockFile;
        }
      }
    
    }

    SimpleFSLockFactory defines a nested class, SimpleFSLock, that extends Lock. The method to focus on is again obtain(), this time SimpleFSLock's: this is where the file lock is actually implemented.

    Files.createDirectories(lockDir);
    Files.createFile(lockFile);

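    Look at these two lines. createDirectories creates the directory that will contain write.lock, along with any missing parent directories (other file names are possible; write.lock is Lucene's default). createFile then creates the write.lock file itself. This is the clever part: if write.lock already exists, createFile throws an exception. obtain() catches it, records it in failureReason, and returns false, which means SimpleFSLock failed to obtain the lock, normally because another process is writing to the index.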

    In close(), the call Files.deleteIfExists(lockFile) means that write.lock is deleted every time the IndexWriter is closed.
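    The create-then-delete cycle is easy to reproduce with the JDK alone. The sketch below mirrors the three calls from the listing (createDirectories, createFile, deleteIfExists); the class name MarkerFileLockSketch and the demo method are made up for this example, and it runs against a temporary directory instead of a real index.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

class MarkerFileLockSketch {
  private final Path lockDir;
  private final Path lockFile;

  /** Last failure seen by obtain(), playing the role of Lock.failureReason. */
  IOException failureReason;

  MarkerFileLockSketch(Path lockDir, String lockFileName) {
    this.lockDir = lockDir;
    this.lockFile = lockDir.resolve(lockFileName);
  }

  /** Single attempt: succeeds only if this call is the one that creates the file. */
  boolean obtain() {
    try {
      Files.createDirectories(lockDir);
      // Check-and-create is one atomic step; an existing file raises FileAlreadyExistsException.
      Files.createFile(lockFile);
      return true;
    } catch (IOException ioe) {
      failureReason = ioe;
      return false;
    }
  }

  /** Releases the lock by deleting the marker file. */
  void close() throws IOException {
    Files.deleteIfExists(lockFile);
  }

  boolean isLocked() {
    return Files.exists(lockFile);
  }

  /** Returns {first obtains, second obtains, second saw "already exists", locked after close, second retries}. */
  static boolean[] demo() {
    try {
      Path dir = Files.createTempDirectory("simple-lock");
      MarkerFileLockSketch first = new MarkerFileLockSketch(dir, "write.lock");
      MarkerFileLockSketch second = new MarkerFileLockSketch(dir, "write.lock");
      boolean firstGotIt = first.obtain();
      boolean secondGotIt = second.obtain();
      boolean alreadyExists = second.failureReason instanceof FileAlreadyExistsException;
      first.close();
      boolean lockedAfterClose = first.isLocked();
      boolean secondRetry = second.obtain();
      second.close();
      Files.delete(dir);
      return new boolean[] {firstGotIt, secondGotIt, alreadyExists, lockedAfterClose, secondRetry};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(demo())); // [true, false, true, false, true]
  }
}
```

    One consequence of using the file's existence as the lock: if the process dies before close() runs, the file stays behind and every later obtain() fails until someone removes it by hand.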

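    To summarize, SimpleFSLockFactory's locking can be understood informally like this: it creates a write.lock file in the index directory. If the directory already contains write.lock, some other process is already writing to this index, so the lock attempt fails and nothing can be added to the index. When the writer is closed, write.lock is deleted and the lock is released. In other words, if write.lock exists, a process is writing to the index; if it does not, the file is created, the lock is taken, and no other process can write.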

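    This is a neat way to implement a write lock on files. Some readers may wonder why, in their own demo, write.lock is still there after they build an index and call close(). That is because Lucene's default implementation is now NativeFSLockFactory, the one mentioned above that uses NIO to call native methods for locking. With a native lock, what matters is the operating-system lock held on the file, not whether the file exists, so a leftover write.lock is harmless.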

  • Original article: https://www.cnblogs.com/edwinchen/p/4815714.html