RedLock官方文档:https://redis.io/topics/distlock
RedLock由来
在redis的master- slave的集群架构中,就是如果你对某个redis master实例,写入了DISLOCK这种锁key的value,此时会异步复制给对应的master slave实例。
但是,这个过程中一旦发生redis master宕机,主备切换,redis slave变为了redis master。而此时的主从复制没有彻底完成.....
接着就会导致,客户端2来尝试加锁的时候,在新的redis master上完成了加锁,而客户端1也以为自己成功加了锁。
此时就会导致多个客户端对一个分布式锁完成了加锁。(对于业务,可能会造成严重的问题)
所以这个是redis master-slave架构的主从异步复制导致的redis分布式锁的最大缺陷:在redis master实例宕机的时候,可能导致多个客户端同时完成加锁。
所以Redis官方提出了一种新的redis分布式锁的算法:RedLock,用于解决上述缺陷。
RedLock算法
RedLock算法思想:
- 不能只在一个redis实例上创建锁,应该是在多个redis实例上创建锁,n / 2 + 1,必须在大多数redis节点上都成功创建锁,才能算这个整体的RedLock加锁成功,避免说仅仅在一个redis实例上加锁而带来的问题。(多个redis实例没有主从的关系)
官网原文:
In the distributed version of the algorithm we assume we have N Redis masters. Those nodes are totally independent, so we don’t use replication or any other implicit coordination system. We already described how to acquire and release the lock safely in a single instance. We take for granted that the algorithm will use this method to acquire and release the lock in a single instance. In our examples we set N=5, which is a reasonable value, so we need to run 5 Redis masters on different computers or virtual machines in order to ensure that they’ll fail in a mostly independent way.
In order to acquire the lock, the client performs the following operations:
- It gets the current time in milliseconds.
- It tries to acquire the lock in all the N instances sequentially, using the same key name and random value in all the instances. During step 2, when setting the lock in each instance, the client uses a timeout which is small compared to the total lock auto-release time in order to acquire it. For example if the auto-release time is 10 seconds, the timeout could be in the ~ 5-50 milliseconds range. This prevents the client from remaining blocked for a long time trying to talk with a Redis node which is down: if an instance is not available, we should try to talk with the next instance ASAP.
- The client computes how much time elapsed in order to acquire the lock, by subtracting from the current time the timestamp obtained in step 1. If and only if the client was able to acquire the lock in the majority of the instances (at least 3), and the total time elapsed to acquire the lock is less than lock validity time, the lock is considered to be acquired.
- If the lock was acquired, its validity time is considered to be the initial validity time minus the time elapsed, as computed in step 3.
- If the client failed to acquire the lock for some reason (either it was not able to lock N/2+1 instances or the validity time is negative), it will try to unlock all the instances (even the instances it believed it was not able to lock).
大致含义如下:
在算法的分布式版本中,我们假设我们有 N 个 Redis 主节点。 这些节点是完全独立的,因此我们不使用复制或任何其他隐式协调系统。 我们已经描述了如何在单个实例中安全地获取和释放锁。 我们理所当然地认为算法会在单个实例中使用这种方法来获取和释放锁。 在我们的示例中,我们设置了 N=5,这是一个合理的值,因此我们需要在不同的计算机或虚拟机上运行 5 个 Redis 主节点,以确保它们以几乎独立的方式失败。
为了获取锁,客户端执行以下操作:
1.它以毫秒为单位获取当前时间。
2.它尝试顺序获取所有 N 个实例中的锁,在所有实例中使用相同的键名和随机值。 在第 2 步中,当在每个实例中设置锁时,客户端使用一个比总锁自动释放时间更小的超时来获取它。 例如,如果自动释放时间为 10 秒,则超时可能在 ~ 5-50 毫秒范围内。 这可以防止客户端长时间保持阻塞状态,试图与已关闭的 Redis 节点通信:如果实例不可用,我们应该尽快尝试与下一个实例通信。
3.客户端通过从当前时间中减去步骤 1 中获得的时间戳来计算获取锁所用的时间。当且仅当客户端能够在大多数实例(至少 3 个)中获取锁,并且获取锁所用的总时间小于锁有效时间,则认为该锁被获取。
4.如果获得了锁,则其有效时间被认为是初始有效时间减去经过的时间,如步骤 3 中计算的那样。
5.如果客户端由于某种原因获取锁失败(或者它无法锁定 N/2+1 个实例或有效时间为负),它将尝试解锁所有实例(即使是它认为没有锁定的实例)能够锁定)。
RedLock是基于redis实现的分布式锁,它能够保证以下特性:
- 互斥性:在任何时候,只能有一个客户端能够持有锁;避免死锁:
- 当客户端拿到锁后,即使发生了网络分区或者客户端宕机,也不会发生死锁;(利用key的存活时间)
- 容错性:只要多数节点的redis实例正常运行,就能够对外提供服务,加锁或者释放锁;
以sentinel模式架构为例,如下图所示,有sentinel-1,sentinel-2,sentinel-3总计3个sentinel模式集群,如果要获取分布式锁,那么需要向这3个sentinel集群通过EVAL命令执行LUA脚本,需要3/2+1=2,即至少2个sentinel集群响应成功,才算成功的以Redlock算法获取到分布式锁:
java版RedLock的实现:Redisson
代码很简单:
RLock lock1 = redisson1.getLock("lock1");
RLock lock2 = redisson2.getLock("lock2");
RLock lock3 = redisson3.getLock("lock3");
RLock redLock = anyRedisson.getRedLock(lock1, lock2, lock3);
// traditional lock method
redLock.lock();
// or acquire lock and automatically unlock it after 10 seconds
redLock.lock(10, TimeUnit.SECONDS);
// or wait for lock aquisition up to 100 seconds
// and automatically unlock it after 10 seconds
boolean res = redLock.tryLock(100, 10, TimeUnit.SECONDS);
if (res) {
try {
...
} finally {
redLock.unlock();
}
}