Monitor half:
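Sentinel obtains information in two ways:
- The INFO command (sent over the command connection), which fetches information from masters and slaves;
- Subscribing to the hello channel, which delivers the information published by other Sentinels.
I. Establishing connections:
When Sentinel connects to a master or a slave, it creates both a command connection and a subscription connection. When it connects to another Sentinel, it creates only a command connection and no subscription connection.
Why is the subscription connection needed? TODO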
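The connection rule above can be written down as a small function. This is only a sketch: the names linksNeeded, TOY_MASTER, TOY_SLAVE, TOY_SENTINEL, LINK_CMD and LINK_PUBSUB are hypothetical and do not appear in sentinel.c.

```c
#include <assert.h>

/* Hypothetical instance kinds, standing in for the SRI_* flags. */
enum { TOY_MASTER, TOY_SLAVE, TOY_SENTINEL };

/* Hypothetical bit flags for the two kinds of links. */
#define LINK_CMD    (1 << 0)   /* command connection (cc) */
#define LINK_PUBSUB (1 << 1)   /* subscription connection (pc) */

/* Return the set of links to open for an instance of the given kind. */
int linksNeeded(int kind) {
    assert(kind == TOY_MASTER || kind == TOY_SLAVE || kind == TOY_SENTINEL);
    if (kind == TOY_SENTINEL) return LINK_CMD;   /* Sentinels: cc only */
    return LINK_CMD | LINK_PUBSUB;               /* masters and slaves: cc + pc */
}
```

For example, linksNeeded(TOY_SLAVE) has both bits set, while linksNeeded(TOY_SENTINEL) has only LINK_CMD.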
void sentinelHandleRedisInstance(sentinelRedisInstance *ri) {
    /* ========== MONITORING HALF ============ */
    /* Every kind of instance */
    /* Establish the command connection (cc) and the subscription connection (pc) */
    sentinelReconnectInstance(ri);
    /* Periodic operations on the instance: INFO (10s | 1s), PUBLISH (2s), PING (1s) */
    sentinelSendPeriodicCommands(ri);
    /* ============== ACTING HALF ============= */
    // ...
    // ...
}
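- sentinelReconnectInstance: connection setup uses the asynchronous API of hiredis [1]; details to be added later. Setting up the subscription connection issues a SUBSCRIBE command and registers the callback sentinelReceiveHelloMessages. Through the channel, this callback receives messages from the other Sentinels that monitor the same server (master, slave, or both?) and updates the server state accordingly. If the sender is a Sentinel not seen before, an instance is created for it and added to the sentinels dictionary of that server's instance structure.
- sentinelSendPeriodicCommands: the periodic operations, analyzed in the next section.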
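The "add the sender if it is a new Sentinel" step of the hello handling can be illustrated with a toy table. This is a simplified sketch, not the code in sentinelReceiveHelloMessages: the real implementation uses a dictionary inside the instance structure, while toySentinelTable, noteHelloFrom, and the "ip:port" key used here are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_SENTINELS 16
#define ADDR_LEN 64

/* Hypothetical stand-in for the per-server "sentinels" dictionary. */
typedef struct {
    char addr[MAX_SENTINELS][ADDR_LEN];  /* one "ip:port" key per known Sentinel */
    int count;
} toySentinelTable;

/* Record the sender of a hello message.
 * Returns 1 if the Sentinel was new and has been added,
 * 0 if it was already known, -1 if it could not be stored. */
int noteHelloFrom(toySentinelTable *t, const char *addr) {
    assert(t != NULL && addr != NULL);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->addr[i], addr) == 0) return 0;  /* already known */
    }
    if (t->count == MAX_SENTINELS || strlen(addr) >= ADDR_LEN) return -1;
    strcpy(t->addr[t->count++], addr);                /* new Sentinel: add it */
    return 1;
}
```

Calling noteHelloFrom twice with the same address adds it only once, which is the behavior the note above describes for repeated hello messages from an already known Sentinel.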
II. Periodic operations: sentinelSendPeriodicCommands
void sentinelSendPeriodicCommands(sentinelRedisInstance *ri) {
    mstime_t now = mstime();
    mstime_t info_period, ping_period;
    int retval;

    /* Return ASAP if we have already a PING or INFO already pending, or
     * in the case the instance is not properly connected. */
    if (ri->flags & SRI_DISCONNECTED) return;

    /* For INFO, PING, PUBLISH that are not critical commands to send we
     * also have a limit of SENTINEL_MAX_PENDING_COMMANDS. We don't
     * want to use a lot of memory just because a link is not working
     * properly (note that anyway there is a redundant protection about this,
     * that is, the link will be disconnected and reconnected if a long
     * timeout condition is detected. */
    if (ri->pending_commands >= SENTINEL_MAX_PENDING_COMMANDS) return;

    /* If this is a slave of a master in O_DOWN condition we start sending
     * it INFO every second, instead of the usual SENTINEL_INFO_PERIOD
     * period. In this state we want to closely monitor slaves in case they
     * are turned into masters by another Sentinel, or by the sysadmin. */
    if ((ri->flags & SRI_SLAVE) &&
        (ri->master->flags & (SRI_O_DOWN|SRI_FAILOVER_IN_PROGRESS))) {
        /* The instance is a slave and its master is objectively down or has
         * a failover in progress: the local Sentinel sends INFO every 1s. */
        info_period = 1000;
    } else {
        /* Normally INFO is sent every 10s. */
        info_period = SENTINEL_INFO_PERIOD;
    }

    /* We ping instances every time the last received pong is older than
     * the configured 'down-after-milliseconds' time, but every second
     * anyway if 'down-after-milliseconds' is greater than 1 second. */
    ping_period = ri->down_after_period;
    if (ping_period > SENTINEL_PING_PERIOD) ping_period = SENTINEL_PING_PERIOD;

    if ((ri->flags & SRI_SENTINEL) == 0 &&
        (ri->info_refresh == 0 ||
        (now - ri->info_refresh) > info_period)) // 10s or 1s
    {
        /* Send INFO to masters and slaves, not sentinels. */
        /* The instance is not a Sentinel and the INFO period has elapsed: send INFO. */
        retval = redisAsyncCommand(ri->cc,
            sentinelInfoReplyCallback, NULL, "INFO");
        if (retval == REDIS_OK) ri->pending_commands++;
    } else if ((now - ri->last_pong_time) > ping_period) { // 1s
        /* Send PING to all the three kinds of instances. */
        sentinelSendPing(ri);
    } else if ((now - ri->last_pub_time) > SENTINEL_PUBLISH_PERIOD) { // 2s
        /* PUBLISH hello messages to all the three kinds of instances. */
        sentinelSendHello(ri);
    }
}
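- The info_period assignment: note how the INFO period changes, from the usual 10s to 1s for a slave whose master is O_DOWN or has a failover in progress.
- The redisAsyncCommand(..., "INFO") call: sends the INFO command with sentinelInfoReplyCallback as the callback. Through this callback the local Sentinel obtains information about masters and slaves; role changes during failover are also handled in this function. Details later.
- sentinelSendPing: sends the PING command; the callback only updates timestamps, so it is not covered here.
- sentinelSendHello: sends PUBLISH to the __sentinel__:hello channel; the callback only updates timestamps, so it is not covered here.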
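To make the priority order INFO, then PING, then PUBLISH easier to follow, the decision can be isolated from the networking code. The sketch below is a simplified model under stated assumptions: toyInstance, nextPeriodicCommand and the SEND_* and KIND_* constants are hypothetical names, the period values are the ones quoted in the comments above (10s, 1s, 2s), and the SRI_DISCONNECTED and pending_commands checks are left out.

```c
#include <assert.h>
#include <stddef.h>

typedef long long mstime_t;

/* Hypothetical simplified constants; the values follow the notes above. */
enum { KIND_MASTER, KIND_SLAVE, KIND_SENTINEL };
enum { SEND_NONE, SEND_INFO, SEND_PING, SEND_HELLO };
#define INFO_PERIOD_MS    10000
#define PING_PERIOD_MS     1000
#define PUBLISH_PERIOD_MS  2000

/* Hypothetical stand-in for the fields of sentinelRedisInstance used here. */
typedef struct {
    int kind;
    int master_failing;          /* master is O_DOWN or has a failover in progress */
    mstime_t down_after_period;
    mstime_t info_refresh;       /* 0 means INFO was never received */
    mstime_t last_pong_time;
    mstime_t last_pub_time;
} toyInstance;

/* Decide which single command, if any, would be sent at time 'now'. */
int nextPeriodicCommand(const toyInstance *ri, mstime_t now) {
    assert(ri != NULL);

    /* Slaves of a failing master are polled with INFO every second. */
    mstime_t info_period =
        (ri->kind == KIND_SLAVE && ri->master_failing) ? 1000 : INFO_PERIOD_MS;

    /* PING at least once per second, or more often if down-after is shorter. */
    mstime_t ping_period = ri->down_after_period;
    if (ping_period > PING_PERIOD_MS) ping_period = PING_PERIOD_MS;

    if (ri->kind != KIND_SENTINEL &&
        (ri->info_refresh == 0 || now - ri->info_refresh > info_period))
        return SEND_INFO;
    if (now - ri->last_pong_time > ping_period) return SEND_PING;
    if (now - ri->last_pub_time > PUBLISH_PERIOD_MS) return SEND_HELLO;
    return SEND_NONE;
}
```

Because the branches form an if / else-if chain, at most one command is chosen per call; an instance that is due for both INFO and PING gets INFO first and PING on a later call.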