zoukankan      html  css  js  c++  java
  • RocketMQ(五)消息持久化存储源码解析

    一、原理

    1、消息存在哪了?

    消息持久化的地方其实是磁盘上,在如下目录里的commitlog文件夹里。

    /root/store/commitlog

    源码如下:

    // {@link org.apache.rocketmq.store.config.MessageStoreConfig}
    
    // 数据存储根目录
    private String storePathRootDir = System.getProperty("user.home") + File.separator + "store";
    // commitlog目录
    private String storePathCommitLog = System.getProperty("user.home") + File.separator + "store" + File.separator + "commitlog";
    // 每个commitlog文件大小为1GB,超过1GB则创建新的commitlog文件
    private int mappedFileSizeCommitLog = 1024 * 1024 * 1024;

    比如验证下

    [root@iZ2ze84zygpzjw5bfcmh2hZ commitlog]# pwd
    /root/store/commitlog
    [root@iZ2ze84zygpzjw5bfcmh2hZ commitlog]# ll -h
    total 400K
    -rw-r--r-- 1 root root 1.0G Jun 30 18:21 00000000000000000000
    [root@iZ2ze84zygpzjw5bfcmh2hZ commitlog]#

    可以清晰的看到文件大小是1.0G,超过1.0G再写入消息的话会自动创建新的commitlog文件。

    2、关键类解释

    2.1、MappedFile

    对应的是commitlog文件,比如上面的00000000000000000000文件。

    2.2、MappedFileQueue

    MappedFile 所在的文件夹,对 MappedFile 进行封装成文件队列。

    2.3、CommitLog

    针对 MappedFileQueue 的封装使用。

    二、Broker接收消息

    1、调用链

    BrokerStartup.start() -》 BrokerController.start() -》 NettyRemotingServer.start() -》 NettyRemotingServer.prepareSharableHandlers() -》 new NettyServerHandler() -》 NettyRemotingAbstract.processMessageReceived() 
    -》 NettyRemotingAbstract.processRequestCommand() -》 SendMessageProcessor.processRequest()

    2、processRequest

    SendMessageProcessor.processRequest()

    @Override
    public RemotingCommand processRequest(ChannelHandlerContext ctx,
                                          RemotingCommand request) throws RemotingCommandException {
        RemotingCommand response = null;
        try {
            // 调用asyncProcessRequest
            response = asyncProcessRequest(ctx, request).get();
        } catch (InterruptedException | ExecutionException e) {
            log.error("process SendMessage error, request : " + request.toString(), e);
        }
        return response;
    }
    
    
    

    3、asyncProcessRequest

    public CompletableFuture<RemotingCommand> asyncProcessRequest(ChannelHandlerContext ctx,
                                                                      RemotingCommand request) throws RemotingCommandException {
        final SendMessageContext mqtraceContext;
        switch (request.getCode()) {
            // 表示消费者发送的消息,发送者消费失败会重新发回队列进行消息重试
            case RequestCode.CONSUMER_SEND_MSG_BACK:
                return this.asyncConsumerSendMsgBack(ctx, request);
            default:
                // 解析header,也就是我们Producer发送过来的消息都在request里,给他解析到SendMessageRequestHeader对象里去。
                SendMessageRequestHeader requestHeader = parseRequestHeader(request);
                if (requestHeader == null) {
                    return CompletableFuture.completedFuture(null);
                }
                mqtraceContext = buildMsgContext(ctx, requestHeader);
                // 将解析好的参数放到SendMessageContext对象里
                this.executeSendMessageHookBefore(ctx, request, mqtraceContext);
                if (requestHeader.isBatch()) {
                    // 批处理消息用
                    return this.asyncSendBatchMessage(ctx, request, mqtraceContext, requestHeader);
                } else {
                    // 非批处理,我们这里介绍的核心。
                    return this.asyncSendMessage(ctx, request, mqtraceContext, requestHeader);
                }
        }
    }
    
    
    

    4、asyncSendMessage

    private CompletableFuture<RemotingCommand> asyncSendMessage(ChannelHandlerContext ctx, RemotingCommand request,
                                                                    SendMessageContext mqtraceContext,
                                                                    SendMessageRequestHeader requestHeader) {
        final byte[] body = request.getBody();
    
        int queueIdInt = requestHeader.getQueueId();
        TopicConfig topicConfig = this.brokerController.getTopicConfigManager().selectTopicConfig(requestHeader.getTopic());
    
        // 拼凑message对象
        MessageExtBrokerInner msgInner = new MessageExtBrokerInner();
        msgInner.setTopic(requestHeader.getTopic());
        msgInner.setQueueId(queueIdInt);
        msgInner.setBody(body);
        msgInner.setFlag(requestHeader.getFlag());
        MessageAccessor.setProperties(msgInner, MessageDecoder.string2messageProperties(requestHeader.getProperties()));
        msgInner.setPropertiesString(requestHeader.getProperties());
        msgInner.setBornTimestamp(requestHeader.getBornTimestamp());
        msgInner.setBornHost(ctx.channel().remoteAddress());
        msgInner.setStoreHost(this.getStoreHost());
        msgInner.setReconsumeTimes(requestHeader.getReconsumeTimes() == null ? 0 : requestHeader.getReconsumeTimes());
        
        CompletableFuture<PutMessageResult> putMessageResult = null;
        Map<String, String> origProps = MessageDecoder.string2messageProperties(requestHeader.getProperties());
        // 真正接收消息的方法
        putMessageResult = this.brokerController.getMessageStore().asyncPutMessage(msgInner);
        return handlePutMessageResultFuture(putMessageResult, response, request, msgInner, responseHeader, mqtraceContext, ctx, queueIdInt);
    }

    至此我们的消息接收完成了,都封装到了MessageExtBrokerInner对象里。

    三、Broker消息存储(持久化)

    1、asyncPutMessage

    接着上步骤的asyncSendMessage继续看

    @Override
    public CompletableFuture<PutMessageResult> asyncPutMessage(MessageExtBrokerInner msg) {
        CompletableFuture<PutMessageResult> putResultFuture = this.commitLog.asyncPutMessage(msg);
        putResultFuture.thenAccept((result) -> {
            ......
        });
        return putResultFuture;
    }

    2、commitLog.asyncPutMessage

    public CompletableFuture<PutMessageResult> asyncPutMessage(final MessageExtBrokerInner msg) {
        // 获取最后一个文件,MappedFile就是commitlog目录下的那个0000000000文件
        MappedFile mappedFile = this.mappedFileQueue.getLastMappedFile();
        try {
            // 追加数据到commitlog
            result = mappedFile.appendMessage(msg, this.appendMessageCallback);
            switch (result.getStatus()) {
                ......
            }
            // 将内存的数据持久化到磁盘
            CompletableFuture<PutMessageStatus> flushResultFuture = submitFlushRequest(result, putMessageResult, msg);
        }
    }

    3、appendMessagesInner

    public AppendMessageResult appendMessagesInner(final MessageExt messageExt, final AppendMessageCallback cb) {
        // 将消息写到内存
        return cb.doAppend(this.getFileFromOffset(), byteBuffer, this.fileSize - currentPos, (MessageExtBrokerInner) messageExt);
    }

    4、doAppend

    @Override
    public AppendMessageResult doAppend(final long fileFromOffset, final ByteBuffer byteBuffer, final int maxBlank,
                                        final MessageExtBrokerInner msgInner) {
        // Initialization of storage space
        this.resetByteBuffer(msgStoreItemMemory, msgLen);
        // 1 TOTALSIZE
        this.msgStoreItemMemory.putInt(msgLen);
        // 2 MAGICCODE
        this.msgStoreItemMemory.putInt(CommitLog.MESSAGE_MAGIC_CODE);
        // 3 BODYCRC
        this.msgStoreItemMemory.putInt(msgInner.getBodyCRC());
        // 4 QUEUEID
        this.msgStoreItemMemory.putInt(msgInner.getQueueId());
        // 5 FLAG
        this.msgStoreItemMemory.putInt(msgInner.getFlag());
        // 6 QUEUEOFFSET
        this.msgStoreItemMemory.putLong(queueOffset);
        // 7 PHYSICALOFFSET
        this.msgStoreItemMemory.putLong(fileFromOffset + byteBuffer.position());
        // 8 SYSFLAG
        this.msgStoreItemMemory.putInt(msgInner.getSysFlag());
        // 9 BORNTIMESTAMP
        this.msgStoreItemMemory.putLong(msgInner.getBornTimestamp());
        // 10 BORNHOST
        this.resetByteBuffer(bornHostHolder, bornHostLength);
        this.msgStoreItemMemory.put(msgInner.getBornHostBytes(bornHostHolder));
        // 11 STORETIMESTAMP
        this.msgStoreItemMemory.putLong(msgInner.getStoreTimestamp());
        // 12 STOREHOSTADDRESS
        this.resetByteBuffer(storeHostHolder, storeHostLength);
        this.msgStoreItemMemory.put(msgInner.getStoreHostBytes(storeHostHolder));
        // 13 RECONSUMETIMES
        this.msgStoreItemMemory.putInt(msgInner.getReconsumeTimes());
        // 14 Prepared Transaction Offset
        this.msgStoreItemMemory.putLong(msgInner.getPreparedTransactionOffset());
        // 15 BODY
        this.msgStoreItemMemory.putInt(bodyLength);
        if (bodyLength > 0)
            this.msgStoreItemMemory.put(msgInner.getBody());
        // 16 TOPIC
        this.msgStoreItemMemory.put((byte) topicLength);
        this.msgStoreItemMemory.put(topicData);
        // 17 PROPERTIES
        this.msgStoreItemMemory.putShort((short) propertiesLength);
        if (propertiesLength > 0)
            this.msgStoreItemMemory.put(propertiesData);
    
        final long beginTimeMills = CommitLog.this.defaultMessageStore.now();
        // Write messages to the queue buffer
        byteBuffer.put(this.msgStoreItemMemory.array(), 0, msgLen);
        return result;
    }

    这一步其实就已经把消息保存到缓冲区里了,也就是msgStoreItemMemory,这里采取的NIO。

    private final ByteBuffer msgStoreItemMemory;

    5、submitFlushRequest

    再次回到【2、commitLog.asyncPutMessage】的submitFlushRequest方法,因为之前的方法是将数据已经写到ByteBuffer缓冲区里了,下一步也就是我们现在这一步就要刷盘了。

    public CompletableFuture<PutMessageStatus> submitFlushRequest(AppendMessageResult result, PutMessageResult putMessageResult,
                                                                  MessageExt messageExt) {
        // 同步刷盘
        if (FlushDiskType.SYNC_FLUSH == this.defaultMessageStore.getMessageStoreConfig().getFlushDiskType()) {
            final GroupCommitService service = (GroupCommitService) this.flushCommitLogService;
            if (messageExt.isWaitStoreMsgOK()) {
                GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes(),
                                                                    this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
                service.putRequest(request);
                return request.future();
            } else {
                service.wakeup();
                return CompletableFuture.completedFuture(PutMessageStatus.PUT_OK);
            }
        }
        // 异步刷盘
        else {
            if (!this.defaultMessageStore.getMessageStoreConfig().isTransientStorePoolEnable()) {
                
                flushCommitLogService.wakeup();
            } else  {
                commitLogService.wakeup();
            }
            return CompletableFuture.completedFuture(PutMessageStatus.PUT_OK);
        }
    }

    6、异步刷盘

    class FlushRealTimeService extends FlushCommitLogService {
        @Override
        public void run() {
            while (!this.isStopped()) {
                try {
        // 每隔500ms刷一次盘
                    if (flushCommitLogTimed) {
                        Thread.sleep(500);
                    } else {
                        this.waitForRunning(500);
                    }
                    // 调用mappedFileQueue的flush方法
                    CommitLog.this.mappedFileQueue.flush(flushPhysicQueueLeastPages);
                } catch (Throwable e) {
                }
            }
        }
    }

    可看出默认是每隔500毫秒刷一次盘

    7、mappedFileQueue.flush

    public boolean flush(final int flushLeastPages) {
        MappedFile mappedFile = this.findMappedFileByOffset(this.flushedWhere, this.flushedWhere == 0);
        if (mappedFile != null) {
            // 真正的刷盘操作
            int offset = mappedFile.flush(flushLeastPages);
        }
    }

    8、mappedFile.flush

    public int flush(final int flushLeastPages) {
        if (this.isAbleToFlush(flushLeastPages)) {
            try {
                if (writeBuffer != null || this.fileChannel.position() != 0) {
                    // 刷盘   NIO
                    this.fileChannel.force(false);
                } else {
        // 刷盘  NIO
                    this.mappedByteBuffer.force();
                }
            } catch (Throwable e) {
                log.error("Error occurred when force data to disk.", e);
            }
        }
        return this.getFlushedPosition();
    }

    至此已经全部结束。

    四、总结

    面试被问:Broker收到消息后怎么持久化的?

    回答者:有两种方式:同步和异步。一般选择异步,同步效率低,但是更可靠。消息存储大致原理是:

    核心类MappedFile对应的是每个commitlog文件,MappedFileQueue相当于文件夹,管理所有的文件,还有一个管理者CommitLog对象,他负责提供一些操作。具体的是Broker端拿到消息后先将消息、topic、queue等内容存到ByteBuffer里,然后去持久化到commitlog文件中。commitlog文件大小为1G,超出大小会新创建commitlog文件来存储,采取的nio方式。

    五、补充:同步/异步刷盘

    1、关键类

    类名描述刷盘性能
    CommitRealTimeService 异步刷盘 &&开启字节缓冲区 最高
    FlushRealTimeService 异步刷盘&&关闭内存字节缓冲区 较高
    GroupCommitService 同步刷盘,刷完盘才会返回消息写入成功 最低

    2、图解

    3、同步刷盘

    3.1、源码

    // {@link org.apache.rocketmq.store.CommitLog#submitFlushRequest()}
    // Synchronization flush
    if (FlushDiskType.SYNC_FLUSH == this.defaultMessageStore.getMessageStoreConfig().getFlushDiskType()) {
        // 同步刷盘service -> GroupCommitService
        final GroupCommitService service = (GroupCommitService) this.flushCommitLogService;
        if (messageExt.isWaitStoreMsgOK()) {
            // 数据准备
            GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes(),
                                                     this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
            // 将数据对象放到requestsWrite里
            service.putRequest(request);
            return request.future();
        } else {
            service.wakeup();
            return CompletableFuture.completedFuture(PutMessageStatus.PUT_OK);
        }
    }

    putRequest

    public synchronized void putRequest(final GroupCommitRequest request) {
        synchronized (this.requestsWrite) {
            this.requestsWrite.add(request);
        }
        // 这里很关键!!!,给他设置成true。然后计数器-1。下面run方法的时候才会进行交换数据且return
        if (hasNotified.compareAndSet(false, true)) {
            waitPoint.countDown(); // notify
        }
    }

    run

    public void run() {
        while (!this.isStopped()) {
            try {
                // 是同步还是异步的关键方法,也就是说组不阻塞全看这里。
                this.waitForRunning(10);
                // 真正的刷盘逻辑
                this.doCommit();
            } catch (Exception e) {
                CommitLog.log.warn(this.getServiceName() + " service has exception. ", e);
            }
        }
    }

    waitForRunning

    protected volatile AtomicBoolean hasNotified = new AtomicBoolean(false);
    // 其实就是CountDownLatch
    protected final CountDownLatch2 waitPoint = new CountDownLatch2(1);
    
    protected void waitForRunning(long interval) {
        // 如果是true,且给他改成false成功的话,则onWaitEnd()且return,但是默认是false,也就是默认情况下这个if不会进。
        if (hasNotified.compareAndSet(true, false)) {
            this.onWaitEnd();
            return;
        }
    
        //entry to wait
        waitPoint.reset();
    
        try {
            // 等待,默认值是1,也就是waitPoint.countDown()一次后就会激活这里。
            waitPoint.await(interval, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            log.error("Interrupted", e);
        } finally {
            // 给状态值设置成false
            hasNotified.set(false);
            this.onWaitEnd();
        }
    }

    3.2、总结

    总结下同步刷盘的主要流程:

    核心类是GroupCommitService,核心方法 是waitForRunning。

    • 先调用putRequest方法将hasNotified变为true,且进行notify,也就是waitPoint.countDown()
    • 其次是run方法里的waitForRunning()waitForRunning()判断hasNotified是不是true,是true则交换数据然后return掉,也就是不进行await阻塞,直接return。
    • 最后上一步return了,没有阻塞,那么顺理成章的调用doCommit进行真正意义的刷盘。

    4、异步刷盘

    4.1、源码

    核心类是:FlushRealTimeService

    // {@link org.apache.rocketmq.store.CommitLog#submitFlushRequest()}
    // Asynchronous flush
    if (!this.defaultMessageStore.getMessageStoreConfig().isTransientStorePoolEnable()) {
        flushCommitLogService.wakeup();
    } else  {
        commitLogService.wakeup();
    }
    return CompletableFuture.completedFuture(PutMessageStatus.PUT_OK);

    run

    // {@link org.apache.rocketmq.store.CommitLog.FlushRealTimeService#run()}
    
    class FlushRealTimeService extends FlushCommitLogService {
        @Override
        public void run() {
            while (!this.isStopped()) {
                try {
        // 每隔500ms刷一次盘
                    if (flushCommitLogTimed) {
                        Thread.sleep(500);
                    } else {
                        // 根上面同步刷盘调用的是同一个方法,区别在于这里没有将hasNotified变为true,也就是还是默认的false,那么waitForRunning方法内部的第一个判断就不会走,就不会return掉,就会进行下面的await方法阻塞,默认阻塞时间是500毫秒。也就是默认500ms刷一次盘。
                        this.waitForRunning(500);
                    }
                    // 调用mappedFileQueue的flush方法
                    CommitLog.this.mappedFileQueue.flush(flushPhysicQueueLeastPages);
                } catch (Throwable e) {
                }
            }
        }
    }

    4.2、总结

    核心类#方法:FlushRealTimeService#run()

    • 判断flushCommitLogTimed是不是true,默认false,是true则直接sleep(500ms)然后进行mappedFileQueue.flush()刷盘。
    • 若是false,则进入waitForRunning(500),这里是和同步刷盘的区别关键所在,同步刷盘之前将hasNotified变为true了,所以直接一套小连招:return+doCommit了 ,异步这里直接调用的waitForRunning(500),在这之前没任何对hasNotified的操作,所以不会return,而是会继续走下面的waitPoint.await(500, TimeUnit.MILLISECONDS);进行阻塞500毫秒,500毫秒后自动唤醒然后进行flush刷盘。也就是异步刷盘的话默认500ms刷盘一次。

    @Override
    public RemotingCommand processRequest(ChannelHandlerContext ctx,
                                          RemotingCommand request)
     throws RemotingCommandException 
    {
        RemotingCommand response = null;
        try {
            // 调用asyncProcessRequest
            response = asyncProcessRequest(ctx, request).get();
        } catch (InterruptedException | ExecutionException e) {
            log.error("process SendMessage error, request : " + request.toString(), e);
        }
        return response;
    }

  • 相关阅读:
    分布式日志收集系统: Facebook Scribe之日志收集方案
    20111030 19:37 杨辉三角形 (java)
    pku acm 1833 排列
    俞敏洪郑大演讲经典语句
    自己在inode客户端的大量问题(不断更新中)(20120223 21:24 )
    智力测验:硬币问题
    windows up可以更新但是无法上网的一天挣扎
    hdu1754 I Hate It
    acm算法资源网站
    pku3041 Asteroids
  • 原文地址:https://www.cnblogs.com/sxw123/p/13827069.html
Copyright © 2011-2022 走看看