  • Notes on tracking down an OOM

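    I used Netty to build a UDP log receiver and started it with the heap settings -Xms256M and -Xmx256M. Right after startup, the process's memory footprint looked normal, at around 250 MB.

    After it had been running for a long time, I noticed that the process's memory usage kept growing and had gone far beyond the heap size I configured. I checked the survivor spaces, Eden, the old generation, and GC activity, and all of it looked normal; heap usage was entirely healthy. I even suspected Metaspace, but it turned out to be using very little and had barely moved since startup. So I went back and reviewed my JVM fundamentals once more.

    So where was the extra memory coming from?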

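    After some reading I found that a Java process's memory is divided into heap memory and off-heap (direct) memory, and off-heap memory lies outside the heap that the JVM's GC manages.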
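    To make the heap/off-heap split concrete, here is a minimal sketch that uses only the JDK. It reads the JVM's "direct" buffer pool before and after allocating a direct buffer. The class and method names are my own, and Netty may allocate direct memory through its own path, in which case this pool will not necessarily reflect Netty's usage.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not part of the original project.
class DirectMemoryProbe {

    /** Bytes currently used by the JVM's "direct" buffer pool, or -1 if the pool is not reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();

        // 16 MiB allocated outside the Java heap; -Xmx does not limit it.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(16 * 1024 * 1024);

        long after = directBytesUsed();
        System.out.println("buffer is direct: " + offHeap.isDirect());
        System.out.println("direct pool grew by " + (after - before) + " bytes");
    }
}
```

    Heap tools such as GC logs only show the first kind of memory, which is why everything looked healthy there while the process kept growing.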

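    And what uses off-heap memory?

    NIO frameworks such as Netty do.

    Doesn't Netty have its own reclamation mechanism?

    It does, but Netty only releases the buffers it manages itself. A buffer that you create yourself while handling a message is outside that scope and has to be released by your own code. With the problem roughly located, I started changing the code and the startup parameters.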

    java -jar -Xms256M -Xmx256M -XX:MaxDirectMemorySize=128M -Dspring.profiles.active=prod log-server.jar

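    -XX:MaxDirectMemorySize=128M caps off-heap memory at 128 MB to keep the process's memory in check. In the code, I also called clear() on the ByteBuf that I had copied by hand. (PS: I later found that this has no effect; I had misunderstood the method. clear() only resets the reader and writer indexes and does not release any memory.)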

    @Component
    public class UDPInboundHandler extends SimpleChannelInboundHandler<DatagramPacket> {
    
        private Logger logger = LoggerFactory.getLogger(UDPInboundHandler.class);
    
        @Autowired
        LogService logService;
    
        @Override
        protected void channelRead0(ChannelHandlerContext ctx, DatagramPacket packet) {
            String remoteAddr = packet.sender().getAddress().getHostAddress();
            ByteBuf buf = packet.copy().content();
            logService.process(buf, remoteAddr);
            buf.clear();
        }
    }
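    Why buf.clear() frees nothing can be seen with the JDK's own java.nio.ByteBuffer, whose clear() is analogous: it resets indexes and leaves the memory allocated and its contents intact. This is a sketch by analogy, not Netty code; Netty's ByteBuf.clear() resets readerIndex and writerIndex rather than position and limit.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustration by analogy with java.nio.ByteBuffer; class name is hypothetical.
class ClearDoesNotFree {

    /** Writes some bytes, calls clear(), and returns what is still readable from the buffer. */
    static String contentAfterClear(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(64);
        buf.put(bytes);

        buf.clear(); // resets position to 0 and limit to capacity; nothing is erased or freed

        byte[] out = new byte[bytes.length];
        buf.get(out); // the old bytes are still there
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(contentAfterClear("log line")); // prints "log line"
    }
}
```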

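    After running like this for a while, memory growth did slow down a lot. Previously it took only half a day to go from 200 MB to 500 MB; now, after a full day of observation, it had only grown to a little over 400 MB. But 256 + 128 = 384 MB, so usage had still exceeded the sum of my configured heap and off-heap limits. (Some excess is expected, since Metaspace, thread stacks, and the code cache also count toward the process footprint.) The code also started throwing the following error: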

    io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 134217728, max: 134217728)
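    The sizes in this message are in bytes. A small sketch that pulls them out and converts them to MiB makes the numbers easier to compare with the -XX:MaxDirectMemorySize setting. The class name and regular expression are my own and only assume the message format shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; it only assumes the message format quoted in this post.
class DirectMemoryMessage {

    private static final Pattern SIZES =
            Pattern.compile("allocate (\\d+) byte\\(s\\).*used: (\\d+), max: (\\d+)");

    /** Returns {requested, used, max} in MiB, parsed from the error text. */
    static long[] sizesInMiB(String message) {
        Matcher m = SIZES.matcher(message);
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognized message: " + message);
        }
        long mib = 1024L * 1024L;
        return new long[] {
            Long.parseLong(m.group(1)) / mib,
            Long.parseLong(m.group(2)) / mib,
            Long.parseLong(m.group(3)) / mib
        };
    }

    public static void main(String[] args) {
        String msg = "failed to allocate 16777216 byte(s) of direct memory "
                + "(used: 134217728, max: 134217728)";
        long[] s = sizesInMiB(msg);
        // prints "requested=16 MiB, used=128 MiB, max=128 MiB"
        System.out.println("requested=" + s[0] + " MiB, used=" + s[1] + " MiB, max=" + s[2] + " MiB");
    }
}
```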

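    134217728 bytes = 128 MB, which means my clear() call had no effect and off-heap memory was completely used up. The OOM errors were flooding the log, but among all the exception output I found this entry: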

    2019-09-25 18:20:00.551 {nioEventLoopGroup-2-1} ERROR io.netty.util.ResourceLeakDetector - LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
    Recent access records: 
    Created at:
        io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
        io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
        io.netty.buffer.UnsafeByteBufUtil.copy(UnsafeByteBufUtil.java:436)
        io.netty.buffer.PooledUnsafeDirectByteBuf.copy(PooledUnsafeDirectByteBuf.java:309)
        io.netty.buffer.AbstractByteBuf.copy(AbstractByteBuf.java:1190)
        io.netty.buffer.WrappedByteBuf.copy(WrappedByteBuf.java:874)
        io.netty.channel.socket.DatagramPacket.copy(DatagramPacket.java:47)
        com.tutorgroup.base.logserver.server.UDPInboundHandler.channelRead0(UDPInboundHandler.java:24)
        com.tutorgroup.base.logserver.server.UDPInboundHandler.channelRead0(UDPInboundHandler.java:13)
        io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
        io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:93)
        io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
        io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
        io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
        io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
        io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
        io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        java.lang.Thread.run(Thread.java:748)

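    ByteBuf.release() had not been called. My codebase is small and only one place uses ByteBuf, so I found the offending code quickly. But in a large project, where you don't know which code is causing the problem, how do you track it down? After some more reading, I changed the startup parameters again: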

    java -jar -Xms256M -Xmx256M -XX:MaxDirectMemorySize=2M -Dio.netty.leakDetection.level=advanced -Dio.netty.leakDetection.maxRecords=10 -Dspring.profiles.active=prod log-server.jar

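    Sure enough, the error showed up again, and this time the report was much more detailed: it pinpoints which ByteBuf leaked, showing both where it was last accessed and where it was created.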

    2019-09-27 10:54:24.442 {nioEventLoopGroup-2-1} ERROR io.netty.util.ResourceLeakDetector - LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
    Recent access records: 
    #1:
        io.netty.buffer.AdvancedLeakAwareByteBuf.readBytes(AdvancedLeakAwareByteBuf.java:496)
        com.tutorgroup.base.logserver.service.LogService.getLogJSONArray(LogService.java:108)
        com.tutorgroup.base.logserver.service.LogService.process(LogService.java:51)
        com.tutorgroup.base.logserver.server.UDPInboundHandler.channelRead0(UDPInboundHandler.java:25)
        com.tutorgroup.base.logserver.server.UDPInboundHandler.channelRead0(UDPInboundHandler.java:13)
        io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
        io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:93)
        io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
        io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
        io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
        io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
        io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
        io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        java.lang.Thread.run(Thread.java:748)
    Created at:
        io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
        io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
        io.netty.buffer.UnsafeByteBufUtil.copy(UnsafeByteBufUtil.java:436)
        io.netty.buffer.UnpooledUnsafeDirectByteBuf.copy(UnpooledUnsafeDirectByteBuf.java:463)
        io.netty.buffer.AbstractByteBuf.copy(AbstractByteBuf.java:1190)
        io.netty.channel.socket.DatagramPacket.copy(DatagramPacket.java:47)
        com.tutorgroup.base.logserver.server.UDPInboundHandler.channelRead0(UDPInboundHandler.java:24)
        com.tutorgroup.base.logserver.server.UDPInboundHandler.channelRead0(UDPInboundHandler.java:13)
        io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
        io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:93)
        io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
        io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
        io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
        io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
        io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
        io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        java.lang.Thread.run(Thread.java:748)
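    How can a report like this know that release() was never called? As far as I understand it, the general idea is to track buffers through weak references and to complain when one is garbage-collected while still marked as unreleased. The sketch below is a simplified JDK-only model of that idea, with hypothetical names throughout; it is not Netty's ResourceLeakDetector.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model for illustration only; not Netty's implementation.
class ToyLeakDetector {

    /** A weak reference that remembers where the tracked object was created. */
    static final class Tracker extends WeakReference<Object> {
        final Throwable createdAt = new Throwable("Created at:");

        Tracker(Object referent, ReferenceQueue<Object> queue) {
            super(referent, queue);
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final Set<Tracker> live = ConcurrentHashMap.newKeySet();

    /** Starts tracking a resource; the caller keeps the tracker alongside the resource. */
    Tracker track(Object resource) {
        Tracker t = new Tracker(resource, queue);
        live.add(t);
        return t;
    }

    /** Called from the resource's release path: a released resource is no longer tracked. */
    void released(Tracker t) {
        live.remove(t);
        t.clear(); // cleared by us, so the collector has nothing to enqueue later
    }

    /** Number of resources that are tracked and not yet released. */
    int unreleasedCount() {
        return live.size();
    }

    /** Reports trackers whose resource was collected without released() being called. */
    int reportLeaks() {
        int leaks = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Tracker t = (Tracker) ref;
            if (live.remove(t)) {
                leaks++;
                System.err.println("LEAK: release() was not called before garbage collection");
                t.createdAt.printStackTrace();
            }
        }
        return leaks;
    }

    public static void main(String[] args) {
        ToyLeakDetector detector = new ToyLeakDetector();
        Tracker ok = detector.track(new byte[1024]);
        detector.released(ok);
        detector.track(new byte[1024]); // never released and immediately unreachable
        System.gc(); // only a hint; a leak is reported only if the collector has actually run
        System.out.println("leaks reported: " + detector.reportLeaks());
    }
}
```

    Because detection depends on the garbage collector noticing the abandoned object, a leak shows up some time after it happens, which fits the way the LEAK entries here only appeared once the service had been running for a while.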

    After looking up how to release a ByteBuf, I changed the code as follows:

    @Component
    public class UDPInboundHandler extends SimpleChannelInboundHandler<DatagramPacket> {
    
        private Logger logger = LoggerFactory.getLogger(UDPInboundHandler.class);
    
        @Autowired
        LogService logService;
    
        @Override
        protected void channelRead0(ChannelHandlerContext ctx, DatagramPacket packet) {
            String remoteAddr = packet.sender().getAddress().getHostAddress();
            // copy() allocates a new buffer that this handler owns and must release itself.
            ByteBuf buf = packet.copy().content();
            try {
                logService.process(buf, remoteAddr);
            } catch (Exception e) {
                logger.error(e.getMessage(), e);
            } finally {
                // Drop our reference so the copied buffer's memory can be reclaimed,
                // even when processing throws. The earlier buf.clear() is gone: it freed nothing.
                ReferenceCountUtil.release(buf);
            }
        }
    }

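    ReferenceCountUtil.release() is how Netty releases a reference-counted object: it decrements the buffer's reference count, and once the count reaches zero the off-heap memory is freed or returned to the pool. With this line added, the problem was completely resolved.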
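    The reference-counting rule behind this fix can be modeled in a few lines of plain Java. The sketch below is a toy model under my own assumptions (hypothetical class and method names, no real memory involved), meant only to show why the release belongs in a finally block.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of a reference-counted buffer; not Netty's ByteBuf.
class CountedBuffer {

    private final AtomicInteger refCnt = new AtomicInteger(1); // a new buffer starts with one reference

    /** Adds a reference for an additional owner, who must call release() later. */
    CountedBuffer retain() {
        int old = refCnt.getAndUpdate(c -> c > 0 ? c + 1 : c);
        if (old <= 0) {
            throw new IllegalStateException("already freed");
        }
        return this;
    }

    /** Drops one reference; returns true if this call freed the buffer. */
    boolean release() {
        int old = refCnt.getAndUpdate(c -> c > 0 ? c - 1 : c);
        if (old <= 0) {
            throw new IllegalStateException("already freed");
        }
        return old == 1;
    }

    int refCnt() {
        return refCnt.get();
    }

    boolean isFreed() {
        return refCnt.get() == 0;
    }

    /** Same try/catch/finally shape as the handler: release even if processing throws. */
    static void processAndRelease(CountedBuffer buf, Runnable work) {
        try {
            work.run();
        } catch (RuntimeException e) {
            System.err.println("processing failed: " + e.getMessage());
        } finally {
            buf.release();
        }
    }

    public static void main(String[] args) {
        CountedBuffer buf = new CountedBuffer();
        processAndRelease(buf, () -> { throw new IllegalStateException("bad log line"); });
        System.out.println("freed after failure: " + buf.isFreed()); // prints "freed after failure: true"
    }
}
```

    Without the finally block, any exception thrown during processing would skip the release, and the count would stay at one forever.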

    References:

    http://static.muyus.com/html/3.html

    https://www.jianshu.com/p/17e72bb01bf1

  • Original post: https://www.cnblogs.com/fqybzhangji/p/11597019.html