zoukankan      html  css  js  c++  java
  • 大多数人可能都不会使用socketTimeout,看了底层才知道一直都做错了

      前几天一个机房网络抖动,引发了很多对外请求的超时问题,在发生问题排查日志的时候,发现了这么一个现象,httpclient我们的请求超时时间并没有按照我们的设置报超时异常

    我们的大概配置如下:

    RequestConfig requestConfig = RequestConfig.custom()
                    .setConnectTimeout(1000)
                    .setSocketTimeout(2000)
                    .setConnectionRequestTimeout(1000)
                    .build();

         但实际却发现很多请求超时时间都达到了10几秒甚至有的二十几秒,大大超过了我们的预期时间,决定通过跟踪源码一探究竟:

         原来http读取网络数据的时候是其实是使用的BufferedReader类,而我们知道java的io类其实都是对基本输入流的装饰,其底层其实是利用的SocketInputStream来读取数据,一路代码跟踪,我们跟踪到了这个方法

    int read(byte b[], int off, int length, int timeout) throws IOException {
            int n = 0;
    
            // EOF already encountered
            if (eof) {
                return -1;
            }
    
            // connection reset
            if (impl.isConnectionReset()) {
                throw new SocketException("Connection reset");
            }
    
            // bounds check
            if (length <= 0 || off < 0 || off + length > b.length) {
                if (length == 0) {
                    return 0;
                }
                throw new ArrayIndexOutOfBoundsException();
            }
    
            boolean gotReset = false;
    
            Object traceContext = IoTrace.socketReadBegin();
            // acquire file descriptor and do the read
            FileDescriptor fd = impl.acquireFD();
            try {
                n = socketRead0(fd, b, off, length, timeout);
                if (n > 0) {
                    return n;
                }
            } catch (ConnectionResetException rstExc) {
                gotReset = true;
            } finally {
                impl.releaseFD();
                IoTrace.socketReadEnd(traceContext, impl.address, impl.port,
                                      timeout, n > 0 ? n : 0);
            }
    
            /*
             * We receive a "connection reset" but there may be bytes still
             * buffered on the socket
             */
            if (gotReset) {
                traceContext = IoTrace.socketReadBegin();
                impl.setConnectionResetPending();
                impl.acquireFD();
                try {
                    n = socketRead0(fd, b, off, length, timeout);
                    if (n > 0) {
                        return n;
                    }
                } catch (ConnectionResetException rstExc) {
                } finally {
                    impl.releaseFD();
                    IoTrace.socketReadEnd(traceContext, impl.address, impl.port,
                                          timeout, n > 0 ? n : 0);
                }
            }
    
            /*
             * If we get here we are at EOF, the socket has been closed,
             * or the connection has been reset.
             */
            if (impl.isClosedOrPending()) {
                throw new SocketException("Socket closed");
            }
            if (impl.isConnectionResetPending()) {
                impl.setConnectionReset();
            }
            if (impl.isConnectionReset()) {
                throw new SocketException("Connection reset");
            }
            eof = true;
            return -1;
        }

    这个方法的核心其实就是 socketRead0(fd, b, off, length, timeout)这个方法的调用,而这个方法是这样的:

    private native int socketRead0(FileDescriptor fd,
                                       byte b[], int off, int len,
                                       int timeout)
            throws IOException;

    这个是native方法,通过下载openjdk1.8源码,我们在openjdkjdksrcsolaris ativejava et的目录下找到了相关实现,在SocketInputStream.c文件里,代码如下:

    Java_java_net_SocketInputStream_socketRead0(JNIEnv *env, jobject this,
                                                jobject fdObj, jbyteArray data,
                                                jint off, jint len, jint timeout)
    {
        char BUF[MAX_BUFFER_LEN];
        char *bufP;
        jint fd, nread;
    
        if (IS_NULL(fdObj)) {
            /* shouldn't this be a NullPointerException? -br */
            JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException",
                            "Socket closed");
            return -1;
        } else {
            fd = (*env)->GetIntField(env, fdObj, IO_fd_fdID);
            /* Bug 4086704 - If the Socket associated with this file descriptor
             * was closed (sysCloseFD), then the file descriptor is set to -1.
             */
            if (fd == -1) {
                JNU_ThrowByName(env, "java/net/SocketException", "Socket closed");
                return -1;
            }
        }
    
        /*
         * If the read is greater than our stack allocated buffer then
         * we allocate from the heap (up to a limit)
         */
        if (len > MAX_BUFFER_LEN) {
            if (len > MAX_HEAP_BUFFER_LEN) {
                len = MAX_HEAP_BUFFER_LEN;
            }
            bufP = (char *)malloc((size_t)len);
            if (bufP == NULL) {
                bufP = BUF;
                len = MAX_BUFFER_LEN;
            }
        } else {
            bufP = BUF;
        }
    
        if (timeout) {
            nread = NET_Timeout(fd, timeout);
            if (nread <= 0) {
                if (nread == 0) {
                    JNU_ThrowByName(env, JNU_JAVANETPKG "SocketTimeoutException",
                                "Read timed out");
                } else if (nread == JVM_IO_ERR) {
                    if (errno == EBADF) {
                         JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException", "Socket closed");
                     } else if (errno == ENOMEM) {
                         JNU_ThrowOutOfMemoryError(env, "NET_Timeout native heap allocation failed");
                     } else {
                         NET_ThrowByNameWithLastError(env, JNU_JAVANETPKG "SocketException",
                                                      "select/poll failed");
                     }
                } else if (nread == JVM_IO_INTR) {
                    JNU_ThrowByName(env, JNU_JAVAIOPKG "InterruptedIOException",
                                "Operation interrupted");
                }
                if (bufP != BUF) {
                    free(bufP);
                }
                return -1;
            }
        }
    
        nread = NET_Read(fd, bufP, len);
    
        if (nread <= 0) {
            if (nread < 0) {
    
                switch (errno) {
                    case ECONNRESET:
                    case EPIPE:
                        JNU_ThrowByName(env, "sun/net/ConnectionResetException",
                            "Connection reset");
                        break;
    
                    case EBADF:
                        JNU_ThrowByName(env, JNU_JAVANETPKG "SocketException",
                            "Socket closed");
                        break;
    
                    case EINTR:
                         JNU_ThrowByName(env, JNU_JAVAIOPKG "InterruptedIOException",
                               "Operation interrupted");
                         break;
    
                    default:
                        NET_ThrowByNameWithLastError(env,
                            JNU_JAVANETPKG "SocketException", "Read failed");
                }
            }
        } else {
            (*env)->SetByteArrayRegion(env, data, off, nread, (jbyte *)bufP);
        }
    
        if (bufP != BUF) {
            free(bufP);
        }
        return nread;
    }

    通过代码我们可以知道,数据的读取是通过NET_Timeout (fd, timeout)来实现的,我们继续跟踪代码,在linux_close.c文件中,发现了NET_Timeout的实现:

    int NET_Timeout(int s, long timeout) {
        long prevtime = 0, newtime;
        struct timeval t;
        fdEntry_t *fdEntry = getFdEntry(s);
    
        /*
         * Check that fd hasn't been closed.
         */
        if (fdEntry == NULL) {
            errno = EBADF;
            return -1;
        }
    
        /*
         * Pick up current time as may need to adjust timeout
         */
        if (timeout > 0) {
            gettimeofday(&t, NULL);
            prevtime = t.tv_sec * 1000  +  t.tv_usec / 1000;
        }
    
        for(;;) {
            struct pollfd pfd;
            int rv;
            threadEntry_t self;
    
            /*
             * Poll the fd. If interrupted by our wakeup signal
             * errno will be set to EBADF.
             */
            pfd.fd = s;
            pfd.events = POLLIN | POLLERR;
    
            startOp(fdEntry, &self);
            rv = poll(&pfd, 1, timeout);
            endOp(fdEntry, &self);
    
            /*
             * If interrupted then adjust timeout. If timeout
             * has expired return 0 (indicating timeout expired).
             */
            if (rv < 0 && errno == EINTR) {
                if (timeout > 0) {
                    gettimeofday(&t, NULL);
                    newtime = t.tv_sec * 1000  +  t.tv_usec / 1000;
                    timeout -= newtime - prevtime;
                    if (timeout <= 0) {
                        return 0;
                    }
                    prevtime = newtime;
                }
            } else {
                return rv;
            }
    
        }
    }

    代码中的关键点在 poll(&pfd, 1, timeout);poll是linux中的字符设备驱动中的一个函数,作用是把当前的文件指针挂到设备内部定义的等待

    这样就很好理解了,其实这个时间是我两次读取数据之间的最长阻塞时间,如果我在网络抖动的情况下,我每次2秒之内返回一部分数据,这样我就一直不会超时了,为了验证我们的理解写了test,代码如下,一个controller,用来接受http请求:

    @org.springframework.stereotype.Controller
    @RequestMapping("/hello")
    public class Controller {
        @RequestMapping("/test")
        public void tets(HttpServletRequest request ,HttpServletResponse response) throws IOException, InterruptedException {
            System.out.println("I'm coming");
            PrintWriter writer = response.getWriter();
            while (true){
                writer.print("ha ha ha");
                writer.flush();
                Thread.sleep(2000);
                System.out.println("I'm ha ha ha");
            }
        }
    }

    这个代码就是每隔2s发送一条数据,循环发送,模拟网络不好的时候,收到的数据断断续续,再来一个test用来发送请求:

     @Test
        public void tetsHttpClientHttp() throws IOException {
            RequestConfig requestConfig = RequestConfig.custom()
                    .setConnectTimeout(1000)
                    .setSocketTimeout(3000)
                    .setConnectionRequestTimeout(1000)
                    .build();
    
            CloseableHttpClient httpClient = HttpClientBuilder.create().setDefaultRequestConfig(requestConfig).build();
            // 创建Get请求
            HttpGet httpGet = new HttpGet("http://127.0.0.1:8080/hello/test");
            CloseableHttpResponse response =httpClient.execute(httpGet);
            HttpEntity responseEntity = response.getEntity();
            if (responseEntity != null) {
                System.out.println("响应内容为:" + EntityUtils.toString(responseEntity));
            }
        }

    服务端结果如下:

    客户端结果如下:

    程序并没有如期抛出异常,和我们预想的一样,而当我们修改socketTimeout为1000时,经验证可以抛出java.net.SocketTimeoutException: Read timed out 异常

    为此,为了更准确控制时间,我们需要自己实现超时机制:

    ExecutorService executor = Executors.newFixedThreadPool(1);
            Callable<String> callable = new Callable<String>() {
                @Override
                public String call() throws Exception {
                    RequestConfig requestConfig = RequestConfig.custom()
                            .setConnectTimeout(1000)
                            .setSocketTimeout(3000)
                            .setConnectionRequestTimeout(1000)
                            .build();
    
                    CloseableHttpClient httpClient = HttpClientBuilder.create().setDefaultRequestConfig(requestConfig).build();
                    // 创建Get请求
                    HttpGet httpGet = new HttpGet("http://127.0.0.1:8080/hello/test");
                    CloseableHttpResponse response =httpClient.execute(httpGet);
                    HttpEntity responseEntity = response.getEntity();
                    return EntityUtils.toString(responseEntity);
                }
            };
            Future<String> future = executor.submit(callable);
            System.out.print(future.get(5,TimeUnit.SECONDS));

    这样就可以避免这种情况,在请求线程超时时抛出 java.util.concurrent.TimeoutException避免长时间占住业务线程影响我们的服务,当然这只是个例子,现实我们可能还要考虑线程数,拒绝策略等情况。

  • 相关阅读:
    漫谈AOP开发之初探AOP及AspectJ的用法
    Spring AOP那些学术概念—通知、增强处理连接点(JoinPoint)切面(Aspect)
    深入理解Java虚拟机:OutOfMemory实战
    Java内存泄漏分析与解决方案
    漂亮回答面试官struts2的原理
    [小知识]不显示没有内容的UITableViewCell
    initWithCoder: 与initWithFrame:
    pageControl设置不居中显示,居左或居右
    Xcode中报错或警告信息整理,持续更新...
    [小技巧]阀值的使用
  • 原文地址:https://www.cnblogs.com/chengxiansheng/p/13232686.html
Copyright © 2011-2022 走看看