First, let's clear up a common misconception
Can $upstream_response_time only be used when an upstream block is configured?
Answer: no.
Example:
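A minimal sketch of such a setup, assuming a hypothetical backend on 127.0.0.1:9000 and a hypothetical format name and log path. Here proxy_pass points straight at an address, no upstream block is defined anywhere, and $upstream_response_time is still populated for proxied requests:

```nginx
# Hypothetical backend address, format name, and log path; adjust for your setup.
events {}

http {
    # $upstream_response_time is set by the proxying itself,
    # even though no upstream {} block is defined in this file.
    log_format timing '$remote_addr "$request" $status '
                      'rt=$request_time urt=$upstream_response_time';

    server {
        listen 8080;
        access_log /var/log/nginx/timing.log timing;

        location / {
            # Direct address, no named upstream group.
            proxy_pass http://127.0.0.1:9000;
        }
    }
}
```

For requests that never reach a backend (static files, for instance), the variable is simply logged as "-".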
request_time
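Official description: request processing time in seconds with a milliseconds resolution; time elapsed between the first bytes were read from the client and the log write after the last bytes were sent to the client.
In other words, it is the time from receiving the first byte of the client's request until the response has been completely sent. $request_time therefore covers the time to receive the request data from the client, the time the backend takes to respond, and the time to send the response data to the client (time spent writing the log is not included).
Official documentation: http://nginx.org/en/docs/http/ngx_http_log_module.html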
upstream_response_time
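Official description: keeps time spent on receiving the response from the upstream server; the time is kept in seconds with millisecond resolution. Times of several responses are separated by commas and colons like addresses in the $upstream_addr variable.
In other words, it is the time from when Nginx starts establishing a connection to the backend until it has received all the data and closed the connection.
As the descriptions above show, $request_time is normally at least as large as $upstream_response_time. The gap grows when the client submits a large amount of data via POST or when the response body is large. Poor network conditions on the client side inflate $request_time even further.
Official documentation: http://nginx.org/en/docs/http/ngx_http_upstream_module.html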
“other” times
Besides the commonly used request_time and upstream_response_time described above, newer Nginx versions break down the time spent in each stage of request processing further:
$upstream_connect_time(1.9.1):
keeps time spent on establishing a connection with the upstream server (1.9.1); the time is kept in seconds with millisecond resolution. In case of SSL, includes time spent on handshake. Times of several connections are separated by commas and colons like addresses in the $upstream_addr variable.
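The time spent establishing a connection with the backend server. If the connection to the backend uses an encrypted protocol, this time includes the handshake.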
$upstream_header_time(1.7.10):
keeps time spent on receiving the response header from the upstream server (1.7.10); the time is kept in seconds with millisecond resolution. Times of several responses are separated by commas and colons like addresses in the $upstream_addr variable.
The time spent receiving the response header from the backend server.
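To see these finer-grained timings side by side, a log format along the following lines could be used. This is only a sketch: the format name, field prefixes, and log path are hypothetical, and the variables are the ones quoted above.

```nginx
# Goes in the http {} context.
# Requires nginx 1.9.1 or later because of $upstream_connect_time.
# Format name "upstream_timing" and the log path are hypothetical.
log_format upstream_timing '$remote_addr [$time_local] "$request" $status '
                           'rt=$request_time '
                           'uct=$upstream_connect_time '
                           'uht=$upstream_header_time '
                           'urt=$upstream_response_time '
                           'ua=$upstream_addr';

access_log /var/log/nginx/upstream_timing.log upstream_timing;
```

When a request is retried on several backends, each upstream field holds several values separated by commas and colons, in the same order as the addresses in $upstream_addr.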
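The discussion under the question "Why is request_time much larger than upstream_response_time in nginx access.log?" confirms this as well:
Filling in the whole process, the phases are:
[1 client request][2 Nginx connects to the backend][3 backend sends the response][4 Nginx receives the response][5 Nginx closes the backend connection]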
upstream_response_time
upstream_response_time is then: 2+3+4+5
In general, however, the time spent on [5 Nginx closes the backend connection] can be treated as close to 0,
so upstream_response_time is effectively: 2+3+4
request_time
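request_time is: 1+2+3+4
The difference between the two is the time spent on [1 client request]. (In this simplified model, time spent sending the response back to a slow client also falls on the client side and counts toward request_time only.)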
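Problem analysis
Common causes of the gap:
- Poor network conditions on the client side
- A large amount of data being transferred
- With POST requests, Nginx by default buffers the request body first before passing it to the backend
All of this time is attributed to [1 client request].
This explains why request_time can be larger than upstream_response_time.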
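Summary
Client-side conditions vary widely and cannot be controlled, so they should not be part of testing and tuning.
upstream_response_time deserves more attention. In practice, if you want to find out which requests are slow, remember to add $upstream_response_time to log_format in the configuration file.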