While using curl I ran into an intermittent hang that shows up roughly once a day. With curl's debug logging enabled, the output looks like this:
[TEXT] Trying 10.45.65.22...
[TEXT]TCP_NODELAY set
[TEXT]Connected to 10.45.65.22 (10.45.65.22) port 8080 (#0)
[HEADER_OUT]POST xxx
Host: xxx:8080
Accept: */*
Content-Type: application/x-www-form-urlencoded
Authorization: Basic xxx
Expect: 100-continue
[TEXT]Done waiting for 100-continue
^C[DATA_OUT] ---------------Ctrl+C pressed here after the request had hung for more than 3 minutes (the receive timeout was set to 3 minutes)
[TEXT]Operation timed out after 1095729 milliseconds with 0 bytes received
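Analysis:
1. Inspect the state of the process while it is hung:
Find the process ID: ps -ef|grep xxx
Dump the stack: pstack processID
Trace library calls: ltrace -p processID
Check the TCP port: lsof |grep processID |grep TCP   Port state: CLOSE_WAIT (the peer has already closed the connection, but the local process has not closed its end)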
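2. Cause: by default curl waits only 1 second for the 100-continue response. The problem appears when the server takes longer than 1 second to reply: the log line "Done waiting for 100-continue" marks that wait expiring, after which curl sends the request body without the server's go-ahead.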
3. Solution:
curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L) != CURLE_OK || ---------fail the transfer when the server returns an HTTP error status (>= 400), so the caller can exit instead of continuing
curl_easy_setopt(curl, CURLOPT_EXPECT_100_TIMEOUT_MS, 60000L) != CURLE_OK || --------------raise the wait for the 100-continue reply from the 1000 ms default to 60 seconds