zoukankan      html  css  js  c++  java
  • AllJoyn 的JoinSession() 返回timeout问题

    在项目中AllJoyn一直有个问题困扰着我们:client在加入session时调用JoinSession()函数有时会timeout失败。

    注意:是“有时”失败,而有时又运行的很好。这种不确定性问题向来都让人崩溃。

    在AllJoyn官方论坛上也有人提到这个问题,他们开发者说在之前的版本都已经修复了。但之后又有人遇到这种问题,并且说使用异步的JoinSessionAsync()成功几率会高点。但经我们实际测试,JoinSessionAsync()和JoinSession()没什么区别。

    前段时间的突出问题在于:DTV上运行AllJoyn,JoinSession()失败几率非常高。后来经过与AllJoyn的Sample比对代码,发现我们的LinkTimeout被设置为0了,将其改为40这个问题就暂时消失了。

    (LinkTimeout为什么被设置为0?我们都没有映象,于是翻出最早的代码,发现我写的初始版本是20,经同事重构后的版本也是20,但从HQ转一圈回来后就变为0了--重要的是我们谁都没发现这个参数被改为0了。他们为什么要改这个参数?--我们不知道,也不可能知道,因为负责这个模块的人都换了好几个。。。还有笔者更悲催的吗?)

    但最近的测试中这个问题又出现了,经过这么久的折腾我几乎束手无策,不知道该怎么解决这个问题--我都有了去仔细跟读AllJoyn庞大而复杂的源代码的冲动了。

    还好,我发现了其他办法。

    1.启用AllJoyn内部日志。

    在AllJoyn初始化前调用如下代码:

    QCC_UseOSLogging(true);
    QCC_SetLogLevels("ALLJOYN=7;ALL=1");
    QCC_SetDebugLevel("ALL", 7);
    QCC_SetDebugLevel("ICE", 1);
    QCC_SetDebugLevel("IPNS", 1);
    QCC_SetDebugLevel("TIMER", 1);
    关于这代码干啥的就不说了,翻一下AllJoyn相关源码就知道了。

    然后运行程序,在logcat中查看AllJoyn输出的日志。

    经过仔细查看,终于发现了端倪。

    JoinSession() timeout的日志:

    09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::50cc:f8ff:feac:498f
    09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
    09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss 109.123.117.49
    09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
    09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::52cc:f8ff:feac:498f
    09-06 15:26:26.608: D/NETWORK(12712): 1.910 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 1, sockfd = <>)
    09-06 15:26:26.608: D/NETWORK(12712): 1.910 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:195 | Connect(sockfd = 122, remoteAddr = 109.123.117.48, remotePort = 9955)
    09-06 15:26:26.623: I/THREAD(12712): 1.925 DEBUG THREAD lepDisp common/os/posix/Thread.cc:220 | Thread function exited: lepDisp --> 0x0
    09-06 15:26:26.623: D/THREAD(12712): 1.925 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 5aba2b88 : running]
    09-06 15:26:26.623: I/THREAD(12712): 1.926 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5aba2b88] Joining thread [lepDisp - 5aba2b88]
    09-06 15:26:26.623: I/THREAD(12712): 1.926 DEBUG THREAD lepDisp common/os/posix/Thread.cc:435 | Joined thread lepDisp
    09-06 15:26:26.638: W/THREAD(12712): 1.943 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:170 | Thread::~Thread() destroying lepDisp - 0
    09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:319 | Thread::Stop() thread is dead [lepDisp]
    09-06 15:26:26.638: D/THREAD(12712): 1.943 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 0 : not running]
    09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5a798120] Joining thread [lepDisp - 0]
    09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:378 | Thread::Join() thread is dead [lepDisp]
    09-06 15:26:26.638: W/THREAD(12712): 1.943 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:182 | Thread::~Thread() destroyed lepDisp - 0 -- started:29 running:24 joined:5
    09-06 15:26:26.743: D/NETWORK(12712): 2.047 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:557 | RecvFrom(sockfd = 115, remoteAddr = <invalid IP address>, remotePort = 9956, buf = <>, len = 1454, received = <>)
    09-06 15:26:26.743: I/NETWORK(12712): 2.047 DEBUG NETWORK IpNameServiceImpl common/os/posix/Socket.cc:569 | Received 119 bytes, remoteAddr = 109.123.117.48, remotePort = 9956
    09-06 15:26:26.743: W/IPNS(12712): 2.047 HL_DBG IPNS IpNameServiceImpl .../IpNameServiceImpl.cc:3599 | IpNameServiceImpl::Run(): Got IPNS message from "109.123.117.48"
    09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...on/ns/IpNsProtocol.cc:1576 | Header::Deserialize(): IsAt::Deserialize() answer 0
    09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:686 | IsAt::Deserialize()
    09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:733 | IsAt::Deserialize(): G flag 1
    09-06 15:26:26.743: I/NS(12712): 2.049 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:736 | IsAt::Deserialize(): C flag 1
    09-06 15:26:26.743: I/NS(12712): 2.049 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:739 | IsAt::Deserialize(): T flag 1

    ...


    09-06 15:27:46.413: I/NS(12712): 81.719 DEBUG NS IpNameServiceImpl ...on/ns/IpNsProtocol.cc:1034 | IsAt::Deserialize(): StringData::Deserialize() name 0
    09-06 15:27:46.413: I/NS(12712): 81.719 DEBUG NS IpNameServiceImpl ...emon/ns/IpNsProtocol.cc:81 | StringData::Deserialize()
    09-06 15:27:46.418: I/NS(12712): 81.720 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:108 | StringData::Deserialize(): com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362 from buffer
    09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3714 | TCPTransport::FoundCallback::Found(): busAddr = "r4addr=109.123.117.48,r4port=9955"
    09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3792 | TCPTransport::FoundCallback::Found(): newBusAddr = "tcp:r4addr=109.123.117.48,r4port=9955".
    09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3798 | TCPTransport::FoundCallback::Found(): FoundNames(): tcp:r4addr=109.123.117.48,r4port=9955
    09-06 15:27:46.418: D/ALLJOYN_OBJ(12712): 81.721 TRACE ALLJOYN_OBJ IpNameServiceImpl .../daemon/AllJoynObj.cc:3495 | AllJoynObj::FoundNames(busAddr = "tcp:r4addr=109.123.117.48,r4port=9955", guid = "7e08ad9db1299e3a505206188e09b165", names = com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362, ttl = 120)
    09-06 15:27:46.418: D/THREAD(12712): 81.721 TRACE THREAD IpNameServiceImpl common/os/posix/Thread.cc:333 | Thread::Alert() [NameReaper: running]
    09-06 15:27:52.183: I/IFCONFIG(12712): 87.489 DEBUG IFCONFIG IpNameServiceImpl ...posix/IfConfigLinux.cc:563 | IfConfig(): The Linux way
    09-06 15:27:52.193: D/NETWORK(12712): 87.495 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 2, sockfd = <>)
    09-06 15:27:52.193: D/NETWORK(12712): 87.496 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:272 | Bind(sockfd = 115, localAddr = 0.0.0.0, localPort = 9956)
    09-06 15:27:52.198: D/NETWORK(12712): 87.500 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:171 | Socket(addrFamily = 10, type = 2, sockfd = <>)
    09-06 15:27:52.198: D/NETWORK(12712): 87.501 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:272 | Bind(sockfd = 116, localAddr = ::, localPort = 9956)
    09-06 15:27:56.683: I/LOCAL_TRANSPORT(12712): 91.986 DEBUG LOCAL_TRANSPORT replyTimer .../src/LocalTransport.cc:841 | Timed out waiting for METHOD_REPLY with serial 5
    09-06 15:27:56.683: W/ALLJOYN(12712): 91.987 HL_DBG ALLJOYN replyTimer ...ore/src/Message_Gen.cc:988 | MarshalMessage: 80+0 ERROR[5] org.alljoyn.Bus.Timeout
    09-06 15:27:56.683: D/THREAD(12712): 91.987 TRACE THREAD replyTimer common/os/posix/Thread.cc:333 | Thread::Alert() [lepDisp: running]
    09-06 15:27:56.688: W/THREAD(12712): 91.991 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:164 | Thread::Thread() created lepDisp - 0 -- started:29 running:24 joined:5
    09-06 15:27:56.688: D/THREAD(12712): 91.993 TRACE THREAD lepDisp common/os/posix/Thread.cc:301 | Thread::Start() [lepDisp] pid = 5cbd8918
    09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp .../src/LocalTransport.cc:465 | Pushing ERROR[5] org.alljoyn.Bus.Timeout into local endpoint
    09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp .../src/LocalTransport.cc:703 | LocalEndpoint::RemoveReplyHandler for serial=5
    09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp ...src/LocalTransport.cc:1018 | Matched reply for serial #5
    09-06 15:27:56.688: I/THREAD(12712): 91.993 DEBUG THREAD external common/os/posix/Thread.cc:205 | Thread::RunInternal: lepDisp (pid=5cbd8918)
    09-06 15:27:56.688: I/THREAD(12712): 91.994 DEBUG THREAD lepDisp common/os/posix/Thread.cc:216 | Starting thread: lepDisp
    09-06 15:27:56.693: I/ALLJOYN(12712): 91.994 DEBUG ALLJOYN lepDisp ...e/src/Message_Parse.cc:647 | Unmarshaled
    09-06 15:27:56.693: I/ALLJOYN(12712): <message endianness="LITTLE" type="ERROR" version="1" body_len="0" serial="6">
    09-06 15:27:56.693: I/ALLJOYN(12712): <header_fields>
    09-06 15:27:56.693: I/ALLJOYN(12712): <header field="ERROR_NAME">
    09-06 15:27:56.693: I/ALLJOYN(12712): <string>org.alljoyn.Bus.Timeout</string>
    09-06 15:27:56.693: I/ALLJOYN(12712): </header>
    09-06 15:27:56.693: I/ALLJOYN(12712): <header field="REPLY_SERIAL">
    09-06 15:27:56.693: I/ALLJOYN(12712): <uint32>5</uint32>
    09-06 15:27:56.693: I/ALLJOYN(12712): </header>
    09-06 15:27:56.693: I/ALLJOYN(12712): <header field="SENDER">
    09-06 15:27:56.693: I/ALLJOYN(12712): <string>:_Eds_BwS.2</string>
    09-06 15:27:56.693: I/ALLJOYN(12712): </header>
    09-06 15:27:56.693: I/ALLJOYN(12712): </header_fields>
    09-06 15:27:56.693: I/ALLJOYN(12712): </message>
    09-06 15:27:56.693: E/ALLJOYN(12712): 91.996 ****** ERROR ALLJOYN lepDisp .../src/BusAttachment.cc:1477 | org.alljoyn.Bus.JoinSession returned ERROR_MESSAGE (error=org.alljoyn.Bus.Timeout): ER_BUS_REPLY_IS_ERROR_MESSAGE
    09-06 15:27:56.693: D/THREAD(12712): 91.996 TRACE THREAD lepDisp common/os/posix/Thread.cc:333 | Thread::Alert() [lepDisp: running]
    09-06 15:27:56.693: V/JACK(12712): [ContextSharing]AllJoyn::_join_session(): join session:com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362 failed.ER_BUS_REPLY_IS_ERROR_MESSAGE

    ...


    09-06 15:29:31.303: I/RENDEZVOUS_SERVER_CONNECTION(12712): 186.606 DEBUG RENDEZVOUS_SERVER_CONNECTION DiscoveryManager ...tion.cc:72 | RendezvousServerConnection::~RendezvousServerConnection()
    09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.606 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:1158 | Run: Server connect return status = ER_UNABLE_TO_CONNECT_TO_RENDEZVOUS_SERVER
    09-06 15:29:31.303: D/PROXIMITY_SCAN_ENGINE(12712): 186.607 TRACE PROXIMITY_SCAN_ENGINE DiscoveryManager ...ScanEngine.cc:320 | ProximityScanEngine::StopScan() called
    09-06 15:29:31.303: I/PROXIMITY_SCAN_ENGINE(12712): 186.607 DEBUG PROXIMITY_SCAN_ENGINE DiscoveryManager ...ScanEngine.cc:336 | ProximityScanEngine::StopScan() completed
    09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3267 | DiscoveryManager::GetWaitTimeOut()
    09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3273 | DiscoveryManager::GetWaitTimeOut(): timeout= 0xffffffff tNow = 0x2d8ef
    09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3305 | DiscoveryManager::GetWaitTimeOut(): timeout = -1
    09-06 15:29:36.218: E/NETWORK(12712): 191.520 ****** ERROR NETWORK JoinS-1 common/os/posix/Socket.cc:214 | Connecting (sockfd = 122) to 109.123.117.48 9955: 110 - Connection timed out: ER_OS_ERROR
    09-06 15:29:36.218: E/TCP(12712): 191.520 ****** ERROR TCP JoinS-1 ...aemon/TCPTransport.cc:2760 | TCPTransport::Connect(): Failed: ER_OS_ERROR
    09-06 15:29:36.218: I/ALLJOYN(12712): 191.521 DEBUG ALLJOYN JoinS-1 ...core/src/BusEndpoint.cc:45 | Invalidating endpoint type=0

    JoinSesssion() 成功的日志:

    09-06 17:03:39.428: W/TCP(7411): 1.731 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::50cc:f8ff:feac:498f
    09-06 17:03:39.428: W/TCP(7411): 1.731 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
    09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss 109.123.117.49
    09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
    09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::52cc:f8ff:feac:498f
    09-06 17:03:39.428: D/NETWORK(7411): 1.732 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 1, sockfd = <>)
    09-06 17:03:39.428: D/NETWORK(7411): 1.733 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:195 | Connect(sockfd = 120, remoteAddr = 109.123.117.129, remotePort = 9955)
    09-06 17:03:39.438: I/THREAD(7411): 1.742 DEBUG THREAD lepDisp common/os/posix/Thread.cc:220 | Thread function exited: lepDisp --> 0x0
    09-06 17:03:39.438: D/THREAD(7411): 1.742 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 5cbd2f38 : running]
    09-06 17:03:39.438: I/THREAD(7411): 1.743 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5cbd2f38] Joining thread [lepDisp - 5cbd2f38]
    09-06 17:03:39.438: I/THREAD(7411): 1.743 DEBUG THREAD lepDisp common/os/posix/Thread.cc:435 | Joined thread lepDisp
    09-06 17:03:39.448: W/THREAD(7411): 1.754 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:170 | Thread::~Thread() destroying lepDisp - 0
    09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:319 | Thread::Stop() thread is dead [lepDisp]
    09-06 17:03:39.448: D/THREAD(7411): 1.754 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 0 : not running]
    09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5aba3450] Joining thread [lepDisp - 0]
    09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:378 | Thread::Join() thread is dead [lepDisp]
    09-06 17:03:39.448: W/THREAD(7411): 1.754 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:182 | Thread::~Thread() destroyed lepDisp - 0 -- started:28 running:24 joined:4
    09-06 17:03:39.498: D/NETWORK(7411): 1.802 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:467 | Send(sockfd = 120, *buf = <>, len = 1, sent = <>)
    09-06 17:03:39.498: W/THREAD(7411): 1.802 HL_DBG THREAD JoinS-1 common/os/posix/Thread.cc:164 | Thread::Thread() created auth - 0 -- started:28 running:24 joined:4
    09-06 17:03:39.498: I/ALLJOYN(7411): 1.802 DEBUG ALLJOYN JoinS-1 ...re/src/EndpointAuth.cc:370 | EndpointAuth::Establish authMechanisms="ANONYMOUS"
    09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.802 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:699 | SASL Responder mechanisms ANONYMOUS
    09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:275 | Responder starting auth conversation ANONYMOUS
    09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:294 | Current authSet ANONYMOUS
    09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:220 | Initialized authMechanism ANONYMOUS
    09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:224 | New Responder state WAIT_FOR_DATA
    09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:452 | Responder sending AUTH ANONYMOUS
    09-06 17:03:39.498: D/NETWORK(7411): 1.803 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:467 | Send(sockfd = 120, *buf = <>, len = 16, sent = <>)
    09-06 17:03:39.498: I/ALLJOYN(7411): 1.803 DEBUG ALLJOYN JoinS-1 ...re/src/EndpointAuth.cc:433 | Sent AUTH ANONYMOUS
    09-06 17:03:39.498: D/NETWORK(7411): 1.803 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
    09-06 17:03:39.568: D/NETWORK(7411): 1.869 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
    09-06 17:03:39.568: D/NETWORK(7411): 1.870 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
    09-06 17:03:39.568: D/NETWORK(7411): 1.870 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)

    相关代码如下:

    QStatus TCPTransport::Connect(const char* connectSpec, const SessionOpts& opts, BusEndpoint& newEp)
    {
    QCC_DbgHLPrintf(("TCPTransport::Connect(): %s", connectSpec));

    QStatus status;
    bool isConnected = false;

    ...


    SocketFd sockFd = -1;
    status = Socket(QCC_AF_INET, QCC_SOCK_STREAM, sockFd);
    if (status == ER_OK) {
    /* Turn off Nagle */
    status = SetNagle(sockFd, false);
    }

    if (status == ER_OK) {
    /*
    * We got a socket, now tell TCP to connect to the remote address and
    * port.
    */
    status = qcc::Connect(sockFd, ipAddr, port);
    if (status == ER_OK) {
    /*
    * We now have a TCP connection established, but DBus (the wire
    * protocol which we are using) requires that every connection,
    * irrespective of transport, start with a single zero byte. This
    * is so that the Unix-domain socket transport used by DBus can pass
    * SCM_RIGHTS out-of-band when that byte is sent.
    */
    uint8_t nul = 0;
    size_t sent;

    status = Send(sockFd, &nul, 1, sent);
    if (status != ER_OK) {
    QCC_LogError(status, ("TCPTransport::Connect(): Failed to send initial NUL byte"));
    }
    isConnected = true;
    } else {
    QCC_LogError(status, ("TCPTransport::Connect(): Failed"));
    }
    } else {
    QCC_LogError(status, ("TCPTransport::Connect(): qcc::Socket() failed"));
    }

    }

    QStatus Connect(SocketFd sockfd, const IPAddress& remoteAddr, uint16_t remotePort)
    {
    QStatus status = ER_OK;
    int ret;
    struct sockaddr_storage addr;
    socklen_t addrLen = sizeof(addr);

    QCC_DbgTrace(("Connect(sockfd = %d, remoteAddr = %s, remotePort = %hu)",
    sockfd, remoteAddr.ToString().c_str(), remotePort));

    status = MakeSockAddr(remoteAddr, remotePort, &addr, addrLen);
    if (status != ER_OK) {
    return status;
    }

    ret = connect(static_cast<int>(sockfd), reinterpret_cast<struct sockaddr*>(&addr), addrLen);
    if (ret == -1) {
    if ((errno == EINPROGRESS) || (errno == EALREADY)) {
    status = ER_WOULDBLOCK;
    } else if (errno == EISCONN) {
    status = ER_OK;
    } else if (errno == ECONNREFUSED) {
    status = ER_CONN_REFUSED;
    } else {
    status = ER_OS_ERROR;
    QCC_LogError(status, ("Connecting (sockfd = %u) to %s %d: %d - %s", sockfd,
    remoteAddr.ToString().c_str(), remotePort,
    errno, strerror(errno)));
    }
    } else {
    int flags = fcntl(sockfd, F_GETFL, 0);
    ret = fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);

    if (ret == -1) {
    status = ER_OS_ERROR;
    QCC_LogError(status, ("Connect fcntl (sockfd = %u) to O_NONBLOCK: %d - %s", sockfd, errno, strerror(errno)));
    /* Higher level code is responsible for closing the socket */
    }
    }

    return status;
    }

    根据日志做了一个简单的流程对比:

    时间(s) JoinSession() timeout JoinSession()成功
    0
    开始连接
    Connect(sockfd = 122, remoteAddr = 109.123.117.48, remotePort = 9955)
    开始连接

    Connect(sockfd = 120, remoteAddr = 109.123.117.129, remotePort = 9955)
    0
    连接成功,发送数据
    Send(sockfd = 120, *buf = <>, len = 1, sent = <>)
    0
    接收数据
    Recv(sockfd = 120, buf = <>, len = 1, received = <>)
    90
    JoinSession() timeout
    LocalTransport.cc:841 | Timed out waiting for METHOD_REPLY with serial 5
    BusAttachment.cc:1477 | org.alljoyn.Bus.JoinSession returned ERROR_MESSAGE
    (error=org.alljoyn.Bus.Timeout): ER_BUS_REPLY_IS_ERROR_MESSAGE

    190 tcp connect() timeout
    Connecting (sockfd = 122) to 109.123.117.48 9955: 110 - Connection timed out: ER_OS_ERROR

    通过上面对比,可以发现JoinSession() timeout的根源在于tcp的系统调用connect阻塞超时了,即如下代码:

    ret = connect(static_cast<int>(sockfd), reinterpret_cast<struct sockaddr*>(&addr), addrLen);

    找到问题点了,可为什么会timeout呢?根据一般的tcp编程经验,很有可能是网络或者Server端出问题了。

    会有什么问题呢?不知道。

    刚在Android上安装好了tcpdump,下周抓包一看就知道啦!

  • 相关阅读:
    “不裁员”才是公司度过艰难时期的聪明选择
    oracle服务端与客户端字符集不同导致中文乱码解决方案
    [转]ABAP数据库操作系列(5)
    [转]ABAP数据库操作系列(2)
    [转]整理的函数
    [转]ABAP数据库操作系列(6)
    [转]ABAP数据库操作系列(7)
    [转]Authoritycheck
    [转]ABAP Program to Display SAP Icons
    [转]ABAP数据库操作系列(3)
  • 原文地址:https://www.cnblogs.com/chutianyao/p/3312811.html
Copyright © 2011-2022 走看看