  • "crsctl check crs" command hangs at EVMD check

    
    Pre-11gR2: "crsctl check crs" command hangs at EVMD check (Doc ID 1578875.1)

    APPLIES TO:

    Oracle Database - Enterprise Edition - Version 10.2.0.3 to 11.1.0.7 [Release 10.2 to 11.1]
    Information in this document applies to any platform.

    SYMPTOMS

    In a two-node RAC environment running 11.1.0.7 CRS, the command "crsctl check crs" hangs at the EVMD check, and only on Node 1:

    [oracle@srv03401 bin]$ ./crsctl check crs
    Cluster Synchronization Services appears healthy
    Cluster Ready Services appears healthy
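Because the command never returns once it reaches the EVMD check, it can help to run it under a time limit when the check is scripted. The following is a minimal sketch, assuming GNU coreutils `timeout` is available on the node (older distributions may not ship it); `check_with_timeout` is a hypothetical helper name and the 30-second limit in the usage line is arbitrary.

```shell
#!/bin/sh
# Run a command under a time limit and report if it had to be stopped.
# check_with_timeout is a hypothetical helper, not part of Oracle Clusterware.
check_with_timeout() {
    limit=$1
    shift
    rc=0
    # GNU timeout exits with status 124 when the time limit is reached.
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "TIMED OUT after ${limit}s: $*" >&2
    fi
    return "$rc"
}

# Usage (from the CRS home bin directory):
#   check_with_timeout 30 ./crsctl check crs
```

An exit status of 124 then distinguishes a hang from an ordinary failure of the check.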

    On Node 1, the command "crsctl check crs" was traced with strace:

    # strace -f -t -o /tmp/crschk.trc crsctl check crs

    The generated output file, /tmp/crschk.trc, contains the following:

    28268 11:47:03 execve("./crsctl", ["./crsctl", "check", "crs"], [/* 23 vars */]) = 0
    28268 11:47:03 brk(0)                   = 0x193d2000
    28268 11:47:03 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b35b9436000
    28268 11:47:03 uname({sys="Linux", node="srv03401.metra.com", ...}) = 0
    28268 11:47:03 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    28268 11:47:03 open("/etc/ld.so.cache", O_RDONLY) = 3
    28268 11:47:03 fstat(3, {st_mode=S_IFREG|0644, st_size=92563, ...}) = 0
    28268 11:47:03 mmap(NULL, 92563, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2b35b9437000
    28268 11:47:03 close(3)                 = 0
    28268 11:47:03 open("/lib64/libtermcap.so.2", O_RDONLY) = 3
    28268 11:47:03 read(3, "177ELF2113>1`20300z2"..., 832) = 832
    28268 11:47:03 fstat(3, {st_mode=S_IFREG|0755, st_size=15840, ...}) = 0
    28268 11:47:03 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b35b944e000
    28268 11:47:03 mmap(0x327ac00000, 2108944, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x327ac00000
    28268 11:47:03 mprotect(0x327ac03000, 2093056, PROT_NONE) = 0
    28268 11:47:03 mmap(0x327ae02000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x327ae02000
    28268 11:47:03 close(3)                 = 0
    28268 11:47:03 open("/lib64/libdl.so.2", O_RDONLY) = 3
    ..
    ..
    28268 11:47:03 close(3)                 = 0
    28268 11:47:03 write(1, "Cluster Ready Services appears h"..., 39) = 39
    28268 11:47:03 socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
    28268 11:47:03 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
    28268 11:47:03 bind(3, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
    28268 11:47:03 getsockname(3, {sa_family=AF_INET6, sin6_port=htons(42027), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [140733193388060]) = 0
    28268 11:47:03 getpeername(3, 0x7fff5f19e1e0, [140733193388060]) = -1 ENOTCONN (Transport endpoint is not connected)
    28268 11:47:03 getsockopt(3, SOL_SOCKET, SO_SNDBUF, [5536382933839118336], [4]) = 0
    28268 11:47:03 getsockopt(3, SOL_SOCKET, SO_RCVBUF, [5536382933843050496], [4]) = 0
    28268 11:47:03 fcntl(3, F_SETFD, FD_CLOEXEC) = 0
    28268 11:47:03 fcntl(3, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
    28268 11:47:03 geteuid()                = 700
    28268 11:47:03 times({tms_utime=1, tms_stime=2, tms_cutime=0, tms_cstime=0}) = 7422615891
    28268 11:47:03 socket(PF_FILE, SOCK_STREAM, 0) = 4
    28268 11:47:03 access("/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth", F_OK) = 0
    28268 11:47:03 connect(4, {sa_family=AF_FILE, path="/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth"...}, 110
      

    CAUSE

    The strace output shows the command blocking while connecting to a socket: the final connect() call on the EVMD socket file never returns.

    ========
    28268 11:47:03 socket(PF_FILE, SOCK_STREAM, 0) = 4
    28268 11:47:03 access("/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth", F_OK) = 0
    28268 11:47:03 connect(4, {sa_family=AF_FILE, path="/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth"...}, 110   <<<<<<<
    ========
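    This indicates a problem with the EVMD socket file /var/tmp/.oracle/sSYSTEM.evm.acceptor.auth, or with the evmd.bin process that should be accepting connections on it.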
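The analysis above relies on spotting the one system call that has no return value in the trace. A rough sketch of that filtering step follows; `unfinished_calls` is a hypothetical helper, and it assumes the plain `strace -f -t -o` line format shown in this note, where a completed call ends in `) = <result>`. Signal and exit marker lines, if present, would also be printed.

```shell
#!/bin/sh
# Print trace lines that lack a ") = <result>" suffix, i.e. calls that
# had not returned when the trace stopped.
# unfinished_calls is a hypothetical helper name.
unfinished_calls() {
    grep -v -E '\) += ' "$1"
}

# Usage:
#   unfinished_calls /tmp/crschk.trc
```

Run against the trace in this note, the filter would be expected to leave the connect() line on sSYSTEM.evm.acceptor.auth.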

    SOLUTION

    Find the PID of the evmd.bin process and kill it:

    $ ps -ef | egrep 'd.bin'

    oracle   21046 21045  0  2012 ?        00:07:46 /u01/app/ract/crs/bin/evmd.bin
    root     21054 15845  0  2012 ?        11:34:47 /u01/app/ract/crs/bin/crsd.bin reboot
    oracle   22072 21453  0  2012 ?        05:44:50 /u01/app/ract/crs/bin/ocssd.bin
    root     22135     1  0  2012 ?        00:00:00 /u01/app/ract/crs/bin/oclskd.bin
    oracle   22410     1  0  2012 ?        00:00:00 /u01/app/ract/crs/bin/oclskd.bin
    oracle   29834 27854  0 13:22 pts/8    00:00:00 egrep d.bin

    $ kill -9 21046

    After the evmd.bin process is killed, the command "crsctl check crs" returns its complete output without hanging:

    [oracle@srv03401 bin]$ ./crsctl check crs

    CSS appears healthy
    CRS appears healthy
    EVM appears healthy
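The PID lookup in the solution can be scripted so that only the evmd.bin line is picked out of the process listing. This is a sketch under the assumption that the input is in the `ps -ef` layout shown above, with the PID in the second column and the command path in the eighth; `pid_from_ps` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `ps -ef` output on stdin and print the PID (2nd column) of every
# line whose command basename (8th column) equals the given name.
# pid_from_ps is a hypothetical helper name.
pid_from_ps() {
    awk -v name="$1" '{
        n = split($8, parts, "/")
        if (parts[n] == name) print $2
    }'
}

# Usage:
#   ps -ef | pid_from_ps evmd.bin
```

Matching on the command basename rather than a substring keeps the egrep process itself and the other *d.bin daemons out of the result.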

  • Original article: https://www.cnblogs.com/liguangsunls/p/6725040.html