  • Problems when running nohup through pssh

    1. Suppose there is a shell script, a.sh:

      #!/bin/bash
      #/home/test/a.sh
      i=0
      while [ $i -lt 2 ]
      do
      sleep 70
      echo 'good'
      let i++
      done

    2. pssh -H "host1 host2" "nohup /home/test/a.sh &"
      This reports the following errors:
      [1] 21:26:32 [FAILURE] host1 Timed out, Killed by signal 9
      [2] 21:26:32 [FAILURE] host2 Timed out, Killed by signal 9
      Connecting to host1 and host2 with ssh shows that
      a.sh is no longer running.

    Fix:
    pssh -H "host1 host2" "nohup /home/test/a.sh &>> /home/test/nohup.out &"

    What follows is my analysis of the cause. My understanding is limited, so it may contain mistakes; please bear with me.

    3. Root cause analysis

    3.1 The pssh side

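    Reading the pssh source shows that pssh calls subprocess.Popen to run the ssh command and kills it automatically once it exceeds the default timeout (60s). The error "[1] 21:26:32 [FAILURE] host1 Timed out, Killed by signal 9" is therefore caused by this pssh setting. Since a.sh sleeps 70s per iteration, an ssh session that waits for it always outlives the timeout.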

    _DEFAULT_TIMEOUT = 60
    def _kill(self):
        """Signals the process to terminate."""
        if self.proc:
            try:
                os.kill(-self.proc.pid, signal.SIGKILL)
            except OSError:
                # If the kill fails, then just assume the process is dead.
                pass
            self.killed = True
    
    def timedout(self):
        """Kills the process and registers a timeout error."""
        if not self.killed:
            self._kill()
            self.failures.append('Timed out')
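
The timeout-and-kill behaviour above can be sketched in isolation. This is a minimal illustration, not pssh's actual code: the function name run_with_timeout and the polling loop are hypothetical, and starting the child in its own session (so that the negative pid in os.kill addresses the whole process group) is an assumption about how such a kill is normally made to work.

```python
import os
import signal
import subprocess
import time


def run_with_timeout(argv, timeout):
    """Run argv in its own process group; SIGKILL the group on timeout.

    Hypothetical helper for illustration. Returns (returncode, timed_out).
    """
    # start_new_session=True makes the child a process-group leader, so
    # os.kill(-pid, ...) reaches the child and everything it spawned.
    proc = subprocess.Popen(argv, start_new_session=True)
    deadline = time.monotonic() + timeout
    while proc.poll() is None:
        if time.monotonic() >= deadline:
            try:
                os.kill(-proc.pid, signal.SIGKILL)
            except OSError:
                pass  # the process is already gone
            proc.wait()
            return proc.returncode, True
        time.sleep(0.05)
    return proc.returncode, False


if __name__ == "__main__":
    print(run_with_timeout(["sleep", "3"], timeout=1))  # (-9, True)
    print(run_with_timeout(["true"], timeout=1))        # (0, False)
```

A return code of -9 is how Python reports "killed by signal 9", which matches the "Killed by signal 9" text in the pssh failure message.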
    

    3.2 The ssh side

    3.2.1 Why doesn't ssh return immediately after running "nohup /home/test/a.sh &"?

    As the following form shows, redirecting the output lets ssh return immediately.

    ssh host "(command 1; command 2; ...) &>/dev/null &"

    3.2.2 Why does redirecting the output let ssh return?

    According to an answer on Stack Overflow, the cause is a race condition. The details are quoted below under Race Condition Details.

    Race Condition Details
    As an example, let's take the simple case of:
    ssh server cat foo.txt
    This should result in the entire contents of the file foo.txt coming back to the client — but in fact, it may not. Consider the following sequence of events:
    The SSH connection is set up; sshd starts the target account's shell as shell -c "cat foo.txt" in a child process, reading the shell's stdout and sending the data over the SSH connection. sshd is waiting for the shell to exit.
    The shell, in turn, starts cat foo.txt in a child process, and waits for it to exit. The file data from foo.txt which cat writes to its stdout, however, does not pass through the shell process on its way to sshd. cat inherits its stdout file descriptor (fd) from its parent process, the shell; that fd is a direct reference to the pipe connecting the shell's stdout to sshd.
    cat writes the last chunk of data from foo.txt, and exits; the data is passed to the kernel via the write system call, and is waiting in the pipe buffer to be read by sshd. The shell, which was waiting on the cat process, exits, and then sshd in turn exits, closing the SSH connection. However, there is a race condition here: through the vagaries of process scheduling, it is possible that sshd will receive and act on the SIGCHLD notifying it of the shell's exit, before it reads the last chunk of data from the pipe. If so, then it misses that data.
    This sequence of events can, for example, cause file truncation when using scp.
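
The effect of an inherited stdout can be reproduced locally without ssh. The sketch below is only an analogue under stated assumptions: the Python process plays the role of the side that reads the remote command's output, sh plays the remote shell, and seconds_until_eof is a hypothetical helper. It measures how long the reader waits for end-of-file on the pipe.

```python
import subprocess
import time


def seconds_until_eof(shell_command):
    """Run shell_command with stdout on a pipe and time how long EOF takes.

    Hypothetical helper for illustration only.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        ["sh", "-c", shell_command],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
    )
    # read() returns only once every process holding the write end of the
    # pipe has closed it, not merely when the shell itself has exited.
    proc.stdout.read()
    proc.wait()
    return time.monotonic() - start


if __name__ == "__main__":
    inherited = seconds_until_eof("sleep 2 &")
    redirected = seconds_until_eof("sleep 2 >/dev/null 2>&1 &")
    print(f"background job inherits the pipe: {inherited:.1f}s")
    print(f"background job redirected:        {redirected:.1f}s")
```

In the first call the shell exits at once, but the background sleep still holds the pipe, so the reader waits about as long as the job runs. In the second call the job's output goes to /dev/null, the pipe closes when the shell exits, and the reader returns almost immediately.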

    4. My understanding

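    When ssh runs the nohup command without redirecting its output (stdout and stderr), the command inherits the output of the ssh session that pssh started, so that session keeps waiting for the command to exit. Once pssh kills ssh at the timeout, the command no longer has anywhere to write its output, and it exits abnormally. Redirecting the output of nohup removes this dependency: the session no longer waits for the command, so pssh returns, and the command keeps running normally.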
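
One plausible mechanism for the abnormal exit is SIGPIPE: nohup only ignores SIGHUP, so a write to a pipe whose reader has gone away still terminates the writer. The sketch below is a local analogue and an assumption about what happens on the remote host, not something confirmed from the pssh or sshd source; killed_by_sigpipe is a hypothetical helper.

```python
import signal
import subprocess


def killed_by_sigpipe(shell_command):
    """Start shell_command, drop the reading side of its stdout pipe at once,
    and report whether the command then died from SIGPIPE.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(
        ["sh", "-c", shell_command],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    proc.stdout.close()  # the reader goes away, like a killed ssh session
    status = proc.wait()
    # -N: the process itself was killed by signal N;
    # 128+N: a wrapping shell reports that its child was killed by signal N.
    return status in (-signal.SIGPIPE, 128 + signal.SIGPIPE)


if __name__ == "__main__":
    # The write happens after the reader is gone, so the script is killed.
    print(killed_by_sigpipe("sleep 1; echo good"))
    # nohup does not help: it only protects against SIGHUP.
    print(killed_by_sigpipe("nohup sh -c 'sleep 1; echo good'"))
    # With the output redirected, nothing is written to the pipe.
    print(killed_by_sigpipe("sleep 1; echo good >/dev/null"))
```

This is consistent with the fix in section 2: once the output goes to /home/test/nohup.out, the script never writes to the ssh session's pipe, so killing or closing the session cannot affect it.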

  • Original article: https://www.cnblogs.com/lyg-blog/p/12046138.html