  • [Original] Uncle's Troubleshooting Notes (28): ssh between servers breaks after OpenSSH is upgraded to 7.4

    ssh between the servers in the cluster suddenly stopped working:

    # ssh 192.168.0.1
    The authenticity of host '192.168.0.1 (192.168.0.1)' can't be established.
    RSA1 key fingerprint is 07:e4:54:79:62:60:22:c2:72:23:21:00:54:a0:90:79.
    Are you sure you want to continue connecting (yes/no)?

    After entering yes, ssh asks for a password, even though passwordless login had been set up earlier. The files under ~/.ssh all look normal:

    # ls -l ~/.ssh
    total 16
    -rw------- 1 root root 2040 Jan 10 11:32 authorized_keys
    -rwx------ 1 root root 1679 Jan 10 11:27 id_rsa
    -rwx------ 1 root root 408 Jan 10 11:27 id_rsa.pub
    -rwx------ 1 root root 2753 Jan 10 11:27 known_hosts
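
    To confirm the failure without answering prompts by hand, a non-interactive probe can help. The following is a minimal sketch; the helper name check_passwordless is hypothetical, and it only relies on the standard ssh options BatchMode and ConnectTimeout.

    ```shell
    # Hypothetical helper: succeeds only if ssh can log in without any prompt.
    check_passwordless() {
        host=$1
        # BatchMode=yes makes ssh fail instead of asking for a password,
        # a passphrase, or a host key confirmation.
        # ConnectTimeout bounds the wait when the host is unreachable.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "OK: non-interactive login to $host works"
        else
            echo "FAIL: non-interactive login to $host failed"
            return 1
        fi
    }
    ```

    Usage: check_passwordless 192.168.0.1. A failure only says that some prompt or error got in the way; running ssh -v against the same host shows which step failed.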

    Next, check the ssh version:

    # ssh -V
    OpenSSH_7.4p1, SSH protocols 1.5/2.0, OpenSSL 0x100020bf

    # yum list installed|grep openssh
    openssh.x86_64 7.4p1-16.el7 @base
    openssh-clients.x86_64 7.4p1-16.el7 @base
    openssh-server.x86_64 7.4p1-16.el7 @base

    # ls -l /usr/sbin/sshd
    -rwxr-xr-x 1 root root 1288984 Feb 13 06:02 /usr/sbin/sshd

    # ps aux|grep sshd
    root 8698 0.0 0.0 25236 1236 ? Ss 06:02 0:00 sshd

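    It looks like sshd was upgraded to 7.4 at 6 a.m. (the binary was replaced and the process restarted at the same time). Checking the sshd status shows errors: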

    # service sshd status
    Redirecting to /bin/systemctl status sshd.service
    ● sshd.service - OpenSSH server daemon
    Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
    Active: activating (auto-restart) (Result: exit-code) since Wed 2019-02-13 13:41:32 CST; 18s ago
    Docs: man:sshd(8)
    man:sshd_config(5)
    Process: 11999 ExecStart=/usr/sbin/sshd -D $OPTIONS (code=exited, status=255)
    Main PID: 11999 (code=exited, status=255)
    Tasks: 0
    Memory: 0B
    CGroup: /system.slice/sshd.service

    Feb 13 13:41:32 $server systemd[1]: sshd.service: main process exited, code=exited, status=255/n/a
    Feb 13 13:41:32 $server sshd[11999]: This private key will be ignored.
    Feb 13 13:41:32 $server sshd[11999]: bad permissions: ignore key: /etc/ssh/ssh_host_rsa_key
    Feb 13 13:41:32 $server sshd[11999]: Could not load host key: /etc/ssh/ssh_host_rsa_key
    Feb 13 13:41:32 $server sshd[11999]: Could not load host key: /etc/ssh/ssh_host_dsa_key
    Feb 13 13:41:32 $server sshd[11999]: Disabling protocol version 2. Could not load host key
    Feb 13 13:41:32 $server systemd[1]: Failed to start OpenSSH server daemon.
    Feb 13 13:41:32 $server systemd[1]: Unit sshd.service entered failed state.
    Feb 13 13:41:32 $server systemd[1]: sshd.service failed.

    Try starting sshd manually to see the specific error:

    # sshd -D -p 8822
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    @ WARNING: UNPROTECTED PRIVATE KEY FILE! @
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    Permissions 0640 for '/etc/ssh/ssh_host_rsa_key' are too open.
    It is recommended that your private key files are NOT accessible by others.
    This private key will be ignored.
    bad permissions: ignore key: /etc/ssh/ssh_host_rsa_key
    Could not load host key: /etc/ssh/ssh_host_rsa_key
    Could not load host key: /etc/ssh/ssh_host_dsa_key
    Disabling protocol version 2. Could not load host key

    It looks like ssh protocol 2 is disabled because the permissions on the host key file are too open:

    # ls -l /etc/ssh
    total 612
    -rw-r--r-- 1 root root 581843 Apr 11 2018 moduli
    -rw-r--r-- 1 root root 1144 Feb 13 06:02 ssh_config
    -rw------- 1 root root 2450 Feb 13 06:02 sshd_config
    -rw-r-----. 1 root ssh_keys 227 Jan 22 2018 ssh_host_ecdsa_key
    -rw-r--r--. 1 root root 162 Jan 22 2018 ssh_host_ecdsa_key.pub
    -rw-r-----. 1 root ssh_keys 387 Jan 22 2018 ssh_host_ed25519_key
    -rw-r--r--. 1 root root 82 Jan 22 2018 ssh_host_ed25519_key.pub
    -rw------- 1 root root 991 Feb 13 06:02 ssh_host_key
    -rw-r--r-- 1 root root 656 Feb 13 06:02 ssh_host_key.pub
    -rw-r-----. 1 root ssh_keys 1675 Jan 22 2018 ssh_host_rsa_key
    -rw-r--r--. 1 root root 382 Jan 22 2018 ssh_host_rsa_key.pub
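
    In the listing above, the private host keys dated Jan 22 2018 are mode 0640 with group ssh_keys, which is exactly what the new sshd rejects. A minimal sketch for spotting such keys follows; the helper name loose_host_keys is hypothetical, and -perm /077 assumes GNU find.

    ```shell
    # Hypothetical helper: list private host keys that group or others can access.
    loose_host_keys() {
        dir=${1:-/etc/ssh}
        # 'ssh_host_*key' matches ssh_host_key and ssh_host_rsa_key but not *.pub.
        # -perm /077 matches files with any group or other permission bit set.
        find "$dir" -maxdepth 1 -type f -name 'ssh_host_*key' -perm /077 | sort
    }
    ```

    An empty result means no private host key is readable by group or others.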

    Change the permissions of all files under /etc/ssh to 600:

    # chmod 600 /etc/ssh/*
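
    The chmod above also restricts ssh_config and the public keys, which the earlier listing shows as world-readable (0644). Only the private host keys triggered the warning, so a narrower variant is possible. This is a sketch, not the command used in this incident; the helper name tighten_host_keys is hypothetical.

    ```shell
    # Hypothetical helper: restrict only private host keys, keep public keys readable.
    tighten_host_keys() {
        dir=${1:-/etc/ssh}
        for key in "$dir"/ssh_host_*key; do
            [ -f "$key" ] || continue      # the glob matched nothing
            chmod 600 "$key"               # private key: owner read/write only
            if [ -f "$key.pub" ]; then
                chmod 644 "$key.pub"       # public half stays world-readable
            fi
        done
        return 0
    }
    ```

    Usage: tighten_host_keys /etc/ssh, run as root. Other files in the directory, such as ssh_config and moduli, are left untouched.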

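    After the permission change, connecting to port 8822 through the test sshd process worked fine, but the sshd service still kept failing to start. I suspected that the sshd process currently running had not been started through the service, so the service kept restarting without being able to bind the port. After killing that sshd process, the sshd service finally started normally: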

    # service sshd status
    Redirecting to /bin/systemctl status sshd.service
    ● sshd.service - OpenSSH server daemon
    Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
    Active: activating (start) since Wed 2019-02-13 13:35:32 CST; 16s ago
    Docs: man:sshd(8)
    man:sshd_config(5)
    Main PID: 4355 (sshd)
    Tasks: 1
    Memory: 416.0K
    CGroup: /system.slice/sshd.service
    └─4355 /usr/sbin/sshd -D

    Feb 13 13:35:32 $server systemd[1]: Starting OpenSSH server daemon...
    Feb 13 13:35:32 $server sshd[4355]: Could not load host key: /etc/ssh/ssh_host_dsa_key
    Feb 13 13:35:32 $server sshd[4355]: Server listening on 0.0.0.0 port 22.
    Feb 13 13:35:32 $server sshd[4355]: error: Bind to port 22 on :: failed: Address already in use.

    However, there is still one error, "Bind to port 22 on :: failed: Address already in use." It is related to these settings in the configuration file:

    $ vi /etc/ssh/sshd_config
    #Port 22
    #Protocol 2,1
    #ListenAddress 0.0.0.0
    #ListenAddress ::

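    By default sshd binds port 22 on both IPv4 and IPv6. Uncomment two of these lines (Port 22 and ListenAddress 0.0.0.0) so that it listens on IPv4 only: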

    Port 22
    #Protocol 2,1
    ListenAddress 0.0.0.0
    #ListenAddress ::
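
    The same edit can be previewed with sed before touching the real file. This is a minimal sketch; the helper name listen_ipv4_only is hypothetical, and it prints the result to stdout without modifying its input.

    ```shell
    # Hypothetical helper: print sshd_config with "Port 22" and
    # "ListenAddress 0.0.0.0" uncommented. The input file is left unchanged.
    listen_ipv4_only() {
        sed -E -e 's/^#(Port 22)[[:space:]]*$/\1/' \
               -e 's/^#(ListenAddress 0\.0\.0\.0)[[:space:]]*$/\1/' "$1"
    }
    ```

    Usage: listen_ipv4_only /etc/ssh/sshd_config > /tmp/sshd_config.new, then compare the two files with diff. Running sshd -t -f /tmp/sshd_config.new should report syntax errors before the new file is put in place.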

    With that change, sshd starts normally:

    Feb 13 15:30:04 $server systemd[1]: Starting OpenSSH server daemon...
    Feb 13 15:30:04 $server sshd[4731]: Could not load host key: /etc/ssh/ssh_host_dsa_key
    Feb 13 15:30:04 $server sshd[4731]: Server listening on 0.0.0.0 port 22.

    But after a while the process disappears. Look at sshd.service:

    # cat /usr/lib/systemd/system/sshd.service
    [Unit]
    Description=OpenSSH server daemon
    Documentation=man:sshd(8) man:sshd_config(5)
    After=network.target sshd-keygen.service
    Wants=sshd-keygen.service

    [Service]
    Type=notify
    EnvironmentFile=/etc/sysconfig/sshd
    ExecStart=/usr/sbin/sshd -D $OPTIONS
    ExecReload=/bin/kill -HUP $MAINPID
    KillMode=process
    Restart=on-failure
    RestartSec=42s

    [Install]
    WantedBy=multi-user.target

    The journalctl log shows:

    Feb 13 16:08:09 $server systemd[1]: sshd.service start operation timed out. Terminating.
    Feb 13 16:08:09 $server sshd[26701]: Received signal 15; terminating.
    Feb 13 16:08:09 $server systemd[1]: sshd.service: main process exited, code=exited, status=255/n/a
    Feb 13 16:08:09 $server systemd[1]: Unit sshd.service entered failed state.
    Feb 13 16:08:09 $server systemd[1]: sshd.service failed.
    Feb 13 16:08:51 $server systemd[1]: sshd.service holdoff time over, scheduling restart.

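    It looks like the start operation keeps timing out and systemd then terminates the process. The unit uses Type=notify, so systemd waits for sshd to report that it is ready, which suggests that this sshd never sends that report.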

    # vi /usr/lib/systemd/system/sshd.service
    ExecStart=/usr/sbin/sshd $OPTIONS

    Remove -D, kill the previous sshd, then restart sshd:

    # systemctl daemon-reload
    # systemctl start sshd
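
    Edits made directly under /usr/lib/systemd/system can be overwritten when the package is updated. An alternative, which is not what was done here, is a drop-in override that keeps -D and switches the unit to Type=simple. The sketch below assumes the timeout comes from sshd never signalling readiness; the helper name write_sshd_dropin is hypothetical, while the drop-in directory and the override.conf file name follow the usual systemd convention.

    ```shell
    # Hypothetical helper: write a drop-in that overrides only the service type.
    # Assumption: sshd never signals readiness, so Type=simple (no notification
    # expected) is enough to stop the start timeout.
    write_sshd_dropin() {
        dropin_dir=${1:-/etc/systemd/system/sshd.service.d}
        mkdir -p "$dropin_dir"
        printf '%s\n' '[Service]' 'Type=simple' > "$dropin_dir/override.conf"
    }
    ```

    Usage: write_sshd_dropin, then systemctl daemon-reload and systemctl restart sshd, all as root.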

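    With that, the sshd service was restored and ssh between the servers worked again. So when upgrading OpenSSH to 7.4, be sure to fix the file permissions under /etc/ssh/, otherwise ssh protocol 2 is disabled and the previously configured keys stop working; sshd.service also has to be modified. But wait, ansible still reports an error: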

    192.168.0.1 | FAILED | rc=-1 >>
    failed to open a SFTP connection (Channel closed.)

    Check sshd_config:

    # vi /etc/ssh/sshd_config
    Subsystem sftp /usr/libexec/sftp-server
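
    Whether the configured sftp server binary actually exists can be checked with a small script. This is a minimal sketch; the helper names sftp_subsystem_path and sftp_subsystem_ok are hypothetical, and the check does not apply to configurations that use internal-sftp instead of a path.

    ```shell
    # Hypothetical helper: print the sftp server path from an sshd_config file.
    # Commented-out lines are skipped because their first field is "#Subsystem".
    sftp_subsystem_path() {
        awk 'tolower($1) == "subsystem" && $2 == "sftp" { print $3; exit }' "$1"
    }

    # Hypothetical helper: succeed only if that path exists and is executable.
    sftp_subsystem_ok() {
        path=$(sftp_subsystem_path "$1")
        [ -n "$path" ] && [ -x "$path" ]
    }
    ```

    Usage: sftp_subsystem_ok /etc/ssh/sshd_config || echo "sftp server binary missing".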

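    The configured path /usr/libexec/sftp-server does not exist; the actual path is /usr/libexec/openssh/sftp-server. After correcting it and restarting sshd, the problem above was fixed. But wait again, scp also reports an error: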

    # scp 192.168.0.1:/file1 .
    command-line: line 0: Bad configuration option: PermitLocalCommand

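    This is caused by a version mismatch between scp, sftp, and ssh. yum shows that these commands all belong to the openssh-clients package, yet their last-modified times differ, so reinstall openssh-clients: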

    # yum reinstall openssh-clients

    After that, the last-modified times of these commands were all the same, and the problem disappeared.
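
    The modification-time comparison can be scripted. This is a minimal sketch; the helper name same_mtime is hypothetical, and stat -c %Y assumes GNU coreutils.

    ```shell
    # Hypothetical helper: succeed if all given files share one modification time.
    same_mtime() {
        # stat -c %Y prints each file's mtime as seconds since the epoch;
        # sort -u collapses identical values, so one remaining line means a match.
        [ "$(stat -c %Y "$@" | sort -u | wc -l)" -eq 1 ]
    }
    ```

    Usage: same_mtime /usr/bin/ssh /usr/bin/scp /usr/bin/sftp || echo "client binaries differ". Matching times are only a hint; rpm -V openssh-clients compares the installed files against the package metadata and is the stricter check.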

    References:

    https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html#managed-node-requirements

    https://unix.stackexchange.com/questions/390224/openssh-server-start-failed-with-result-timeout/400644#400644

    http://www.webopius.com/content/350/solution-to-error-command-line-line-0-bad-configuration-option-permitlocalcommand

  • Original article: https://www.cnblogs.com/barneywill/p/10369459.html