  • A small pitfall in kdump configuration on Linux 3.10

    On 2.6-series Linux kernels, if a module should not be loaded in the capture kernel, it could be excluded with the blacklist parameter in /etc/kdump.conf:

    blacklist <list of kernel modules>

    Recently I found that a certain SAS driver was misbehaving, so I tried to blacklist it the same way. The result was an error:

    [root@localhost ~]# service kdump restart
    Redirecting to /bin/systemctl restart kdump.service
    Job for kdump.service failed because the control process exited with error code. See "systemctl status kdump.service" and "journalctl -xe" for details.
    [root@localhost ~]# systemctl status kdump.service
    * kdump.service - Crash recovery kernel arming
       Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
       Active: failed (Result: exit-code) since Tue 2017-11-28 11:58:28 UTC; 10s ago
      Process: 60563 ExecStop=/usr/bin/kdumpctl stop (code=exited, status=0/SUCCESS)
      Process: 60572 ExecStart=/usr/bin/kdumpctl start (code=exited, status=1/FAILURE)
     Main PID: 60572 (code=exited, status=1/FAILURE)
    
    Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Deprecated kdump config option: blacklist. Refer to kdump.conf manpage for alternatives.
    Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Starting kdump: [FAILED]
    Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
    Nov 28 11:58:28 localhost.localdomain systemd[1]: Failed to start Crash recovery kernel arming.
    Nov 28 11:58:28 localhost.localdomain systemd[1]: Unit kdump.service entered failed state.
    Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service failed.
    [root@localhost ~]# journalctl -xe
    Nov 28 11:58:28 localhost.localdomain kdumpctl[60563]: kexec: unloaded kdump kernel
    Nov 28 11:58:28 localhost.localdomain kdumpctl[60563]: Stopping kdump: [OK]
    Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Deprecated kdump config option: blacklist. Refer to kdump.conf manpage for alternatives.
    Nov 28 11:58:28 localhost.localdomain kdumpctl[60572]: Starting kdump: [FAILED]
    Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
    Nov 28 11:58:28 localhost.localdomain systemd[1]: Failed to start Crash recovery kernel arming.
    -- Subject: Unit kdump.service has failed
    -- Defined-By: systemd
    -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
    -- 
    -- Unit kdump.service has failed.
    -- 
    -- The result is failed.
    Nov 28 11:58:28 localhost.localdomain systemd[1]: Unit kdump.service entered failed state.
    Nov 28 11:58:28 localhost.localdomain systemd[1]: kdump.service failed.
    Nov 28 11:58:28 localhost.localdomain polkitd[2087]: Unregistered Authentication Agent for unix-process:60547:533046 (system bus name :1.5128, object path /org/freedesktop/PolicyKit1/AuthenticationAgent
    [root@localhost ~]# 

    So blacklist is now a deprecated option. Following the hint in the error message:

    man kdump.conf shows the following:

    blacklist option was recently being used to prevent loading modules in initramfs. General terminology for blacklist has been that module is present in initramfs but it is not actually loaded in kernel. Hence retaining blacklist option creates more confusing behavior. It has been deprecated.

    Instead, use rd.driver.blacklist option on second kernel to blacklist a certain module. One can edit /etc/sysconfig/kdump.conf and edit KDUMP_COMMANDLINE_APPEND to pass kernel command line options. Refer to dracut.cmdline man page for more details on module blacklist option.
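Before restarting the service, it can help to confirm whether /etc/kdump.conf still contains an active blacklist line. The following is only a minimal sketch: has_deprecated_blacklist is a hypothetical helper name, and it looks only for the blacklist option that the log above complains about.

```shell
#!/bin/sh
# has_deprecated_blacklist CONF
# Succeeds if CONF contains an uncommented line starting with "blacklist".
# (Hypothetical helper, not part of kexec-tools.)
has_deprecated_blacklist() {
    grep -Eq '^[[:space:]]*blacklist([[:space:]]|$)' "$1"
}

conf=/etc/kdump.conf
if [ -r "$conf" ] && has_deprecated_blacklist "$conf"; then
    echo "$conf still uses the deprecated blacklist option:"
    # Show the offending lines with their line numbers.
    grep -En '^[[:space:]]*blacklist([[:space:]]|$)' "$conf"
fi
```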

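    Following the new guidance, I went to edit /etc/sysconfig/kdump.conf, only to find that this file does not exist. The path given in the man page appears to be slightly off: the file that actually exists is /etc/sysconfig/kdump, and as far as I can tell that is where KDUMP_COMMANDLINE_APPEND belongs, not /etc/kdump.conf.

    So I edited the KDUMP_COMMANDLINE_APPEND line in /etc/sysconfig/kdump. For the exact format, see: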

    man  dracut.cmdline

           rd.driver.blacklist=<drivername>[,<drivername>,...]
               do not load kernel module <drivername>. This parameter can be specified multiple times.
    
           rd.driver.pre=<drivername>[,<drivername>,...]
               force loading kernel module <drivername>. This parameter can be specified multiple times.
    
           rd.driver.post=<drivername>[,<drivername>,...]
               force loading kernel module <drivername> after all automatic loading modules have been loaded. This parameter can be specified multiple times.
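Putting the two man pages together, the change amounts to adding an rd.driver.blacklist=... entry inside the quoted value of KDUMP_COMMANDLINE_APPEND. Here is a rough sketch of that edit, run against a scratch file rather than the live /etc/sysconfig/kdump. Note the assumptions: add_kdump_blacklist is a hypothetical helper, mpt3sas is only a placeholder module name (the post does not name the driver), the existing options in the demo line are illustrative, and sed -i assumes GNU sed.

```shell
#!/bin/sh
# add_kdump_blacklist CONF MODULE
# Appends rd.driver.blacklist=MODULE to the KDUMP_COMMANDLINE_APPEND value
# in CONF, unless it is already there. (Hypothetical helper.)
add_kdump_blacklist() {
    conf="$1"
    mod="$2"
    # Nothing to do if the option is already present on that line.
    if grep -q "^KDUMP_COMMANDLINE_APPEND=.*rd\.driver\.blacklist=$mod" "$conf"; then
        return 0
    fi
    # Insert the option just before the closing double quote of the value.
    sed -i "s/^\(KDUMP_COMMANDLINE_APPEND=\"[^\"]*\)\"/\1 rd.driver.blacklist=$mod\"/" "$conf"
}

# Demonstrate on a scratch copy; the options below are placeholders.
demo=$(mktemp)
printf 'KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices"\n' > "$demo"
add_kdump_blacklist "$demo" mpt3sas
cat "$demo"
rm -f "$demo"
```

To apply it for real, one would point the helper at /etc/sysconfig/kdump (after taking a backup) and then restart the kdump service.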

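    Also note that after changing the configuration you have to restart the kdump service. With a blacklist change, that restart is noticeably slow: whenever the configuration changes (for example, the dump path is modified or the blacklist grows), the kdump initramfs image has to be regenerated, otherwise the old image would still be used when a crash happens. These images are the files under /boot whose names end in kdump.img: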

    [root@localhost ~]# ls -l /boot/*kdump*
    -rw------- 1 root root 16878919 Nov 29 01:02 /boot/initramfs-3.10.0-693.5.2.el7.x86_64kdump.img
    -rw------- 1 root root 35261890 Nov 27 07:04 /boot/initramfs-3.10.0caq1.0kdump.img
    -rw------- 1 root root 36508192 Nov 24 06:21 /boot/initramfs-3.10.0kdump.img
    [root@localhost ~]# 
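One way to anticipate a slow restart is to compare the modification time of the config files with that of the kdump image for the running kernel. This is only a sketch under assumptions: is_stale is a hypothetical helper, the image path follows the naming seen in the listing above, and whether kdumpctl uses exactly these files to decide on a rebuild is not confirmed here.

```shell
#!/bin/sh
# is_stale IMAGE FILE...
# Succeeds if IMAGE is missing or any FILE is newer than IMAGE.
# (Hypothetical helper, not part of kexec-tools.)
is_stale() {
    image="$1"
    shift
    [ -e "$image" ] || return 0
    for f in "$@"; do
        if [ "$f" -nt "$image" ]; then
            return 0
        fi
    done
    return 1
}

# Image name pattern taken from the /boot listing: initramfs-<release>kdump.img
image="/boot/initramfs-$(uname -r)kdump.img"
if is_stale "$image" /etc/kdump.conf /etc/sysconfig/kdump; then
    echo "kdump image looks stale or missing; the next restart may rebuild it"
else
    echo "kdump image is newer than the config files"
fi
```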

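    Finally, keep in mind that if the capture kernel itself panics while loading drivers or running, there is no further kernel to take over from it. The only output is whatever gets printed to the screen, or to a serial console if one is attached. I have also hit a case where the reserved memory was too small and the capture kernel itself ran out of memory (OOM), so no crash dump could be collected. To check the currently reserved memory: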

    [root@localhost ~]# cat /sys/kernel/kexec_crash_size
    536870912
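The value is in bytes, so a small conversion makes it easier to compare with the crashkernel= setting. A minimal sketch (bytes_to_mib is a hypothetical helper name):

```shell
#!/bin/sh
# bytes_to_mib BYTES: print BYTES converted to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

size_file=/sys/kernel/kexec_crash_size
if [ -r "$size_file" ]; then
    echo "reserved for the capture kernel: $(bytes_to_mib "$(cat "$size_file")") MiB"
fi

# The value shown above corresponds to 512 MiB.
bytes_to_mib 536870912
```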

    To check whether the capture kernel is loaded:
    [root@localhost ~]# cat /sys/kernel/kexec_crash_loaded
    1
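A value of 1 means a crash kernel is currently loaded. This check is easy to script, for example in a health check that runs after every kdump config change. The sketch below assumes a hypothetical helper named kdump_loaded that takes the sysfs path as an argument:

```shell
#!/bin/sh
# kdump_loaded FILE: succeed if FILE (normally /sys/kernel/kexec_crash_loaded)
# contains "1". A missing or unreadable file counts as "not loaded".
kdump_loaded() {
    [ "$(cat "$1" 2>/dev/null)" = "1" ]
}

if kdump_loaded /sys/kernel/kexec_crash_loaded; then
    echo "capture kernel is loaded"
else
    echo "capture kernel is NOT loaded; a panic would not produce a crash dump"
fi
```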

    My knowledge is limited, so if you spot any mistakes, please let me know. All rights reserved; if you repost this article, please include the original source URL.
  • Original article: https://www.cnblogs.com/10087622blog/p/7918606.html