  • Linux bug 14258279: scheduling clock overflows in 208 days

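    This morning a colleague reported that the database was unusable and that the host could not be logged in to normally. After several attempts I finally got in, checked the system log, and found the following error: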

    BUG: soft lockup - CPU#5 stuck for 17163091988s!
    This looks like an operating system bug.


    Details follow:
    # uname -ra
    Linux Test-DB01 2.6.32-200.13.1.el5uek #1 SMP Wed Jul 27 21:02:33 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

    mysql> select version();
    +------------+
    | version()  |
    +------------+
    | 5.5.24-log |
    +------------+
    1 row in set (0.00 sec)

    Dec 18 22:55:44 Test-Db01 kernel: Call Trace:
    Dec 18 22:55:44 Test-Db01 kernel: BUG: soft lockup - CPU#5 stuck for 17163091988s! [mysqld:27243]
    Dec 18 22:55:44 Test-Db01 kernel: Modules linked in: autofs4(U) i2c_dev(U) i2c_core(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) rfkill(U) lockd(U) sunrpc(U) nf_conntrack_netbios_ns(U) ipt_REJECT(U) nf_conntrack_ipv4(U) nf_defrag_ipv4(U) xt_state(U) nf_conntrack(U) xt_tcpudp(U) ip6_tables(U) x_tables(U) be2iscsi(U) rdma_cm(U) ib_cm(U) iw_cm(U) ib_sa(U) ib_mad(U) ib_core(U) ib_addr(U) iscsi_tcp(U) bnx2i(U) cnic(U) uio(U) ipv6(U) cxgb3i(U) libcxgbi(U) cxgb3(U) libiscsi_tcp(U) libiscsi(U) scsi_transport_iscsi(U) video(U) output(U) sbs(U) sbshc(U) parport_pc(U) lp(U) parport(U) joydev(U) ses(U) enclosure(U) bnx2(U) dcdbas(U) serio_raw(U) snd_seq_dummy(U) snd_seq_oss(U) snd_seq_midi_event(U) snd_seq(U) snd_seq_device(U) snd_pcm_oss(U) snd_mixer_oss(U) snd_pcm(U) snd_timer(U) snd(U) soundcore(U) snd_page_alloc(U) iTCO_wdt(U) iTCO_vendor_support(U) pcspkr(U) usb_storage(U) shpchp(U) megaraid_sas(U) [last unloaded: ip_tables]
    Dec 18 22:55:44 Test-Db01 kernel: CPU 5:
    Dec 18 22:55:44 Test-Db01 kernel: Modules linked in: autofs4(U) i2c_dev(U) i2c_core(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) rfkill(U) lockd(U) sunrpc(U) nf_conntrack_netbios_ns(U) ipt_REJECT(U) nf_conntrack_ipv4(U) nf_defrag_ipv4(U) xt_state(U) nf_conntrack(U) xt_tcpudp(U) ip6_tables(U) x_tables(U) be2iscsi(U) rdma_cm(U) ib_cm(U) iw_cm(U) ib_sa(U) ib_mad(U) ib_core(U) ib_addr(U) iscsi_tcp(U) bnx2i(U) cnic(U) uio(U) ipv6(U) cxgb3i(U) libcxgbi(U) cxgb3(U) libiscsi_tcp(U) libiscsi(U) scsi_transport_iscsi(U) video(U) output(U) sbs(U) sbshc(U) parport_pc(U) lp(U) parport(U) joydev(U) ses(U) enclosure(U) bnx2(U) dcdbas(U) serio_raw(U) snd_seq_dummy(U) snd_seq_oss(U) snd_seq_midi_event(U) snd_seq(U) snd_seq_device(U) snd_pcm_oss(U) snd_mixer_oss(U) snd_pcm(U) snd_timer(U) snd(U) soundcore(U) snd_page_alloc(U) iTCO_wdt(U) iTCO_vendor_support(U) pcspkr(U) usb_storage(U) shpchp(U) megaraid_sas(U) [last unloaded: ip_tables]
    Dec 18 22:55:44 Test-Db01 kernel: Pid: 27243, comm: mysqld Not tainted 2.6.32-200.13.1.el5uek #1 PowerEdge R710
    Dec 18 22:55:44 Test-Db01 kernel: RIP: 0033:[<00000000008f95a3>]  [<00000000008f95a3>] 0x8f95a3
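
    To spot this signature in a log file, the soft lockup line can be parsed and its reported duration checked for plausibility. The following is a minimal sketch, not a monitoring tool: the function name `parse_soft_lockup` and the ten-year sanity bound are assumptions made for illustration, and the pattern only covers the message format shown in the log above.

```python
import re

# Matches the soft lockup message in the format shown in the log above.
# The trailing "[comm:pid]" part is optional.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!(?: \[([^:\]]+):(\d+)\])?"
)

# Hypothetical sanity bound: treat any stall longer than ten years as bogus.
MAX_PLAUSIBLE_SECONDS = 10 * 365 * 86400


def parse_soft_lockup(line):
    """Return the fields of a soft lockup message, or None if absent."""
    m = LOCKUP_RE.search(line)
    if m is None:
        return None
    cpu, seconds, comm, pid = m.groups()
    seconds = int(seconds)
    return {
        "cpu": int(cpu),
        "seconds": seconds,
        "comm": comm,
        "pid": int(pid) if pid else None,
        # A duration this large points at a clock problem, not a real stall.
        "implausible": seconds > MAX_PLAUSIBLE_SECONDS,
    }


sample = ("Dec 18 22:55:44 Test-Db01 kernel: BUG: soft lockup - "
          "CPU#5 stuck for 17163091988s! [mysqld:27243]")
print(parse_soft_lockup(sample))
```

    For the line above, the reported duration is roughly 544 years, so the `implausible` flag is set.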

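    Although the MOS alert (see the references) describes a problem encountered on Exadata X2-8, the operating system kernel and the error symptoms in our test environment are essentially the same as those described in the bug.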

    Exadata X2-8 database servers running Unbreakable Enterprise Kernel for Oracle Linux 2.6.32-100.23.1 that have been continuously up for more than 208 days are susceptible to this problem.  Unbreakable Enterprise Kernel for Oracle Linux 2.6.32-100.23.1 is the Linux kernel provided with Exadata releases 11.2.2.2.0 through 11.2.2.4.2, inclusive.  Uptime may be determined by the uptime(1) command.
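
    The 208-day figure and the odd "stuck for" value can both be checked with simple arithmetic. The sketch below rests on two assumptions that are not stated in the alert: that the scheduler clock wraps once it reaches 2**54 nanoseconds, and that the watchdog converts nanoseconds to seconds with a right shift by 30 bits. It only shows that the numbers are consistent with those assumptions; it does not trace the actual kernel code path.

```python
NS_PER_SECOND = 10**9
SECONDS_PER_DAY = 86400

# Assumption: the scheduler clock overflows at 2**54 nanoseconds.
WRAP_NS = 2**54

# Value from the MOS bug title, and the value seen in our own log.
MOS_STUCK_SECONDS = 17163091968
OUR_STUCK_SECONDS = 17163091988


def wrap_days():
    """Uptime in days at which a 2**54 ns counter wraps."""
    return WRAP_NS / NS_PER_SECOND / SECONDS_PER_DAY


def wrapped_delta_seconds():
    """A 64-bit delta that is short by 2**54 ns, converted with ns >> 30.

    Assumption: seconds are approximated as ns >> 30 (2**30 is close to 10**9).
    """
    return (2**64 - WRAP_NS) >> 30


print(round(wrap_days(), 1))                          # 208.5
print(wrapped_delta_seconds())                        # 17163091968
print(wrapped_delta_seconds() == MOS_STUCK_SECONDS)   # True
print(OUR_STUCK_SECONDS - MOS_STUCK_SECONDS)          # 20
```

    Under these assumptions the value in the MOS bug title equals 2**34 - 2**24 exactly, and the value in our log differs from it by only 20 seconds.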

    There are two solutions:

    1. Upgrade to a newer version

    Upgrade to Exadata 11.2.3.1.0 or later (Recommended).

    2. Reboot the operating system before uptime reaches 208 days.

    Reboot database servers before uptime reaches 208 days
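
    For the second option, a small check can warn when a host is approaching the 208-day limit. This is a minimal sketch: the names `uptime_days` and `reboot_due` and the 14-day safety margin are assumptions chosen for illustration. It parses text in the format of `/proc/uptime`, whose first field is the number of seconds since boot.

```python
OVERFLOW_DAYS = 208       # threshold quoted in the MOS alert
SAFETY_MARGIN_DAYS = 14   # hypothetical lead time for scheduling a reboot


def uptime_days(proc_uptime_text):
    """Convert the contents of /proc/uptime to days of uptime."""
    # The first field is seconds since boot; the second is idle time.
    return float(proc_uptime_text.split()[0]) / 86400


def reboot_due(proc_uptime_text, margin_days=SAFETY_MARGIN_DAYS):
    """True once uptime is within margin_days of the overflow threshold."""
    return uptime_days(proc_uptime_text) >= OVERFLOW_DAYS - margin_days


def check_host(path="/proc/uptime"):
    """Read the local uptime file (Linux only) and report whether a reboot is due."""
    with open(path) as f:
        return reboot_due(f.read())


# Sample inputs instead of the live file, so the sketch runs anywhere.
print(reboot_due("17000000.00 1.00"))   # about 196.8 days of uptime
print(reboot_due("8640000.00 0.00"))    # exactly 100 days of uptime
```

    On a real host, `check_host()` could be run from cron and its result used to schedule a maintenance window.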

    For now we can only try the second option.

    The affected machine is a Dell PowerEdge R710. Rumor has it that this model goes down once every six months.

    Two bugs showing up on the same machine is quite a "coincidence".


    References:
    [MOS] ALERT - Exadata X2-8 systems affected by Linux bug 14258279: scheduling clock overflows in 208 days [ID 1473825.1]
    [MOS] Bug 14258279 : [EXADATA] SOFT LOCKUP - CPU#0 STUCK FOR 17163091968S!

  • Original article: https://www.cnblogs.com/lcword/p/5865538.html