  • Analyzing Hardware Detection Logs on Linux

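    In day-to-day operations, database administrators inevitably have to deal with the operating system and even the hardware, and operating system logs are often an important source of clues when analyzing a database failure. Taking Linux as an example, its main logging facilities (the syslog subsystem) fall roughly into three categories: 1) user connection logs, 2) process accounting logs, and 3) system and service logs. The first two are useful for security auditing and user monitoring, while for database failures caused by operating system or hardware problems we usually need to look at the system and service logs. On Linux the file we analyze most often is /var/log/messages, which holds info-level messages from the system and its services (except services such as mail and cron). The file introduced here is /var/log/dmesg, which records whether the BIOS and hardware were initialized successfully at boot, along with driver information for network cards, optical drives and floppy drives and configuration information for RAID, LVM, IPv6 and so on. These messages come from the kernel ring buffer and are mainly used to diagnose hardware faults. /var/log/dmesg is a copy saved at boot, whereas the dmesg command prints the buffer's current contents, so messages logged after boot appear only in the latter. You can view the log either with cat /var/log/dmesg or directly with the dmesg command. For example: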
    [root@nas ~]# dmesg |egrep "sd|eth"
    SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
    sda: Write Protect is off
    sda: Mode Sense: 00 3a 00 00
    SCSI device sda: drive cache: write back
    SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
    sda: Write Protect is off
    sda: Mode Sense: 00 3a 00 00
    SCSI device sda: drive cache: write back
     sda: sda1 sda2 sda3 sda4
    sd 0:0:0:0: Attached scsi disk sda
    eth0: RTL8168d/8111d at 0xffffc20000032000, b8:ac:6f:dc:8b:43, XID 081000c0 IRQ 50
    sd 0:0:0:0: Attached scsi generic sg0 type 0
    SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
    sdb: Write Protect is off
    sdb: Mode Sense: 10 00 00 00
    sdb: assuming drive cache: write through
    SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
    sdb: Write Protect is off
    sdb: Mode Sense: 10 00 00 00
    sdb: assuming drive cache: write through
     sdb: sdb1 sdb2
    sd 2:0:0:0: Attached scsi disk sdb
    sd 2:0:0:0: Attached scsi generic sg2 type 0
    EXT3 FS on sda1, internal journal
    EXT3 FS on sda2, internal journal
    Adding 5116692k swap on /dev/sda3.  Priority:-1 extents:1 across:5116692k
    r8169: eth0: link up
    r8169: eth0: link up
    eth0: no IPv6 routers present
    
    /* The output above lists the SCSI disks and the network card recognized by the system */
    
    
    [root@nas ~]# cat /var/log/messages |grep -i fail
    Jan 17 03:04:03 nas udevd-event[2943]: wait_for_sysfs: waiting for '/sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host3/target3:0:0/3:0:0:0/ioerr_cnt' failed
    Jan 18 04:45:08 nas udevd-event[5138]: wait_for_sysfs: waiting for '/sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host8/target8:0:0/8:0:0:0/ioerr_cnt' failed
    Jan 18 04:45:08 nas kernel: sdb : READ CAPACITY failed.
    Jan 18 04:45:08 nas kernel: sdb : READ CAPACITY failed.
    
    /* The output above lists hardware detection failure records */
    
    [root@nas ~]# dmesg |grep -i err
    ACPI: IRQ0 used by override.
    ACPI: IRQ2 used by override.
    ACPI: IRQ9 used by override.
    Using local APIC timer interrupts.
    ACPI: Using IOAPIC for interrupt routing
    ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
    ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
    ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT]
    ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P6._PRT]
    ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 6 7 10 *11 12 14 15)
    ACPI: PCI Interrupt Link [LNKB] (IRQs *5)
    ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 6 7 *10 11 12 14 15)
    ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 6 7 10 11 12 14 *15)
    ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 6 7 10 11 12 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 6 7 10 11 12 *14 15)
    ACPI: PCI Interrupt Link [LNKG] (IRQs *3 4 6 7 10 11 12 14 15)
    ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 6 *7 10 11 12 14 15)
    ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 17 (level, low) -> IRQ 169
    ACPI: PCI Interrupt 0000:00:1c.2[C] -> GSI 18 (level, low) -> IRQ 177
    ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
    ACPI: PCI Interrupt 0000:00:1a.7[C] -> GSI 18 (level, low) -> IRQ 177
    ACPI: PCI Interrupt 0000:00:1d.7[A] -> GSI 23 (level, low) -> IRQ 209
    ACPI: PCI Interrupt 0000:00:1a.0[A] -> GSI 16 (level, low) -> IRQ 217
    ACPI: PCI Interrupt 0000:00:1a.1[B] -> GSI 21 (level, low) -> IRQ 225
    ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 23 (level, low) -> IRQ 209
    ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 233
    ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 177
    ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 233
    ACPI: PCI Interrupt 0000:00:1f.3[C] -> GSI 18 (level, low) -> IRQ 177
    ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 18 (level, low) -> IRQ 177
    ACPI: PCI Interrupt 0000:00:1b.0[A] -> GSI 22 (level, low) -> IRQ 58
    
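    /* The output above is what grep -i err returns; every line matched only because "Interrupt" or "override" contains the letters "err", so none of them is an actual error record */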
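
Because grep -i err matches any line that merely contains the letters "err", a narrower filter is usually more useful. The following is a minimal sketch, not part of the original session: the function name kernel_errors and the keyword list are illustrative assumptions. It reads log text on standard input and keeps only lines that contain a whole word such as "error" or "failed".

```shell
# Hypothetical helper (name and keyword list are assumptions for illustration).
# Keeps only lines that contain a whole error-like word.
#   -i  ignore case
#   -w  match whole words only, so "Interrupt" and "override" no longer match
#   -E  extended regular expression with alternation
kernel_errors() {
    grep -iwE 'error|errors|fail|failed|failure'
}
```

It could be used as dmesg | kernel_errors or kernel_errors < /var/log/messages. The keyword list is deliberately short and would need extending for other kinds of messages.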
    
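    The hardware detection log /var/log/dmesg has a fairly simple format, generally "device name: message text". Device names commonly seen in the log include SCSI, PCI, Memory, loop, Kernel, EXT3, DMA, CPU, Console, BIOS, ata2, ata1, ACPI, floppy, Time and so on. ACPI stands for Advanced Configuration and Power Interface. The ACPI lines returned by grep -i err above do not indicate a fault: they record ordinary IRQ routing at boot and matched only because "Interrupt" and "override" contain the letters "err". The real problem is on the removable disk sdb, which reported "READ CAPACITY failed." (judging from the udevd messages logged just before it, probably because the external USB disk was not ready). If the failure persists, the disk may not be mountable.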
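
The "device name: message text" layout makes it easy to summarize a log by device. The sketch below is one way to do that under stated assumptions: the function name count_by_device is hypothetical, the device name is taken to be whatever precedes the first ": " on a line, and lines are assumed to carry no timestamp prefix, as in the samples above. Lines without ": " are grouped under "(none)".

```shell
# Hypothetical helper (name and parsing rule are assumptions for illustration).
# Reads log text on stdin and prints "count<TAB>device name",
# with the most frequent device names first.
count_by_device() {
    awk '
    {
        sub(/^[ \t]+/, "")              # drop leading indentation
        pos = index($0, ": ")           # first "colon + space" ends the name
        name = (pos > 0) ? substr($0, 1, pos - 1) : "(none)"
        count[name]++
    }
    END {
        for (name in count) printf "%d\t%s\n", count[name], name
    }' | sort -k1,1nr -k2
}
```

It could be run as dmesg | count_by_device. Real kernel messages do not all follow this pattern, so the result is only a rough overview of which devices are producing the most output.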
  • Original article: https://www.cnblogs.com/macleanoracle/p/2967674.html