======================= My environment =======================
PC: win7 + vmware-15 ubuntu16.04
Board: Freescale i.MX6, single CPU, Linux-4.1.15
Cross compiler: arm-none-linux-gnueabi-gcc (gcc version 4.6.3)
===========================================================
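1. Preface
The previous post introduced gdb + gdbserver for debugging applications. Is there an equivalent for the kernel? Yes: kgdb, the subject of this post. An earlier tool, kdb, was later merged into kgdb and now serves as a kgdb front end; it is covered below. kgdb can talk to the board over kgdboc (serial) or kgdboe (network), but recent kernels only integrate kgdboc and the network transport has been dropped, so this post covers serial only.
Serial has one catch. It is best if the board exposes a spare UART, but usually only the console UART is brought out. When the console port doubles as the kgdboc link, kernel printk output cannot be seen in the serial terminal software (shell/CRT tools) while the debugger holds the port; the console is usable again only after leaving kgdb.
Also very important: the virtual machine must be VMware, not VirtualBox. With vbox-6.4 the serial link frequently stopped responding after a short while.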
2. Kernel configuration options
Kernel hacking --->
[*] KGDB: kernel debugger --->
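To use kdb as well, select "KGDB_KDB: include kdb frontend for kgdb"; triggering kgdb will then drop into kdb mode first. The config file gains the following options (the lines marked "+" are the additions):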
--- target/linux/imx6ul/config-4.1	(revision 8040)
+++ target/linux/imx6ul/config-4.1	(working copy)
@@ -386,10 +386,12 @@
 # CONFIG_COMPILE_TEST is not set
 CONFIG_CONFIGFS_FS=y
 # CONFIG_CONNECTOR is not set
+CONFIG_CONSOLE_POLL=y
 CONFIG_CONSOLE_TRANSLATIONS=y
 # CONFIG_CORDIC is not set
 CONFIG_COREDUMP=y
 # CONFIG_CORESIGHT is not set
+# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 # CONFIG_CPUFREQ_DT is not set
 CONFIG_CPU_32v6K=y
 CONFIG_CPU_32v7=y
@@ -550,7 +552,10 @@
 # CONFIG_DEBUG_FS is not set
 # CONFIG_DEBUG_GPIO is not set
 CONFIG_DEBUG_IMX_UART_PORT=1
-# CONFIG_DEBUG_INFO is not set
+CONFIG_DEBUG_INFO=y
+# CONFIG_DEBUG_INFO_DWARF4 is not set
+# CONFIG_DEBUG_INFO_REDUCED is not set
+# CONFIG_DEBUG_INFO_SPLIT is not set
 CONFIG_DEBUG_KERNEL=y
 # CONFIG_DEBUG_KMEMLEAK is not set
 # CONFIG_DEBUG_KOBJECT is not set
@@ -751,6 +756,7 @@
 CONFIG_FW_LOADER_USER_HELPER=y
 CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
 # CONFIG_GAMEPORT is not set
+# CONFIG_GDB_SCRIPTS is not set
 # CONFIG_GENERIC_ADC_BATTERY is not set
 CONFIG_GENERIC_ALLOCATOR=y
 CONFIG_GENERIC_BUG=y
@@ -1139,6 +1145,9 @@
 # CONFIG_JUMP_LABEL is not set
 CONFIG_KALLSYMS=y
 CONFIG_KALLSYMS_ALL=y
+CONFIG_KDB_CONTINUE_CATASTROPHIC=0
+CONFIG_KDB_DEFAULT_ENABLE=0x1
+# CONFIG_KDB_KEYBOARD is not set
 CONFIG_KERNEL_GZIP=y
 # CONFIG_KERNEL_LZ4 is not set
 # CONFIG_KERNEL_LZMA is not set
@@ -1149,7 +1158,10 @@
 # CONFIG_KEXEC is not set
 CONFIG_KEYBOARD_SNVS_PWRKEY=y
 CONFIG_KEYS=y
-# CONFIG_KGDB is not set
+CONFIG_KGDB=y
+CONFIG_KGDB_KDB=y
+CONFIG_KGDB_SERIAL_CONSOLE=y
+# CONFIG_KGDB_TESTS is not set
 # CONFIG_KMX61 is not set
 # CONFIG_KPROBES is not set
 # CONFIG_KSM is not set
@@ -2011,6 +2023,7 @@
 # CONFIG_SERIAL_IFX6X60 is not set
 CONFIG_SERIAL_IMX=y
 CONFIG_SERIAL_IMX_CONSOLE=y
+# CONFIG_SERIAL_KGDB_NMI is not set
 # CONFIG_SERIAL_MAX3100 is not set
 # CONFIG_SERIAL_MAX310X is not set
 # CONFIG_SERIAL_NONSTANDARD is not set
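Before rebuilding, it can help to confirm that the options enabled in the diff above really ended up in the kernel config. The following is only a sketch: check_kgdb_config is a hypothetical helper name of my own (not part of any kernel tooling), and it checks just the four "=y" options that the diff turns on.

```shell
#!/bin/sh
# Hypothetical helper (not a kernel tool): report which of the kgdb-related
# options enabled in the diff above are missing from a kernel config file.
# Usage: check_kgdb_config path/to/.config
check_kgdb_config() {
    config_file=${1:-.config}
    missing=0
    for opt in CONFIG_KGDB CONFIG_KGDB_SERIAL_CONSOLE \
               CONFIG_CONSOLE_POLL CONFIG_DEBUG_INFO; do
        # -x matches the whole line, so CONFIG_KGDB_KDB=y is not
        # mistaken for CONFIG_KGDB=y
        if grep -qx "${opt}=y" "$config_file"; then
            echo "ok      ${opt}"
        else
            echo "MISSING ${opt}"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it as "check_kgdb_config .config" from the kernel build directory; a non-zero return means at least one option is not set to y.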
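3. Boot parameters
The options above only build the debugging code into the kernel. How do we tell kgdboc which serial port to use? There are two common ways:
a. Add the following to the cmdline that U-Boot passes to the kernel: "console=ttyAMA4,115200 kgdboc=ttyAMA4,115200"
b. After the system is up: echo ttyAMA4,115200 > /sys/module/kgdboc/parameters/kgdboc
With the cmdline method, this line appears in the boot log: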
[ 2.055290] KGDB: Registered I/O driver kgdboc
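To double-check on a running board that the kgdboc parameter actually reached the kernel, the command line can be inspected. The sketch below is an illustration only; kgdboc_from_cmdline is a hypothetical name, and it assumes parameters are separated by whitespace and contain no shell glob characters.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the kgdboc= parameter found in a
# kernel command line string, or return 1 if the parameter is absent.
# Usage on the board: kgdboc_from_cmdline "$(cat /proc/cmdline)"
kgdboc_from_cmdline() {
    # $1 is deliberately unquoted so the shell splits it into parameters
    for arg in $1; do
        case $arg in
            kgdboc=*)
                echo "${arg#kgdboc=}"   # strip the "kgdboc=" prefix
                return 0
                ;;
        esac
    done
    return 1
}
```

With the cmdline from method a, this would print ttyAMA4,115200.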
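4. Triggering kgdb debug mode
On the console, run: echo g > /proc/sysrq-trigger
If "KGDB_KDB: include kdb frontend for kgdb" was not selected, the board enters kgdb mode directly; otherwise it stops in kdb first, and you need to type the kdb command "kgdb" to switch.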
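5. Connecting to the board
My PC runs win7, with ubuntu-16.04 in a vmware-15 virtual machine. The serial port is configured so that the VM has exclusive use of win7's COM1, which shows up in ubuntu as /dev/ttyS0.
The board runs the Linux Image, but debugging needs the vmlinux that carries debug information, plus the matching source tree, just as when debugging an application. From the kernel root directory run: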
linux-4.1.15$ sudo /opt/toolchain/arm-2012.03/bin/arm-none-linux-gnueabi-gdb vmlinux
GNU gdb (Sourcery CodeBench Lite 2012.03-57) 7.2.50.20100908-cvs
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-pc-linux-gnu --target=arm-none-linux-gnueabi".
For bug reporting instructions, please see:
<https://support.codesourcery.com/GNUToolchain/>...
Reading symbols from /home/vedic/project/firmware_3/build_dir/linux-imx6ul_imx6_pax/linux-4.1.15/vmlinux...done.
(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0
Remote debugging using /dev/ttyS0
kgdb_breakpoint () at kernel/debug/debug_core.c:1071
1071        arch_kgdb_breakpoint();
(gdb) set detach-on-fork on        /* explained below */
(gdb) l
1066     */
1067    noinline void kgdb_breakpoint(void)
1068    {
1069        atomic_inc(&kgdb_setting_breakpoint);
1070        wmb(); /* Sync point before breakpoint */
1071        arch_kgdb_breakpoint();
1072        wmb(); /* Sync point after breakpoint */
1073        atomic_dec(&kgdb_setting_breakpoint);
1074    }
1075    EXPORT_SYMBOL_GPL(kgdb_breakpoint);
(gdb) step
0xc00619bc in arch_kgdb_breakpoint () at kernel/debug/debug_core.c:1070
1070        wmb(); /* Sync point before breakpoint */
(gdb) step
kgdb_breakpoint () at kernel/debug/debug_core.c:1072
1072        wmb(); /* Sync point after breakpoint */
(gdb) l
1067    noinline void kgdb_breakpoint(void)
1068    {
1069        atomic_inc(&kgdb_setting_breakpoint);
1070        wmb(); /* Sync point before breakpoint */
1071        arch_kgdb_breakpoint();
1072        wmb(); /* Sync point after breakpoint */
1073        atomic_dec(&kgdb_setting_breakpoint);
1074    }
1075    EXPORT_SYMBOL_GPL(kgdb_breakpoint);
1076
(gdb) step
1073        atomic_dec(&kgdb_setting_breakpoint);
(gdb) b wake_up_process        /* set a breakpoint; this function is called very frequently by the system */
Breakpoint 1 at 0xc003b858: file kernel/sched/core.c, line 1762.
(gdb) c
Continuing.

Breakpoint 1, wake_up_process (p=0xc8279c00) at kernel/sched/core.c:1762
1762        WARN_ON(task_is_stopped_or_traced(p));
(gdb) l
1757     * It may be assumed that this function implies a write memory barrier before
1758     * changing the task state if and only if any tasks are woken up.
1759     */
1760    int wake_up_process(struct task_struct *p)
1761    {
1762        WARN_ON(task_is_stopped_or_traced(p));
1763        return try_to_wake_up(p, TASK_NORMAL, 0);
1764    }
1765    EXPORT_SYMBOL(wake_up_process);
1766
(gdb) bt
#0  wake_up_process (p=0xc8279c00) at kernel/sched/core.c:1762
#1  0xc0033484 in __queue_work (cpu=0, wq=0xc819aa80, work=0xc06d4f94) at kernel/workqueue.c:1386
#2  0xc004dc48 in call_timer_fn (timer=<value optimized out>, fn=0xc0033494 <delayed_work_timer_fn>, data=<value optimized out>) at kernel/time/timer.c:1153
#3  0xc004de1c in __run_timers (h=<value optimized out>) at kernel/time/timer.c:1221
#4  run_timer_softirq (h=<value optimized out>) at kernel/time/timer.c:1415
#5  0xc00267e8 in __do_softirq () at kernel/softirq.c:273
#6  0xc0026b64 in do_softirq_own_stack () at include/linux/interrupt.h:446
#7  invoke_softirq () at kernel/softirq.c:357
#8  irq_exit () at kernel/softirq.c:391
#9  0xc00467f4 in __handle_domain_irq (domain=0xc8006000, hwirq=<value optimized out>, lookup=true, regs=<value optimized out>) at kernel/irq/irqdesc.c:391
#10 0xc0009340 in handle_domain_irq (regs=0xc8715e78) at include/linux/irqdesc.h:147
#11 gic_handle_irq (regs=0xc8715e78) at drivers/irqchip/irq-gic.c:275
#12 0xc0011b40 in __irq_svc () at arch/arm/kernel/entry-armv.S:206
Backtrace stopped: frame did not save the PC
(gdb) next
1761    {
(gdb) next
1762        WARN_ON(task_is_stopped_or_traced(p));
(gdb) next
1763        return try_to_wake_up(p, TASK_NORMAL, 0);
(gdb) step
1764    }
(gdb) step
Cannot access memory at address 0x2
(gdb) try_to_wake_up (p=0xc8279c00, state=3, wake_flags=0) at kernel/sched/core.c:1657
1657    {
(gdb) l
1652     * Return: %true if @p was woken up, %false if it was already running.
1653     * or @state didn't match @p's state.
1654     */
1655    static int
1656    try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
1657    {
1658        unsigned long flags;
1659        int cpu, success = 0;
1660
1661        /*
(gdb) s
1668        raw_spin_lock_irqsave(&p->pi_lock, flags);
(gdb) n
1669        if (!(p->state & state))
(gdb) n
1707        raw_spin_unlock_irqrestore(&p->pi_lock, flags);
(gdb) n
1710    }
(gdb) n
__queue_work (cpu=0, wq=0xc819aa80, work=0xc06d4f94) at kernel/workqueue.c:1389
1389    }
(gdb) n
call_timer_fn (timer=<value optimized out>, fn=0xc0033494 <delayed_work_timer_fn>, data=<value optimized out>) at kernel/time/timer.c:1158
1158        if (count != preempt_count()) {
(gdb) n
1169    }
(gdb) n
__run_timers (h=<value optimized out>) at kernel/time/timer.c:1204
1204            while (!list_empty(head)) {
(gdb) c
Continuing.

Breakpoint 1, wake_up_process (p=0xc8278380) at kernel/sched/core.c:1762
1762        WARN_ON(task_is_stopped_or_traced(p));
(gdb) c
Continuing.

Breakpoint 1, wake_up_process (p=0xc8106e00) at kernel/sched/core.c:1762
1762        WARN_ON(task_is_stopped_or_traced(p));
(gdb) c
Continuing.

Breakpoint 1, wake_up_process (p=0xc804e700) at kernel/sched/core.c:1762
1762        WARN_ON(task_is_stopped_or_traced(p));
(gdb)
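When debugging the kernel with gdb, keep in mind that the kernel creates many child threads during initialization. By default gdb takes over all of them, and when you switch from one thread to another gdb immediately suspends the original one. That can easily make the kernel hang, so gdb needs a little configuration.
For multi-threaded debugging with gdb, two settings matter: follow-fork-mode and detach-on-fork.
detach-on-fork tells GDB whether to detach from one of the processes after a fork or to keep both under its control:
    set detach-on-fork [on|off]
    on:  the process not selected by follow-fork-mode is detached and runs independently.
    off: gdb keeps control of both parent and child. The process selected by follow-fork-mode is debugged and the other one is held suspended.
follow-fork-mode selects which process is debugged after a fork:
    set follow-fork-mode [parent|child]
    parent: keep debugging the parent after the fork; the child is not affected.
    child:  debug the child after the fork; the parent is not affected.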
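When no breakpoint is pending and you enter continue to let the system run, kgdboc no longer occupies the serial port, so the console can be used again.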
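The connection steps typed by hand in the session above can be collected into a gdb command file so they do not have to be retyped each time. This is a sketch under assumptions: kgdb_gdb_commands is a hypothetical helper name, and it emits "set remotebaud" because that is what the GDB 7.2 build in this post accepts (as far as I know, newer GDB releases spell it "set serial baud").

```shell
#!/bin/sh
# Hypothetical helper: print the gdb commands used in the session above so
# they can be saved to a file and replayed with gdb's -x option.
# Usage: kgdb_gdb_commands [serial-device] [baud]
kgdb_gdb_commands() {
    dev=${1:-/dev/ttyS0}     # serial device as seen from the Ubuntu VM
    baud=${2:-115200}        # must match the kgdboc= baud rate on the board
    # "set remotebaud" matches the old GDB used in this post; adjust it
    # if your GDB only understands "set serial baud".
    printf 'set remotebaud %s\n' "$baud"
    printf 'target remote %s\n' "$dev"
    printf 'set detach-on-fork on\n'
}
```

For example, "kgdb_gdb_commands /dev/ttyS0 115200 > kgdb.gdb" followed by "arm-none-linux-gnueabi-gdb -x kgdb.gdb vmlinux". The board must already be stopped in kgdb (section 4) before gdb connects.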
6. Debugging module.ko
Omitted......
7. Notes
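a. Use VMware for the virtual machine; the VirtualBox serial port is unreliable.
b. Make sure kgdboc loaded successfully. If you see "kgdb: Unregistered I/O driver, debugger disabled", or the node "/sys/module/kgdboc/parameters/kgdboc" does not exist, the serial driver most likely does not implement the following three functions:

    struct uart_ops {
    #ifdef CONFIG_CONSOLE_POLL
        int  (*poll_init)(struct uart_port *);
        void (*poll_put_char)(struct uart_port *, unsigned char);
        int  (*poll_get_char)(struct uart_port *);
    #endif
    };

   Adding them is straightforward. The serial driver is bound to have helpers for reading and writing the TX/RX registers and for checking the FIFO state, so the polling versions only need to query them in a loop:

    /* Return one received character, or NO_POLL_CHAR if the RX FIFO is empty. */
    static int serial_get_poll_char(struct uart_port *port)
    {
        if ((serial_in(port, ARM_UART_STS1) & 0xff) == 0)
            return NO_POLL_CHAR;
        return serial_in(port, ARM_UART_RXD);
    }

    static inline void wait_for_xmitr(struct uart_port *port);

    /* Busy-wait until the transmitter has room, then send one character. */
    static void serial_put_poll_char(struct uart_port *port, unsigned char c)
    {
        wait_for_xmitr(port);
        serial_out(port, ARM_UART_TXD, c);
    }

c. To enter kgdb during boot instead of later via "echo g > /proc/sysrq-trigger", change the cmdline to: "console=ttyAMA4,115200 kgdboc=ttyAMA4,115200 kgdbwait"
d. The gdb used on the Ubuntu PC is the same one as in the previous post.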
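The check described in note b can be scripted on the board. The sketch below uses a hypothetical helper name, kgdboc_status, and takes an optional sysfs root purely so it can be exercised off-target. It assumes that an empty parameter file means no port has been configured yet, which I have not verified on every kernel version.

```shell
#!/bin/sh
# Hypothetical helper: report the state of the kgdboc sysfs parameter.
# The optional argument overrides the sysfs root (only useful for testing).
# Prints "missing", "unset", or the configured value such as ttyAMA4,115200.
kgdboc_status() {
    node=${1:-/sys}/module/kgdboc/parameters/kgdboc
    if [ ! -e "$node" ]; then
        echo "missing"       # kgdboc not built in or failed to register
        return 1
    fi
    value=$(cat "$node")
    if [ -z "$value" ]; then
        echo "unset"         # assumption: empty file means no port configured
        return 1
    fi
    echo "$value"
}
```

On the board, running kgdboc_status with no argument reads the real node under /sys.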
I also recommend reading the previous post: https://www.cnblogs.com/vedic/p/11104204.html