Raspberry Pi 4B Advanced (11): Triggering a Panic with an Oops and Analyzing It with the crash Tool (2)
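The previous post introduced kdump and showed how to trigger a panic manually to generate a vmcore file. This post builds an oops kernel module with Yocto, uses it to trigger a panic, and analyzes the resulting dump with the crash tool.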
Environment

ubuntu 22.04
Raspberry Pi 4B
USB drive, 64 GB
yocto 4.1.2
vmware 16.2.4
kernel 5.15.56
kexec-tools 2.0.25
crash 8.0.0

Writing the oops kernel module

Create a new meta-test layer and add it to bblayers.conf:

    $ bitbake-layers create-layer meta-test
rpi-build/conf/bblayers.conf:

    BBLAYERS ?= " \
      ... \
      xxx/02.yocto/poky/meta-test \
      "

Under meta-test/recipes-example, create the oops recipe and its source files. The directory layout is:

    02.yocto/poky/meta-test/recipes-example$ tree -L 3 -n
    .
    ├── example
    │   └── example_0.1.bb
    └── oops
        ├── files
        │   ├── Makefile
        │   └── oops_test.c
        └── oops_1.0.bb

    3 directories, 4 files
oops_1.0.bb:

    SUMMARY = "Example of how to build an external Linux kernel module"
    DESCRIPTION = "oops test"
    LICENSE = "MIT"
    LIC_FILES_CHKSUM = "file://${COMMON_LICENSE_DIR}/MIT;md5=0835ade698e0bcf8506ecda2f7b4f302"

    inherit module

    SRC_URI = "file://oops_test.c \
               file://Makefile \
              "

    S = "${WORKDIR}"

    # Yocto 3.4 and later use the ":" override syntax instead of "_"
    RPROVIDES:${PN} += "kernel-module-oops"
Note: the LIC_FILES_CHKSUM value can be looked up yourself with md5sum:

    yocto/poky/meta/files/common-licenses$ md5sum MIT
    0835ade698e0bcf8506ecda2f7b4f302  MIT
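The lookup above can be wrapped in a small helper that prints the complete recipe line. This is only a sketch: lic_files_chksum is a hypothetical function name, and it assumes the license file is named exactly like the LICENSE value and lives in the common-licenses directory.

```shell
#!/bin/sh
# Print a ready-to-paste LIC_FILES_CHKSUM line for one license file.
# Assumption: the file sits in poky/meta/files/common-licenses, so it can be
# referenced through ${COMMON_LICENSE_DIR} in the recipe.
lic_files_chksum() {
    license_file="$1"
    if [ ! -f "$license_file" ]; then
        echo "no such file: $license_file" >&2
        return 1
    fi
    # md5sum prints "<hash>  <path>"; keep only the hash.
    sum=$(md5sum "$license_file" | cut -d' ' -f1)
    name=$(basename "$license_file")
    # Single quotes keep ${COMMON_LICENSE_DIR} literal for BitBake to expand.
    printf 'LIC_FILES_CHKSUM = "file://${COMMON_LICENSE_DIR}/%s;md5=%s"\n' "$name" "$sum"
}
```

Typical use from the poky checkout would be `lic_files_chksum meta/files/common-licenses/MIT`.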
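oops_test.c (adapted from the book 《奔跑吧 Linux内核 入门篇》):

    /* The header names were lost in the original listing; these are the
     * usual three for a minimal module. */
    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/init.h>

    static void create_oops(void)
    {
        *(int *)0 = 0;  /* write through a NULL pointer to raise the oops */
    }

    static int __init my_oops_init(void)
    {
        printk("oops module init\n");
        create_oops();
        return 0;
    }

    static void __exit my_oops_exit(void)
    {
        printk("goodbye\n");
    }

    MODULE_LICENSE("GPL");
    module_init(my_oops_init);
    module_exit(my_oops_exit);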
Makefile:

    # Recipe lines below must be indented with a tab.
    obj-m := oops_test.o

    SRC := $(shell pwd)

    KBUILD_CFLAGS += -g

    all:
    	$(MAKE) -C $(KERNEL_SRC) M=$(SRC) $(CFLAGS)

    modules_install:
    	$(MAKE) -C $(KERNEL_SRC) M=$(SRC) $(CFLAGS) modules_install

    clean:
    	rm -f *.mod modules.order Module.symvers

Add the oops module to the image in conf/local.conf:

    # add oops
    IMAGE_INSTALL:append = " kexec-tools makedumpfile oops"

By default the kernel uses xz module compression, so the module ends up installed in compressed form as oops_test.ko.xz. Turn this feature off through menuconfig, under the option:

    Module compression mode

Installing the crash tool with Yocto
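crash is a tool for analyzing kdump files. You can download the source from GitHub and cross-compile it yourself, or pull it in through the Yocto build. The default layers do not include crash, so meta-openembedded is needed. OpenEmbedded provides thousands of software recipes, organized into many useful layers such as meta-filesystems, meta-initramfs, meta-multimedia, meta-networking and meta-oe, each of which contains many recipes.

Download meta-openembedded. The branch used here is langdale, which matches Yocto 4.1 and is well maintained:

    git clone https://github.com/openembedded/meta-openembedded.git -b langdale

crash lives in the meta-oe layer of meta-openembedded, so that layer has to be added to bblayers.conf:

    BBLAYERS ?= " \
      ... \
      /home/lxq/Desktop/workspace/02.yocto/poky/meta-test \
      /home/lxq/Desktop/workspace/02.yocto/poky/meta-openembedded/meta-oe \
      "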
Check that the recipe is now visible:

    02.yocto/poky/rpi-build$ bitbake -s | grep crash
    crash                                                 :8.0.0-r0

Add the crash tool to the image in conf/local.conf:

    # add crash
    IMAGE_INSTALL:append = " kexec-tools makedumpfile crash oops"
crash source: GitHub, crash-utility/crash (Linux kernel crash utility)

Preparing the environment
Once the steps above are done, rebuild the image:

    $ bitbake core-image-base-rpi -c cleansstate
    $ bitbake core-image-base-rpi
After flashing, check that the environment is ready.

Confirm that oops_test.ko exists under /lib/modules/xxx/extra/:

    root@raspberrypi4:~# ls /lib/modules/5.15.56-v7l/extra/
    oops_test.ko

Confirm that the crash tool is installed. crash depends on vmlinux, so copy the vmlinux produced by the build onto the USB drive, for example:

    cp tmp/work/raspberrypi4-poky-linux-gnueabi/linux-raspberrypi/1_5.15.56+gitAUTOINC+3b1dc2f1fc_a90998a3e5-r0/linux-raspberrypi4-standard-build/vmlinux /media/lxq/root/home/

A kernel oops does not trigger a panic by default. This is controlled by panic_on_oops: with 0 an oops does not panic, with 1 it does.

    # query
    root@raspberrypi4:~# sysctl -a | grep panic_on_oops
    kernel.panic_on_oops = 0
    # set
    root@raspberrypi4:~# sysctl kernel.panic_on_oops=1
    kernel.panic_on_oops = 1

Now loading oops_test.ko with insmod triggers the panic:

    root@raspberrypi4:/lib/modules/5.15.56-v7l/extra# insmod oops_test.ko
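Because insmod takes the system down, it can help to verify the preconditions described above in one step before loading the module. The following is a minimal sketch, not part of the original setup: oops_preflight is a hypothetical helper, and the sysctl path is a parameter so the function can be exercised against test files (it defaults to /proc/sys/kernel/panic_on_oops, the file behind kernel.panic_on_oops).

```shell
#!/bin/sh
# Check that the module and vmlinux exist and that panic_on_oops is 1.
# Prints one line per problem and returns non-zero if anything is missing.
oops_preflight() {
    module="$1"
    vmlinux="$2"
    sysctl_file="${3:-/proc/sys/kernel/panic_on_oops}"
    status=0

    if [ ! -f "$module" ]; then
        echo "missing module: $module"
        status=1
    fi
    if [ ! -f "$vmlinux" ]; then
        echo "missing vmlinux: $vmlinux"
        status=1
    fi
    # An unreadable file is treated the same as a value other than 1.
    if [ "$(cat "$sysctl_file" 2>/dev/null)" != "1" ]; then
        echo "panic_on_oops is not 1: an oops will not trigger a panic"
        status=1
    fi

    if [ "$status" -eq 0 ]; then
        echo "ready"
    fi
    return "$status"
}
```

On the target from this post the call would be `oops_preflight /lib/modules/5.15.56-v7l/extra/oops_test.ko /home/vmlinux`.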
Confirm over the serial console that it worked:

    ...
    [ 477.870240] 1fc0: 00000000 00000000 018c6190 0000017b be82dbd8 004724a0 00000002 0048fdfc
    [ 477.878539] 1fe0: be82dba0 be82db90 0046d7c9 b6e89cd2
    [ 477.883665] r6:018c6190 r5:00000000 r4:00000000
    [ 477.888352] Code: e3040054 e34b0f18 eb7255ec e3a00000 (e5800000)
    [ 477.894561] Loading crashdump kernel...
    [ 477.898451] Bye!
    [ 0.000000] Booting

Analysis with the crash tool
Everything crash needs is now in place, so the analysis can start:

    root@raspberrypi4:~# crash /home/vmlinux /var/crash/2018-03-09/vmcore-12:35:38

    crash 8.0.0
    ...
    strings: standard output: Broken pipe
          KERNEL: /home/vmlinux  [TAINTED]
        DUMPFILE: /var/crash/2018-03-09/vmcore-12:35:38
            CPUS: 1
            DATE: Fri Mar 9 12:42:44 UTC 2018
          UPTIME: 00:07:57
    LOAD AVERAGE: 0.09, 0.07, 0.02
           TASKS: 113
        NODENAME: raspberrypi4
         RELEASE: 5.15.56-v7l
         VERSION: #1 Fri Jul 22 13:23:23 UTC 2022
         MACHINE: armv7l (unknown Mhz)
          MEMORY: 3.9 GB
           PANIC: "Unable to handle kernel NULL pointer dereference at virtual address 00000000"
             PID: 539
         COMMAND: "insmod"
            TASK: c2c2adc0 [THREAD_INFO: c1ad0000]
             CPU: 0
           STATE: TASK_RUNNING (PANIC)

    crash> bt|more
    PID: 539 TASK: c2c2adc0 CPU: 0 COMMAND: "insmod"
     #0 [] (_MODULE_INIT_START_oops_test [oops_test]) from []
     #1 [] (do_one_initcall) from []
     #2 [] (do_init_module) from []
     #3 [] (load_module) from []
Note: on ARM the crash tool has a display problem. Commands have to be piped through more (as in bt|more above), otherwise crash reports:

    /usr/bin/less: invalid option -- "P"
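As an alternative to piping every command through more, crash's built-in output scrolling can be turned off so that less is not invoked at all. This is a sketch under stated assumptions: crash is documented to read $HOME/.crashrc at startup and to accept `set scroll off`, but neither was verified on this image, and disable_crash_scroll is a hypothetical helper name.

```shell
#!/bin/sh
# Append "set scroll off" to a crash rc file unless the line is already there.
# Assumption: crash reads this file at startup (default: $HOME/.crashrc).
disable_crash_scroll() {
    rc_file="${1:-$HOME/.crashrc}"
    # -x matches the whole line, so the setting is never added twice.
    if ! grep -qx 'set scroll off' "$rc_file" 2>/dev/null; then
        echo 'set scroll off' >> "$rc_file"
    fi
}
```

Run `disable_crash_scroll` once on the target, then start crash as before. The same setting can also be typed at the crash> prompt for a single session.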