Author
Hank FU 付汉杰 hankf@xilinx.com
Test environment
Xilinx ZCU106 board
Xilinx VCU TRD 2020.1
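Introduction
In an embedded Linux system, Linux manages all CPUs directly. By default the system is tuned for throughput rather than real-time response. To protect real-time behavior, the CPUs can be controlled more precisely to suit the application. Common techniques are CPU isolation, process CPU affinity, interrupt CPU affinity, and process priority.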
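Tools
Embedded Linux systems usually rely on the BusyBox versions of ps, top, and similar tools. They are small but limited. More capable versions can be extracted from the Ubuntu root filesystem ubuntu-base-20.04.1-base-arm64.tar.gz.
For this test, ps and top were extracted from the Ubuntu filesystem and renamed u-ps and u-top to distinguish them from the BusyBox ps and top.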
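CPU isolation
By default Linux may schedule a process onto any CPU, so ordinary processes can degrade the performance of real-time processes. The kernel command-line parameter isolcpus implements CPU isolation: the scheduler no longer places processes on the listed CPUs unless they are explicitly assigned there, which protects the response time of real-time processes.
At the U-Boot prompt, run the following command so that Linux no longer schedules processes onto CPU2 and CPU3.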
setenv bootargs "earlycon clk_ignore_unused consoleblank=0 cma=1700M uio_pdrv_genirq.of_id=generic-uio isolcpus=2,3"
After Linux boots, check the kernel command-line parameters with "cat /proc/cmdline".
root@vcu_trd:~# cat /proc/cmdline
earlycon clk_ignore_unused consoleblank=0 cma=1700M uio_pdrv_genirq.of_id=generic-uio isolcpus=2,3
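As a quick cross-check, the isolated CPU list can be pulled out of the command line with a small helper. This is a minimal sketch: the function name isolcpus_from_cmdline is hypothetical, and it assumes the parameter appears as a plain isolcpus=<list> token.

```shell
#!/bin/sh
# Print the value of the isolcpus= parameter found in a kernel command line.
# Prints nothing if the parameter is absent.
isolcpus_from_cmdline() {
    for arg in $1; do
        case "$arg" in
            isolcpus=*) echo "${arg#isolcpus=}" ;;
        esac
    done
}

# Example: inspect the running kernel (only if /proc/cmdline is readable).
if [ -r /proc/cmdline ]; then
    echo "isolcpus: $(isolcpus_from_cmdline "$(cat /proc/cmdline)")"
fi
```

With the boot arguments shown above, the helper would report 2,3.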
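Next, use the psr field of the ps tool from the Ubuntu filesystem to see which CPU each process is running on. The second column of the output below is the CPU number. Most processes run on CPU0 and CPU1. In this listing, the tasks on CPU2 and CPU3 are per-CPU kernel threads (such as cpuhp, migration, ksoftirqd, and kworker) and IRQ threads; a user process lands there only when its CPU affinity is set explicitly, for example with taskset. The u-ps tool comes from the package ubuntu-base-20.04.1-base-arm64.tar.gz.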
root@vcu_trd:~# u-ps -axo pid,psr,cmd,ni
PID PSR CMD NI
1 1 init 0
2 1 [kthreadd] 0
3 0 [rcu_gp] -20
4 0 [rcu_par_gp] -20
6 0 [kworker/0:0H-kblockd] -20
8 0 [mm_percpu_wq] -20
9 0 [ksoftirqd/0] 0
10 1 [rcu_sched] 0
11 0 [migration/0] -
12 0 [cpuhp/0] 0
13 1 [cpuhp/1] 0
14 1 [migration/1] -
15 1 [ksoftirqd/1] 0
17 1 [kworker/1:0H-kblockd] -20
18 2 [cpuhp/2] 0
19 2 [migration/2] -
20 2 [ksoftirqd/2] 0
21 2 [kworker/2:0-events] 0
22 2 [kworker/2:0H] -20
23 3 [cpuhp/3] 0
24 3 [migration/3] -
25 3 [ksoftirqd/3] 0
26 3 [kworker/3:0-events] 0
27 3 [kworker/3:0H] -20
28 0 [kdevtmpfs] 0
29 1 [netns] -20
30 1 [kauditd] 0
32 1 [oom_reaper] 0
33 0 [writeback] -20
34 1 [kcompactd0] 0
35 1 [khugepaged] 19
37 0 [kworker/0:1-events] 0
38 1 [kworker/u8:1-events_unboun 0
87 1 [kblockd] -20
88 0 [blkcg_punt_bio] -20
89 0 [edac-poller] -20
90 1 [watchdogd] -
91 1 [rpciod] -20
92 0 [kworker/u9:0] -20
93 1 [xprtiod] -20
94 1 [cfg80211] -20
95 1 [kswapd0] 0
96 1 [ecryptfs-kthrea] 0
97 0 [nfsiod] -20
100 0 [irq/60-a00d0000] -
107 1 [ion_system_heap] -
108 2 [irq/61-a00d1000] -
109 0 [kpktgend_0] 0
110 1 [kpktgend_1] 0
111 2 [kpktgend_2] 0
112 3 [kpktgend_3] 0
113 1 [ipv6_addrconf] -20
114 0 [krfcommd] -10
115 2 [kworker/2:1-events] 0
116 3 [kworker/3:1-events] 0
117 0 [kworker/0:2-events_power_e 0
118 0 [irq/47-fd4a0000] -
120 1 [scsi_eh_0] 0
121 1 [scsi_tmf_0] -20
122 1 [scsi_eh_1] 0
123 1 [scsi_tmf_1] -20
125 1 [spi0] 0
128 1 [sdhci] -20
129 0 [irq/41-mmc0] -
134 0 [mmc_complete] -20
135 0 [kworker/0:1H-mmc_complete] -20
165 1 [kworker/1:1H-kblockd] -20
181 0 /sbin/udevd -d 0
231 1 [irq/63-xilinx-v] -
238 1 [xilinx-hdmi-rx] -20
242 1 [irq/54-xilinx-h] -
246 1 [irq/52-xilinx-h] -
419 0 [irq/62-a0220000] -
420 2 [irq/62-a0200000] -
808 0 udhcpc -R -b -p /var/run/ud 0
815 1 /usr/bin/dbus-daemon --syst 0
818 0 /usr/sbin/haveged -w 1024 - 0
827 1 xinit /etc/X11/Xsession -- 0
831 0 /usr/bin/Xorg :0 -br -pn -1
836 1 matchbox-window-manager -th 0
841 1 dbus-launch --sh-syntax --e 0
842 1 /usr/bin/dbus-daemon --sysl 0
862 0 /usr/sbin/dropbear -r /etc/ 0
864 0 /usr/libexec/at-spi-bus-lau 0
874 0 /usr/libexec/gconfd-2 0
882 0 /usr/bin/dbus-daemon --conf 0
885 1 /usr/sbin/inetd 0
888 0 /usr/bin/settings-daemon 0
898 0 /sbin/syslogd -n -O /var/lo 0
901 0 /sbin/klogd -n 0
910 0 matchbox-desktop 0
911 0 matchbox-panel --start-appl 0
921 0 /usr/sbin/tcf-agent -d -L- 0
929 0 /bin/sh /bin/start_getty 11 0
930 0 /sbin/getty 38400 tty1 0
934 1 /usr/libexec/at-spi2-regist 0
940 1 /usr/sbin/console-kit-daemo 0
1025 0 /bin/login -- 0
1050 1 -sh 0
1055 0 /usr/sbin/dropbear -r /etc/ 0
1057 0 -sh 0
1063 0 /usr/sbin/dropbear -r /etc/ 0
1065 0 -sh 0
1071 0 top 0
7174 0 /usr/sbin/dropbear -r /etc/ 0
7176 0 -sh 0
22378 0 /usr/sbin/dropbear -r /etc/ 0
22380 0 -sh 0
22588 0 /usr/sbin/dropbear -r /etc/ 0
22590 0 -sh 0
22600 1 [kworker/1:0-events] 0
22601 1 [kworker/u8:0-events_unboun 0
22602 1 [kworker/1:2-events_power_e 0
22603 1 [kworker/u8:2-events_unboun 0
22606 0 u-ps -axo pid,psr,cmd,ni 0
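Process CPU affinity
Setting the CPU affinity of a process requires its process ID (PID), which tools such as ps and top display.
The taskset tool displays and controls the CPU affinity of a process. With the -p option and a PID, it prints the CPU affinity of that process.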
root@vcu_trd:~# taskset -p 815
pid 815's current affinity mask: 1
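The same information is available from the Cpus_allowed_list field of /proc/<pid>/status, which reports the affinity as a CPU list rather than a mask. The sketch below is only an illustration; the helper name allowed_cpus_from_status is hypothetical.

```shell
#!/bin/sh
# Print the Cpus_allowed_list field from a /proc/<pid>/status style file.
allowed_cpus_from_status() {
    awk '$1 == "Cpus_allowed_list:" { print $2 }' "$1"
}

# Example: show the CPUs the current shell may run on.
if [ -r "/proc/$$/status" ]; then
    echo "PID $$ may run on CPUs: $(allowed_cpus_from_status "/proc/$$/status")"
fi
```

This can be handy on targets where taskset is not installed.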
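The following script checks the CPU affinity of every process except those whose command lines match gst, xilinx, irq, or kworker.
#!/bin/sh
# Collect the PIDs; NR > 1 skips the header line printed by u-ps.
u-ps -axo pid,psr,cmd,ni | grep -v "gst" | grep -v "xilinx" | grep -v "irq" | grep -v "kworker" | grep -v "grep" | grep -v "awk" | awk 'NR > 1 {print $1}' > process_list.txt
echo -e "\nRead process list file:"
cat process_list.txt | while read line
do
    # echo "CPU affinity for process ID: $line"
    taskset -p $line
done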
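A Linux system runs many processes, so a script such as the following can set the CPU affinity of all of them in one pass. Here every process with a PID above 500 is pinned to CPU0 (mask 1).
#!/bin/sh
# Collect the PIDs; NR > 1 skips the header line printed by u-ps.
u-ps -axo pid,psr,cmd,ni | grep -v "grep" | grep -v "awk" | awk 'NR > 1 {print $1}' > process_list.txt
cat process_list.txt | while read line
do
    echo -e "\nCheck process ID: $line"
    # Only touch PIDs above 500; in the listing above, lower PIDs are mostly kernel threads.
    if [ $line -gt 500 ]; then
        # echo "Original CPU affinity for process ID: $line"
        # taskset -p $line
        echo "Set CPU affinity for process ID: $line"
        # Mask 1 selects CPU0; -a applies it to all threads of the process.
        taskset -a -p 1 $line
        # echo "New CPU affinity for process ID: $line"
        # taskset -p $line
    fi
done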
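The PID threshold is a rough way to skip kernel threads. One possible alternative, sketched here under the assumption that kernel threads have an empty /proc/<pid>/cmdline (zombie processes do as well), is to test each process directly. The function names and the script name set_affinity.sh are hypothetical.

```shell
#!/bin/sh
# Return success if the given /proc/<pid> directory looks like a kernel thread,
# i.e. its cmdline file has no content. Zombie processes also match.
is_kernel_thread() {
    [ -z "$(tr -d '\0' < "$1/cmdline")" ]
}

# Print the PIDs of user-space processes found under a proc-style directory.
list_user_pids() {
    for dir in "$1"/[0-9]*; do
        [ -r "$dir/cmdline" ] || continue
        is_kernel_thread "$dir" || echo "${dir##*/}"
    done
}

# Apply the affinity mask only when explicitly requested, for example:
#   sh set_affinity.sh apply 1
if [ "${1:-}" = "apply" ]; then
    for pid in $(list_user_pids /proc); do
        taskset -a -p "${2:-1}" "$pid"
    done
fi
```

The guard on the first argument keeps the script from changing anything unless it is run with "apply".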
A new task can also be given a CPU affinity when it is started. The taskset usage line is:
taskset [options] [mask | cpu-list] [pid|cmd [args...]]
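To start a new task with a given CPU affinity, use the following form. The argument is interpreted as a bitmask; with the -c option it is interpreted as a CPU list instead.
taskset -a mask cmd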
For example, starting top with "taskset -a 8 top" (mask 8 selects CPU3) shows that it does run on CPU3.
root@vcu_trd:~# u-ps -axo pid,psr,cmd,ni | grep top | grep -v grep | grep -v match
22629 3 top 0
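The masks passed to taskset are hexadecimal bitmaps in which bit N stands for CPU N, so mask 1 is CPU0 and mask 8 is CPU3. A minimal sketch of the conversion follows; the function name cpu_list_to_mask is hypothetical and CPU ranges such as 0-3 are not handled.

```shell
#!/bin/sh
# Convert a comma-separated CPU list such as "0,3" into a hexadecimal
# affinity mask. CPU ranges such as "0-3" are not handled in this sketch.
cpu_list_to_mask() {
    mask=0
    for cpu in $(echo "$1" | tr ',' ' '); do
        mask=$(( $mask | (1 << $cpu) ))
    done
    printf '%x\n' "$mask"
}

cpu_list_to_mask 3      # CPU3 only: prints 8
cpu_list_to_mask 2,3    # CPU2 and CPU3: prints c
```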
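Interrupt CPU affinity
By default, Linux handles ordinary peripheral interrupts on CPU0. Writing a CPU mask to /proc/irq/irq_number/smp_affinity changes which CPU handles that interrupt, and /proc/interrupts shows how many interrupts each CPU has handled.
A Linux system also has many interrupts, so a script such as the following can set the CPU affinity of all of them. Adjust the interrupt-to-CPU mapping to suit the use case.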
#!/bin/sh
cat /proc/interrupts > interrupts_list_all.txt
cat /proc/interrupts | grep -v "CPU" | grep -v "IPI" | grep -v "Err" | awk '{print $1}' > interrupts_list.txt
echo -e "\nRead interrupts list file:"
cat interrupts_list.txt | while read line
do
    # Remove the trailing colon.
    line_new=${line%:}
    echo -e "\nCheck interrupt: $line_new"
    ls -l -h /proc/irq/$line_new/smp_affinity
    # 48: GICv2 122 Level xilinx_framebuffer
    # 52: GICv2 123 Level xilinx-hdmi-rx
    # 54: GICv2 125 Level xilinx-hdmitxss
    # 55: GICv2 127 Level xlnx-mixer
    # 61: GICv2 139 Level a00d1000.sync_ip
    # 62: GICv2 128 Level a0200000.al5e, a0220000.al5d
    # 63: GICv2 124 Level xilinx-vphy
    # Masks: 1 = CPU0, 2 = CPU1, 4 = CPU2.
    if [ $line_new -eq 48 ]; then
        echo -e "\nSet CPU:1 affinity for interrupt: $line_new"
        echo 2 > /proc/irq/$line_new/smp_affinity
    elif [ $line_new -eq 52 ]; then
        echo -e "\nSet CPU:1 affinity for interrupt: $line_new"
        echo 2 > /proc/irq/$line_new/smp_affinity
    elif [ $line_new -eq 54 ]; then
        echo -e "\nSet CPU:1 affinity for interrupt: $line_new"
        echo 2 > /proc/irq/$line_new/smp_affinity
    elif [ $line_new -eq 55 ]; then
        echo -e "\nSet CPU:1 affinity for interrupt: $line_new"
        echo 2 > /proc/irq/$line_new/smp_affinity
    elif [ $line_new -eq 61 ]; then
        echo -e "\nSet CPU:2 affinity for interrupt: $line_new"
        echo 4 > /proc/irq/$line_new/smp_affinity
    elif [ $line_new -eq 62 ]; then
        echo -e "\nSet CPU:2 affinity for interrupt: $line_new"
        echo 4 > /proc/irq/$line_new/smp_affinity
    elif [ $line_new -eq 63 ]; then
        # Mask 2 selects CPU1, so the message names CPU:1.
        echo -e "\nSet CPU:1 affinity for interrupt: $line_new"
        echo 2 > /proc/irq/$line_new/smp_affinity
    else
        echo -e "\nSet CPU:0 affinity for interrupt: $line_new"
        echo 1 > /proc/irq/$line_new/smp_affinity
    fi
    echo -e "\nNew CPU affinity for interrupt: $line_new"
    cat /proc/irq/$line_new/smp_affinity
done
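The interrupt-to-mask mapping in the script above can also be kept in a single lookup function, which makes it easier to change per scenario. This is a sketch only: the function name irq_mask and the script name set_irq_affinity.sh are hypothetical, and the IRQ numbers are the ones from this test design.

```shell
#!/bin/sh
# Map an IRQ number to the affinity mask used in this test setup.
# The IRQ numbers are specific to this design and will differ elsewhere.
irq_mask() {
    case "$1" in
        48|52|54|55|63) echo 2 ;;   # CPU1
        61|62)          echo 4 ;;   # CPU2
        *)              echo 1 ;;   # CPU0
    esac
}

# Write the masks only when explicitly requested:
#   sh set_irq_affinity.sh apply
if [ "${1:-}" = "apply" ]; then
    for dir in /proc/irq/[0-9]*; do
        irq=${dir##*/}
        irq_mask "$irq" > "$dir/smp_affinity" ||
            echo "IRQ $irq: could not set affinity"
        echo "IRQ $irq: $(cat "$dir/smp_affinity")"
    done
fi
```

Some interrupts may refuse a new affinity, so the loop reports a failed write and continues.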
After the affinities are set, /proc/interrupts shows that CPU1 handles interrupts 48, 52, 54, and 55, and CPU2 handles interrupts 61 and 62.
root@vcu_trd:~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
3: 115462 135783 31811 204151 GICv2 30 Level arch_timer
6: 0 0 0 0 GICv2 67 Level zynqmp_ipi
7: 0 0 0 0 GICv2 175 Level arm-pmu
8: 0 0 0 0 GICv2 176 Level arm-pmu
9: 0 0 0 0 GICv2 177 Level arm-pmu
10: 0 0 0 0 GICv2 178 Level arm-pmu
12: 349750 0 0 0 GICv2 156 Level zynqmp-dma
13: 0 0 0 0 GICv2 157 Level zynqmp-dma
14: 0 0 0 0 GICv2 158 Level zynqmp-dma
15: 0 0 0 0 GICv2 159 Level zynqmp-dma
16: 0 0 0 0 GICv2 160 Level zynqmp-dma
17: 0 0 0 0 GICv2 161 Level zynqmp-dma
18: 0 0 0 0 GICv2 162 Level zynqmp-dma
19: 0 0 0 0 GICv2 163 Level zynqmp-dma
20: 0 0 0 0 GICv2 164 Level Mali_GP_MMU, Mali_GP, Mali_PP0_MMU, Mali_PP0, Mali_PP1_MMU, Mali_PP1
21: 0 0 0 0 GICv2 109 Level zynqmp-dma
22: 0 0 0 0 GICv2 110 Level zynqmp-dma
23: 0 0 0 0 GICv2 111 Level zynqmp-dma
24: 0 0 0 0 GICv2 112 Level zynqmp-dma
25: 0 0 0 0 GICv2 113 Level zynqmp-dma
26: 0 0 0 0 GICv2 114 Level zynqmp-dma
27: 0 0 0 0 GICv2 115 Level zynqmp-dma
28: 0 0 0 0 GICv2 116 Level zynqmp-dma
30: 463312 0 0 0 GICv2 95 Level eth0, eth0
32: 525 0 0 0 GICv2 49 Level cdns-i2c
33: 113 0 0 0 GICv2 50 Level cdns-i2c
34: 0 0 0 0 GICv2 42 Level ff960000.memory-controller
35: 0 0 0 0 GICv2 57 Level axi-pmon, axi-pmon
36: 181 0 0 0 GICv2 155 Level axi-pmon, axi-pmon
37: 28 0 0 0 GICv2 47 Level ff0f0000.spi
38: 0 0 0 0 GICv2 58 Level ffa60000.rtc
39: 0 0 0 0 GICv2 59 Level ffa60000.rtc
40: 0 0 0 0 GICv2 165 Level ahci-ceva[fd0c0000.ahci]
41: 233 0 0 0 GICv2 81 Level mmc0
42: 133 0 0 0 GICv2 53 Level xuartps
44: 0 0 0 0 GICv2 84 Edge ff150000.watchdog
45: 0 0 0 0 GICv2 88 Level ams-irq
46: 12 0 0 0 GICv2 154 Level fd4c0000.dma
47: 0 0 0 0 GICv2 151 Level fd4a0000.zynqmp-display
48: 0 34920 0 0 GICv2 122 Level xilinx_framebuffer
49: 0 0 0 0 GICv2 141 Level xilinx_framebuffer
50: 0 0 0 0 GICv2 142 Level xilinx_framebuffer
51: 0 0 0 0 GICv2 143 Level xilinx_framebuffer
52: 0 1142094 0 0 GICv2 123 Level xilinx-hdmi-rx
53: 0 0 0 0 GICv2 121 Level xilinx_framebuffer
54: 17669 151552 0 0 GICv2 125 Level xilinx-hdmitxss
55: 17672 151552 0 0 GICv2 127 Level xlnx-mixer
56: 0 0 0 0 GICv2 136 Level xilinx-dma-controller
57: 0 0 0 0 GICv2 137 Level xilinx-dma-controller
58: 0 0 0 0 GICv2 138 Level xilinx-dma-controller
59: 0 0 0 0 GICv2 140 Level xilinx-dma-controller
60: 81 0 0 0 GICv2 126 Level a00d0000.i2c
61: 0 0 69841 0 GICv2 139 Level a00d1000.sync_ip
62: 4 0 279353 0 GICv2 128 Level a0220000.al5d, a0200000.al5e
63: 1184 163 0 0 GICv2 124 Level xilinx-vphy
64: 0 0 0 0 GICv2 97 Level xhci-hcd:usb1
67: 0 0 0 0 zynq-gpio 22 Edge sw19
IPI0: 64845 46081 35 663483 Rescheduling interrupts
IPI1: 19 58 29 29 Function call interrupts
IPI2: 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts
IPI4: 0 0 0 0 Timer broadcast interrupts
IPI5: 0 0 0 0 IRQ work interrupts
IPI6: 0 0 0 0 CPU wake-up interrupts
Err: 0
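A long /proc/interrupts listing is easier to check when it is reduced to one line per active interrupt. The sketch below prints, for each numbered IRQ with a non-zero count, the CPU that handled the most interrupts. The function name busiest_cpu_per_irq is hypothetical, and the script assumes the first line is the CPU header.

```shell
#!/bin/sh
# Read /proc/interrupts-style text on stdin and print, for every numbered
# IRQ with a non-zero count, the CPU that handled the most interrupts.
busiest_cpu_per_irq() {
    awk '
        NR == 1 { ncpu = NF; next }          # header: CPU0 CPU1 ...
        $1 ~ /^[0-9]+:$/ {
            irq = $1; sub(/:/, "", irq)
            best = 0; max = 0
            for (i = 0; i < ncpu; i++) {
                count = $(i + 2) + 0
                if (count > max) { max = count; best = i }
            }
            if (max > 0) print irq, "CPU" best, max
        }'
}

if [ -r /proc/interrupts ]; then
    busiest_cpu_per_irq < /proc/interrupts
fi
```

The counters are cumulative since boot, so an interrupt that was moved recently may still show its old CPU as the busiest.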
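Process priority
Process priority in Linux is a fairly complicated concept, and the values shown by top and "ps -l" even differ from each other. In short, Linux process priorities range from 0 to 139, and a lower value means a higher priority. Real-time processes use 0-99 and normal processes use 100-139. A real-time process always has a higher priority than a normal process.
The nice value is the "politeness" of a normal process and runs opposite to its priority: the higher the nice value, the lower the priority. It ranges from -20 to 19, which corresponds to the actual priority range 100-139.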
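The mapping between the two ranges is linear: priority = 120 + nice. A small worked example follows; the function name nice_to_priority is hypothetical.

```shell
#!/bin/sh
# Map a nice value (-20..19) to the priority value (100..139)
# described above: priority = 120 + nice.
nice_to_priority() {
    if [ "$1" -lt -20 ] || [ "$1" -gt 19 ]; then
        echo "nice value out of range: $1" >&2
        return 1
    fi
    echo $((120 + $1))
}

nice_to_priority -20   # highest normal priority: prints 100
nice_to_priority 0     # default nice value: prints 120
nice_to_priority 19    # lowest normal priority: prints 139
```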
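For a normal process, the renice tool changes the priority by setting the nice value. The larger the nice value, the lower the priority of the process. The usual form is renice PRIORITY -p pid, where PRIORITY is the nice value and pid is the process ID.
The script below sets the nice value of every process whose command line contains a keyword (the first script argument, $1) to the value of the second argument ($2).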
#!/bin/sh
# $1: keyword to match in the process list, $2: nice value to apply.
u-ps -axo pid,psr,cmd,ni
u-ps -axo pid,psr,cmd,ni | grep "$1" | grep -v "grep" | grep -v "awk" | awk '{print $1}' > process_list.txt
echo -e "\nRead process list file:"
cat process_list.txt | while read line
do
    echo -e "\nSet PID: $line to priority-nice value: $2\n"
    renice $2 -p $line
done
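If normal priorities are not sufficient, chrt can turn a process into a real-time process. The command format is "chrt -p priority {pid}". For example, "chrt -p 50 28195" sets the priority of the process with PID 28195 to 50.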
Miscellaneous
To improve real-time performance further, consider applying the Linux RT patch (PREEMPT_RT) to the Linux kernel.