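Reading the Kernel with QEMU and eBPF

The previous post laid out why I study the kernel, the direction of my work, and a rough learning path. This post covers tools that make the analysis more efficient.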

QEMU

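There are already plenty of articles online about debugging the kernel with QEMU, for example Booting a Custom Linux Kernel in QEMU and Debugging It With GDB.

QEMU is powerful, but its command-line options are not friendly to newcomers. I wrote up the basic usage of QEMU; for debugging, see this script.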
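For orientation, here is a minimal sketch of a debug launch. It is not the script referenced above. The function name qemu_debug, the DRY_RUN switch, the memory size, and the image paths are all assumptions made for illustration; the QEMU options themselves (-kernel, -initrd, -append, -nographic, -s, -S) are standard.

```shell
#!/bin/sh
# Sketch: boot a self-built kernel in QEMU with the gdb stub enabled.
# The image paths passed in are placeholders, not the author's layout.
qemu_debug() {
  # $1: kernel image (e.g. arch/x86/boot/bzImage), $2: initramfs archive
  set -- qemu-system-x86_64 \
    -kernel "$1" \
    -initrd "$2" \
    -append "console=ttyS0 nokaslr" \
    -nographic \
    -m 2G \
    -s -S
  # -s opens a gdbserver on tcp::1234; -S holds the CPU until gdb continues.
  # nokaslr keeps kernel addresses matching the symbols in vmlinux.
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$@"   # print one argument per line instead of running
  else
    "$@"
  fi
}
```

With the guest paused, attaching from a second terminal with gdb vmlinux -ex 'target remote :1234' should let you set breakpoints before the kernel starts running.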

FlameGraph

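A FlameGraph quickly shows the rough execution flow of a set of functions, without jumping through definitions one by one in an editor, which makes it very handy. Its limitation is that the stacks it captures always seem to start from a user-space system call.

For details, see https://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html

perf may need to be installed first: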

sudo apt install linux-tools-common linux-tools-generic linux-tools-`uname -r`

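The final result is shown below; open it in a new window to interact with it.

I use this tool very frequently, so I put together a simple script, for example: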

./flamegraph.sh -c 'iperf -c localhost' -g iperf
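The internals of flamegraph.sh are not shown here, so the following is only a sketch of the usual perf pipeline from the page linked above, not the author's script. The function name flamegraph_for and the FLAMEGRAPH_DIR variable are assumptions; FLAMEGRAPH_DIR is expected to point at a checkout of Brendan Gregg's FlameGraph repository, which provides stackcollapse-perf.pl and flamegraph.pl.

```shell
#!/bin/sh
# Sketch of a perf-to-SVG pipeline; not the author's flamegraph.sh.
# FLAMEGRAPH_DIR is assumed to be a checkout of brendangregg/FlameGraph.
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-$HOME/FlameGraph}"

flamegraph_for() {
  # $1: command line to profile, $2: base name for the output files
  # Sample all CPUs at 99 Hz with call graphs while the command runs.
  perf record -F 99 -a -g -o "$2.data" -- sh -c "$1" || return 1
  # Fold each sampled stack into one line, then render the SVG.
  perf script -i "$2.data" |
    "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" |
    "$FLAMEGRAPH_DIR/flamegraph.pl" > "$2.svg"
}
```

From a root shell, flamegraph_for 'iperf -c localhost' iperf would then write iperf.data and iperf.svg to the current directory.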

BCC is also a good option.

bpftrace

With bpftrace's kprobe it is easy to capture the backtrace of a kernel function dynamically.

First, use sudo bpftrace -l | fzf to browse the available probe points.

Script:

#!/usr/bin/bpftrace
kprobe:task_tick_fair
{
  @[kstack] = count();
}

Result:

@[
    task_tick_fair+1
    update_process_times+187
    tick_sched_handle.isra.0+37
    tick_sched_timer+109
    __hrtimer_run_queues+251
    hrtimer_interrupt+265
    __sysvec_apic_timer_interrupt+100
    sysvec_apic_timer_interrupt+56
    asm_sysvec_apic_timer_interrupt+18
]: 171

debug directory

[ ] gcov

perf-scripts

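There is no stackcount here, but by comparison it is already quite usable.

Observing the kernel

TODO: collect the kernel debugging tools that are specific to each module.

The output of stackcount and bpftrace still seems to differ: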

🧀  sudo stackcount find_vma
[sudo] password for martins3:
In file included from /virtual/main.c:1:
In file included from include/uapi/linux/ptrace.h:134:
In file included from arch/x86/include/asm/ptrace.h:5:
In file included from arch/x86/include/asm/segment.h:7:
arch/x86/include/asm/ibt.h:55:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
extern __noendbr u64 ibt_save(void);
       ^
arch/x86/include/asm/ibt.h:20:34: note: expanded from macro '__noendbr'
#define __noendbr       __attribute__((nocf_check))
                                       ^
arch/x86/include/asm/ibt.h:56:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
extern __noendbr void ibt_restore(u64 save);
       ^
arch/x86/include/asm/ibt.h:20:34: note: expanded from macro '__noendbr'
#define __noendbr       __attribute__((nocf_check))
                                       ^
2 warnings generated.
Tracing 1 functions for "find_vma"... Hit Ctrl-C to end.
^C
  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  x64_setup_rt_frame
  arch_do_signal_or_restart
  exit_to_user_mode_prepare
  syscall_exit_to_user_mode
  do_syscall_64
  entry_SYSCALL_64_after_hwframe
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
    1

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
    2

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
    2

  find_vma
  do_user_addr_fault
  exc_page_fault
  asm_exc_page_fault
  [unknown]
  [unknown]
    3

Detaching...
vn on  master [!+⇡] took 3s
🧀  t find_vma
find_vma
+ [[ false == true ]]
+ sudo bpftrace -e 'kprobe:find_vma {  @[kstack] = count(); }'
Attaching 1 probe...
^C

@[
    find_vma+5
    do_user_addr_fault+349
    exc_page_fault+107
    asm_exc_page_fault+38
    copy_user_enhanced_fast_string+14
    _copy_to_iter+197
    copy_page_to_iter+138
    filemap_read+428
    vfs_read+517
    __x64_sys_pread64+148
    do_syscall_64+60
    entry_SYSCALL_64_after_hwframe+114
]: 1
@[
    find_vma+5
    do_madvise.part.0+560
    __x64_sys_madvise+94
    do_syscall_64+60
    entry_SYSCALL_64_after_hwframe+114
]: 1
@[
    find_vma+5
    do_user_addr_fault+349
    exc_page_fault+107
    asm_exc_page_fault+38
    copy_user_enhanced_fast_string+14
    _copy_to_iter+488
    __skb_datagram_iter+409
    skb_copy_datagram_iter+55
    unix_stream_read_actor+30
    unix_stream_read_generic+748
    unix_stream_recvmsg+140
    ____sys_recvmsg+135
    ___sys_recvmsg+124
    __sys_recvmsg+90
    do_syscall_64+60
    entry_SYSCALL_64_after_hwframe+114
]: 3
@[
    find_vma+5
    do_user_addr_fault+349
    exc_page_fault+107
    asm_exc_page_fault+38
]: 239
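
One likely reason for the difference: BCC's stackcount records the user-space stack together with the kernel stack by default, so samples that share the same kernel frames are split by their user frames, which print as [unknown] when the symbols cannot be resolved. The bpftrace one-liner keys only on kstack, so those samples collapse into a single entry with a large count. The sketch below generates both variants of the bpftrace program; kstack_prog is a hypothetical helper name, and this explanation is an inference from the two outputs rather than something verified on the author's machine.

```shell
#!/bin/sh
# Sketch: emit a bpftrace program that counts stacks for one kernel function.
# kstack_prog is a hypothetical helper, not part of bpftrace or BCC.
kstack_prog() {
  # $1: kernel function to probe
  # $2: pass "user" to key on the user stack as well as the kernel stack
  if [ "${2:-}" = user ]; then
    printf 'kprobe:%s { @[kstack, ustack] = count(); }\n' "$1"
  else
    printf 'kprobe:%s { @[kstack] = count(); }\n' "$1"
  fi
}

# Usage (needs root):
#   sudo bpftrace -e "$(kstack_prog find_vma)"       # kernel stacks only
#   sudo bpftrace -e "$(kstack_prog find_vma user)"  # closer to stackcount's default
#   sudo stackcount -K find_vma                      # kernel stacks only, closer to bpftrace
```

If the explanation holds, stackcount -K should merge the many single-count entries into a few kernel-only stacks, much like the bpftrace output.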