Путеводитель по Руководству Linux

  User  |  Syst  |  Libr  |  Device  |  Files  |  Other  |  Admin  |  Head  |



   perf-intel-pt    ( 1 )

поддержка Intel Processor Trace в инструментах perf (Support for Intel Processor Trace within perf tools)

  Name  |  Synopsis  |  Description  |  Quickstart  |  Perf record  |  Perf script  |  Perf report  |  Perf inject  |  Pebs via intel pt  |  Xed  |    Tracing virtual machines    |  See also  |

TRACING VIRTUAL MACHINES

Currently, only kernel tracing is supported and only with either
       "timeless" decoding (i.e. no TSC timestamps) or VM Time
       Correlation. VM Time Correlation is an extra step using perf
       inject and requires unchanging VMX TSC Offset and no VMX TSC
       Scaling.

Other limitations and caveats

VMX controls may suppress packets needed for decoding resulting in decoding errors VMX controls may block the perf NMI to the host potentially resulting in lost trace data Guest kernel self-modifying code (e.g. jump labels or JIT-compiled eBPF) will result in decoding errors Guest thread information is unknown Guest VCPU is unknown but may be able to be inferred from the host thread Callchains are not supported

Example using "timeless" decoding

Start VM

$ sudo virsh start kubuntu20.04 Domain kubuntu20.04 started

Mount the guest file system. Note sshfs needs -o direct_io to enable reading of proc files. root access is needed to read /proc/kcore.

$ mkdir vm0 $ sshfs -o direct_io root@vm0:/ vm0

Copy the guest /proc/kallsyms, /proc/modules and /proc/kcore

$ perf buildid-cache -v --kcore vm0/proc/kcore kcore added to build-id cache directory /home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d9710538ec5f6a08/2021021807494306 $ KALLSYMS=/home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d9710538ec5f6a08/2021021807494306/kallsyms

Find the VM process

$ ps -eLl | grep 'KVM\|PID' F S UID PID PPID LWP C PRI NI ADDR SZ WCHAN TTY TIME CMD 3 S 64055 1430 1 1440 1 80 0 - 1921718 - ? 00:02:47 CPU 0/KVM 3 S 64055 1430 1 1441 1 80 0 - 1921718 - ? 00:02:41 CPU 1/KVM 3 S 64055 1430 1 1442 1 80 0 - 1921718 - ? 00:02:38 CPU 2/KVM 3 S 64055 1430 1 1443 2 80 0 - 1921718 - ? 00:03:18 CPU 3/KVM

Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to stop. TSC is not supported and tsc=0 must be specified. That means mtc is useless, so add mtc=0. However, IPC can still be determined, hence cyc=1 can be added. Only kernel decoding is supported, so k must be specified. Intel PT traces both the host and the guest so --guest and --host need to be specified. Without timestamps, --per-thread must be specified to distinguish threads.

$ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/tsc=0,mtc=0,cyc=1/k -p 1430 --per-thread ^C [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 5.829 MB ]

perf script can be used to provide an instruction trace

$ perf script --guestkallsyms $KALLSYMS --insn-trace --xed -F+ipc | grep -C10 vmresume | head -21 CPU 0/KVM 1440 ffffffff82133cdd __vmx_vcpu_run+0x3d ([kernel.kallsyms]) movq 0x48(%rax), %r9 CPU 0/KVM 1440 ffffffff82133ce1 __vmx_vcpu_run+0x41 ([kernel.kallsyms]) movq 0x50(%rax), %r10 CPU 0/KVM 1440 ffffffff82133ce5 __vmx_vcpu_run+0x45 ([kernel.kallsyms]) movq 0x58(%rax), %r11 CPU 0/KVM 1440 ffffffff82133ce9 __vmx_vcpu_run+0x49 ([kernel.kallsyms]) movq 0x60(%rax), %r12 CPU 0/KVM 1440 ffffffff82133ced __vmx_vcpu_run+0x4d ([kernel.kallsyms]) movq 0x68(%rax), %r13 CPU 0/KVM 1440 ffffffff82133cf1 __vmx_vcpu_run+0x51 ([kernel.kallsyms]) movq 0x70(%rax), %r14 CPU 0/KVM 1440 ffffffff82133cf5 __vmx_vcpu_run+0x55 ([kernel.kallsyms]) movq 0x78(%rax), %r15 CPU 0/KVM 1440 ffffffff82133cf9 __vmx_vcpu_run+0x59 ([kernel.kallsyms]) movq (%rax), %rax CPU 0/KVM 1440 ffffffff82133cfc __vmx_vcpu_run+0x5c ([kernel.kallsyms]) callq 0xffffffff82133c40 CPU 0/KVM 1440 ffffffff82133c40 vmx_vmenter+0x0 ([kernel.kallsyms]) jz 0xffffffff82133c46 CPU 0/KVM 1440 ffffffff82133c42 vmx_vmenter+0x2 ([kernel.kallsyms]) vmresume IPC: 0.11 (50/445) :1440 1440 ffffffffbb678b06 native_write_msr+0x6 ([guest.kernel.kallsyms]) nopl %eax, (%rax,%rax,1) :1440 1440 ffffffffbb678b0b native_write_msr+0xb ([guest.kernel.kallsyms]) retq IPC: 0.04 (2/41) :1440 1440 ffffffffbb666646 lapic_next_deadline+0x26 ([guest.kernel.kallsyms]) data16 nop :1440 1440 ffffffffbb666648 lapic_next_deadline+0x28 ([guest.kernel.kallsyms]) xor %eax, %eax :1440 1440 ffffffffbb66664a lapic_next_deadline+0x2a ([guest.kernel.kallsyms]) popq %rbp :1440 1440 ffffffffbb66664b lapic_next_deadline+0x2b ([guest.kernel.kallsyms]) retq IPC: 0.16 (4/25) :1440 1440 ffffffffbb74607f clockevents_program_event+0x8f ([guest.kernel.kallsyms]) test %eax, %eax :1440 1440 ffffffffbb746081 clockevents_program_event+0x91 ([guest.kernel.kallsyms]) jz 0xffffffffbb74603c IPC: 0.06 (2/30) :1440 1440 ffffffffbb74603c clockevents_program_event+0x4c ([guest.kernel.kallsyms]) popq %rbx :1440 1440 ffffffffbb74603d clockevents_program_event+0x4d ([guest.kernel.kallsyms]) popq %r12

Example using VM Time Correlation

Start VM

$ sudo virsh start kubuntu20.04 Domain kubuntu20.04 started

Mount the guest file system. Note sshfs needs -o direct_io to enable reading of proc files. root access is needed to read /proc/kcore.

$ mkdir -p vm0 $ sshfs -o direct_io root@vm0:/ vm0

Copy the guest /proc/kallsyms, /proc/modules and /proc/kcore

$ perf buildid-cache -v --kcore vm0/proc/kcore same kcore found in /home/user/.debug/[kernel.kcore]/cc9c55a98c5e4ec0aeda69302554aabed5cd6491/2021021312450777 $ KALLSYMS=/home/user/.debug/\[kernel.kcore\]/cc9c55a98c5e4ec0aeda69302554aabed5cd6491/2021021312450777/kallsyms

Find the VM process

$ ps -eLl | grep 'KVM\|PID' F S UID PID PPID LWP C PRI NI ADDR SZ WCHAN TTY TIME CMD 3 S 64055 16998 1 17005 13 80 0 - 1818189 - ? 00:00:16 CPU 0/KVM 3 S 64055 16998 1 17006 4 80 0 - 1818189 - ? 00:00:05 CPU 1/KVM 3 S 64055 16998 1 17007 3 80 0 - 1818189 - ? 00:00:04 CPU 2/KVM 3 S 64055 16998 1 17008 4 80 0 - 1818189 - ? 00:00:05 CPU 3/KVM

Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to stop. IPC can be determined, hence cyc=1 can be added. Only kernel decoding is supported, so k must be specified. Intel PT traces both the host and the guest so --guest and --host need to be specified.

$ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/cyc=1/k -p 16998 ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 9.041 MB perf.data.kvm ]

Now perf inject can be used to determine the VMX TCS Offset. Note, Intel PT TSC packets are only 7-bytes, so the TSC Offset might differ from the actual value in the 8th byte. That will have no effect i.e. the resulting timestamps will be correct anyway.

$ perf inject -i perf.data.kvm --vm-time-correlation=dry-run ERROR: Unknown TSC Offset for VMCS 0x1bff6a VMCS: 0x1bff6a TSC Offset 0xffffe42722c64c41 ERROR: Unknown TSC Offset for VMCS 0x1cbc08 VMCS: 0x1cbc08 TSC Offset 0xffffe42722c64c41 ERROR: Unknown TSC Offset for VMCS 0x1c3ce8 VMCS: 0x1c3ce8 TSC Offset 0xffffe42722c64c41 ERROR: Unknown TSC Offset for VMCS 0x1cbce9 VMCS: 0x1cbce9 TSC Offset 0xffffe42722c64c41

Each virtual CPU has a different Virtual Machine Control Structure (VMCS) shown above with the calculated TSC Offset. For an unchanging TSC Offset they should all be the same for the same virtual machine.

Now that the TSC Offset is known, it can be provided to perf inject

$ perf inject -i perf.data.kvm --vm-time-correlation="dry-run 0xffffe42722c64c41"

Note the options for perf inject --vm-time-correlation are:

[ dry-run ] [ <TSC Offset> [ : <VMCS> [ , <VMCS> ]... ] ]...

So it is possible to specify different TSC Offsets for different VMCS. The option "dry-run" will cause the file to be processed but without updating it. Note it is also possible to get a intel_pt.log file by adding option --itrace=d

There were no errors so, do it for real

$ perf inject -i perf.data.kvm --vm-time-correlation=0xffffe42722c64c41 --force

perf script can be used to see if there are any decoder errors

$ perf script -i perf.data.kvm --guestkallsyms $KALLSYMS --itrace=e-o

There were none.

perf script can be used to provide an instruction trace showing timestamps

$ perf script -i perf.data.kvm --guestkallsyms $KALLSYMS --insn-trace --xed -F+ipc | grep -C10 vmresume | head -21 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133cdd __vmx_vcpu_run+0x3d ([kernel.kallsyms]) movq 0x48(%rax), %r9 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133ce1 __vmx_vcpu_run+0x41 ([kernel.kallsyms]) movq 0x50(%rax), %r10 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133ce5 __vmx_vcpu_run+0x45 ([kernel.kallsyms]) movq 0x58(%rax), %r11 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133ce9 __vmx_vcpu_run+0x49 ([kernel.kallsyms]) movq 0x60(%rax), %r12 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133ced __vmx_vcpu_run+0x4d ([kernel.kallsyms]) movq 0x68(%rax), %r13 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133cf1 __vmx_vcpu_run+0x51 ([kernel.kallsyms]) movq 0x70(%rax), %r14 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133cf5 __vmx_vcpu_run+0x55 ([kernel.kallsyms]) movq 0x78(%rax), %r15 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133cf9 __vmx_vcpu_run+0x59 ([kernel.kallsyms]) movq (%rax), %rax CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133cfc __vmx_vcpu_run+0x5c ([kernel.kallsyms]) callq 0xffffffff82133c40 CPU 1/KVM 17006 [001] 11500.262865593: ffffffff82133c40 vmx_vmenter+0x0 ([kernel.kallsyms]) jz 0xffffffff82133c46 CPU 1/KVM 17006 [001] 11500.262866075: ffffffff82133c42 vmx_vmenter+0x2 ([kernel.kallsyms]) vmresume IPC: 0.05 (40/769) :17006 17006 [001] 11500.262869216: ffffffff82200cb0 asm_sysvec_apic_timer_interrupt+0x0 ([guest.kernel.kallsyms]) clac :17006 17006 [001] 11500.262869216: ffffffff82200cb3 asm_sysvec_apic_timer_interrupt+0x3 ([guest.kernel.kallsyms]) pushq $0xffffffffffffffff :17006 17006 [001] 11500.262869216: ffffffff82200cb5 asm_sysvec_apic_timer_interrupt+0x5 ([guest.kernel.kallsyms]) callq 0xffffffff82201160 :17006 17006 [001] 11500.262869216: ffffffff82201160 error_entry+0x0 ([guest.kernel.kallsyms]) cld :17006 17006 [001] 11500.262869216: ffffffff82201161 error_entry+0x1 ([guest.kernel.kallsyms]) pushq %rsi :17006 17006 [001] 11500.262869216: ffffffff82201162 error_entry+0x2 ([guest.kernel.kallsyms]) movq 0x8(%rsp), %rsi :17006 17006 [001] 11500.262869216: ffffffff82201167 error_entry+0x7 ([guest.kernel.kallsyms]) movq %rdi, 0x8(%rsp) :17006 17006 [001] 11500.262869216: ffffffff8220116c error_entry+0xc ([guest.kernel.kallsyms]) pushq %rdx :17006 17006 [001] 11500.262869216: ffffffff8220116d error_entry+0xd ([guest.kernel.kallsyms]) pushq %rcx :17006 17006 [001] 11500.262869216: ffffffff8220116e error_entry+0xe ([guest.kernel.kallsyms]) pushq %rax