`perf-intel-pt` ( 1 )

Support for Intel Processor Trace within perf tools

PERF SCRIPT

Format

By default, perf script will decode trace data found in the perf.data file. This can be further controlled by the --itrace option.

New --itrace option

Having no option is the same as

--itrace

which, in turn, is the same as

--itrace=cepwx

The letters are:

i    synthesize "instructions" events
b    synthesize "branches" events
x    synthesize "transactions" events
w    synthesize "ptwrite" events
p    synthesize "power" events (incl. PSB events)
c    synthesize branches events (calls only)
r    synthesize branches events (returns only)
e    synthesize tracing error events
d    create a debug log
g    synthesize a call chain (use with i or x)
G    synthesize a call chain on existing event records
l    synthesize last branch entries (use with i or x)
L    synthesize last branch entries on existing event records
s    skip initial number of events
q    quicker (less detailed) decoding
Z    prefer to ignore timestamps (so-called "timeless" decoding)
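As a reading aid, the letter table above can be turned into a small lookup. The following is a minimal sketch in Python, not perf's own option parser: the names ITRACE_LETTERS and describe_itrace are hypothetical, and the helper only accepts values made of plain letters (no periods, sizes or +/- flags).

```python
# Letter meanings copied from the list above. This is an illustration only,
# not the parser that perf itself uses.
ITRACE_LETTERS = {
    "i": 'synthesize "instructions" events',
    "b": 'synthesize "branches" events',
    "x": 'synthesize "transactions" events',
    "w": 'synthesize "ptwrite" events',
    "p": 'synthesize "power" events (incl. PSB events)',
    "c": "synthesize branches events (calls only)",
    "r": "synthesize branches events (returns only)",
    "e": "synthesize tracing error events",
    "d": "create a debug log",
    "g": "synthesize a call chain (use with i or x)",
    "G": "synthesize a call chain on existing event records",
    "l": "synthesize last branch entries (use with i or x)",
    "L": "synthesize last branch entries on existing event records",
    "s": "skip initial number of events",
    "q": "quicker (less detailed) decoding",
    "Z": 'prefer to ignore timestamps (so-called "timeless" decoding)',
}


def describe_itrace(letters):
    """Return (letter, meaning) pairs for a letters-only --itrace value."""
    unknown = [ch for ch in letters if ch not in ITRACE_LETTERS]
    if unknown:
        raise ValueError(f"unsupported characters: {unknown}")
    return [(ch, ITRACE_LETTERS[ch]) for ch in letters]


# Spell out the default value named above.
for letter, meaning in describe_itrace("cepwx"):
    print(letter, meaning)
```

Values that carry a period, a size or flags (for example i10us or e-o-l) are deliberately rejected by this sketch.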

"Instructions" events look like they were recorded by "perf record -e instructions".

"Branches" events look like they were recorded by "perf record -e branches". "c" and "r" can be combined to get calls and returns.

"Transactions" events correspond to the start or end of transactions. The flags field can be used in perf script to determine whether the event is a tranasaction start, commit or abort.

Note that "instructions", "branches" and "transactions" events depend on code flow packets which can be disabled by using the config term "branch=0". Refer to the config terms section above.

"ptwrite" events record the payload of the ptwrite instruction and whether "fup_on_ptw" was used. "ptwrite" events depend on PTWRITE packets which are recorded only if the "ptw" config term was used. Refer to the config terms section above. perf script "synth" field displays "ptwrite" information like this: "ip: 0 payload: 0x123456789abcdef0" where "ip" is 1 if "fup_on_ptw" was used.

"Power" events correspond to power event packets and CBR (core-to-bus ratio) packets. While CBR packets are always recorded when tracing is enabled, power event packets are recorded only if the "pwr_evt" config term was used. Refer to the config terms section above. The power events record information about C-state changes, whereas CBR is indicative of CPU frequency. perf script "event,synth" fields display information like this: cbr: cbr: 22 freq: 2189 MHz (200%) mwait: hints: 0x60 extensions: 0x1 pwre: hw: 0 cstate: 2 sub-cstate: 0 exstop: ip: 1 pwrx: deepest cstate: 2 last cstate: 2 wake reason: 0x4 Where: "cbr" includes the frequency and the percentage of maximum non-turbo "mwait" shows mwait hints and extensions "pwre" shows C-state transitions (to a C-state deeper than C0) and whether initiated by hardware "exstop" indicates execution stopped and whether the IP was recorded exactly, "pwrx" indicates return to C0 For more details refer to the Intel 64 and IA-32 Architectures Software Developer Manuals.

PSB events show when a PSB+ occurred and also the byte-offset in the trace. Emitting a PSB+ can cause a slight delay on the CPU. When doing timing analysis of code with Intel PT, it is useful to know if a timing bubble was caused by Intel PT or not.

Error events show where the decoder lost the trace. Error events are quite important. Users must know if what they are seeing is a complete picture or not. The "e" option may be followed by flags which affect what errors will or will not be reported. Each flag must be preceded by either + or -. The flags supported by Intel PT are:

    -o    Suppress overflow errors
    -l    Suppress trace data lost errors

For example, for errors but not overflow or data lost errors:

--itrace=e-o-l

The "d" option will cause the creation of a file "intel_pt.log" containing all decoded packets and instructions. Note that this option slows down the decoder and that the resulting file may be very large. The "d" option may be followed by flags which affect what debug messages will or will not be logged. Each flag must be preceded by either + or -. The flags supported by Intel PT are:

    -a    Suppress logging of perf events
    +a    Log all perf events

By default, logged perf events are filtered by any specified time ranges, but flag +a overrides that.

In addition, the period of the "instructions" event can be specified. e.g.

--itrace=i10us

sets the period to 10us i.e. one instruction sample is synthesized for each 10 microseconds of trace. Alternatives to "us" are "ms" (milliseconds), "ns" (nanoseconds), "t" (TSC ticks) or "i" (instructions).

"ms", "us" and "ns" are converted to TSC ticks.

The timing information included with Intel PT does not give the time of every instruction. Consequently, for the purpose of sampling, the decoder estimates the time since the last timing packet based on 1 tick per instruction. The time on the sample is not adjusted and reflects the last known value of TSC.

For Intel PT, the default period is 100us.
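To give a feel for the conversion of "ms", "us" and "ns" periods to TSC ticks, here is a minimal Python sketch. It assumes a known, constant TSC frequency passed in as tsc_hz; that parameter and the function name period_to_tsc_ticks are hypothetical, and the conversion the tools actually perform may differ in detail.

```python
# Nanoseconds per unit for the time-based period suffixes.
UNIT_NS = {"ms": 1_000_000, "us": 1_000, "ns": 1}


def period_to_tsc_ticks(value, unit, tsc_hz):
    """Convert a time-based period to TSC ticks for an assumed TSC rate."""
    if unit not in UNIT_NS:
        raise ValueError(f"not a time unit: {unit!r}")
    nanoseconds = value * UNIT_NS[unit]
    # Integer arithmetic avoids floating-point rounding.
    return nanoseconds * tsc_hz // 1_000_000_000


# The default period of 100us on an assumed 2 GHz TSC.
print(period_to_tsc_ticks(100, "us", 2_000_000_000))
```

With these assumptions, a 100us period corresponds to 200000 ticks. Since the decoder estimates 1 tick per instruction between timing packets, that is also roughly the number of instructions between samples when timing packets are sparse.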

Setting it to a zero period means "as often as possible".

In the case of Intel PT that is the same as a period of 1 and a unit of instructions (i.e. --itrace=i1i).

Also the call chain size (default 16, max. 1024) for instructions or transactions events can be specified. e.g.

--itrace=ig32
--itrace=xg32

Also the number of last branch entries (default 64, max. 1024) for instructions or transactions events can be specified. e.g.

--itrace=il10
--itrace=xl10

Note that last branch entries are cleared for each sample, so there is no overlap from one sample to the next.

The G and L options are designed in particular for sample mode, and work much like g and l but add call chain and branch stack to the other selected events instead of synthesized events. For example, to record branch-misses events for ls and then add a call chain derived from the Intel PT trace:

perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' -- ls
perf report --itrace=Ge

Although in fact G is a default for perf report, so that is the same as just:

perf report

One caveat with the G and L options is that they work poorly with "Large PEBS". Large PEBS means PEBS records will be accumulated by hardware and then written into the event buffer in one go. That reduces interrupts, but can give very late timestamps. Because the Intel PT trace is synchronized by timestamps, the PEBS events do not match the trace. Currently, Large PEBS is used only in certain circumstances:

    - hardware supports it
    - PEBS is used
    - event period is specified, instead of frequency
    - the sample type is limited to the following flags:
        PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR |
        PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID |
        PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER |
        PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR |
        PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER |
        PERF_SAMPLE_PERIOD (and sometimes) | PERF_SAMPLE_TIME

Because Intel PT sample mode uses a different sample type to the list above, Large PEBS is not used with Intel PT sample mode. To avoid Large PEBS in other cases, avoid specifying the event period i.e. avoid the perf record -c option, --count option, or period config term.
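The sample type condition can be expressed as a bit mask test. The Python sketch below covers only that one condition and ignores the hardware, PEBS and period requirements. The bit values are assumed to match enum perf_event_sample_format in the Linux uapi header linux/perf_event.h and should be checked against the headers in use. The function name and the time_allowed parameter are hypothetical, and the use of PERF_SAMPLE_AUX to stand in for Intel PT sample mode is likewise an assumption.

```python
# Assumed bit values of enum perf_event_sample_format; verify against
# linux/perf_event.h before relying on them.
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_TIME = 1 << 2
PERF_SAMPLE_ADDR = 1 << 3
PERF_SAMPLE_CALLCHAIN = 1 << 5
PERF_SAMPLE_ID = 1 << 6
PERF_SAMPLE_CPU = 1 << 7
PERF_SAMPLE_PERIOD = 1 << 8
PERF_SAMPLE_STREAM_ID = 1 << 9
PERF_SAMPLE_REGS_USER = 1 << 12
PERF_SAMPLE_DATA_SRC = 1 << 15
PERF_SAMPLE_IDENTIFIER = 1 << 16
PERF_SAMPLE_TRANSACTION = 1 << 17
PERF_SAMPLE_REGS_INTR = 1 << 18
PERF_SAMPLE_PHYS_ADDR = 1 << 19
PERF_SAMPLE_AUX = 1 << 20

# The flags listed above as compatible with Large PEBS.
LARGE_PEBS_FLAGS = (
    PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR
    | PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID
    | PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER
    | PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR
    | PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER
    | PERF_SAMPLE_PERIOD
)


def sample_type_allows_large_pebs(sample_type, time_allowed=False):
    """True if sample_type sets no flag outside the listed set."""
    allowed = LARGE_PEBS_FLAGS
    if time_allowed:  # PERF_SAMPLE_TIME is only "sometimes" acceptable
        allowed |= PERF_SAMPLE_TIME
    return (sample_type & ~allowed) == 0


print(sample_type_allows_large_pebs(PERF_SAMPLE_IP | PERF_SAMPLE_TID))
print(sample_type_allows_large_pebs(PERF_SAMPLE_IP | PERF_SAMPLE_AUX))
```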

To disable trace decoding entirely, use the option --no-itrace.

It is also possible to skip events generated (instructions, branches, transactions) at the beginning. This is useful to ignore initialization code.

--itrace=i0nss1000000

skips the first million instructions.
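The example value reads as three parts: the "i" letter, a period of "0ns", and "s1000000" for the skip count. The Python sketch below composes such values when scripting perf invocations. The helper name instructions_itrace is hypothetical; it only builds a string and does not run perf.

```python
# Period units accepted after the "i" letter, as listed earlier in this section.
PERIOD_UNITS = ("ms", "us", "ns", "t", "i")


def instructions_itrace(period, unit, skip=0):
    """Build an --itrace value for "instructions" events with an optional skip."""
    if unit not in PERIOD_UNITS:
        raise ValueError(f"unknown period unit: {unit!r}")
    value = f"i{period}{unit}"
    if skip:
        # "s" followed by the number of initial events to skip.
        value += f"s{skip}"
    return f"--itrace={value}"


print(instructions_itrace(0, "ns", skip=1000000))
print(instructions_itrace(10, "us"))
```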

The q option changes the way the trace is decoded. The decoding is much faster but much less detailed. Specifically, with the q option, the decoder does not decode TNT packets, and does not walk object code, but gets the ip from FUP and TIP packets. The q option can be used with the b and i options but the period is not used. The q option decodes more quickly, but is useful only if the control flow of interest is represented or indicated by FUP, TIP, TIP.PGE, or TIP.PGD packets (refer below). However the q option could be used to find time ranges that could then be decoded fully using the --time option.

What will not be decoded with the (single) q option:

• direct calls and jmps

• conditional branches

• non-branch instructions

What will be decoded with the (single) q option:

• asynchronous branches such as interrupts

• indirect branches

• function return target address if the noretcomp config term (refer config terms section) was used

• start of (control-flow) tracing

• end of (control-flow) tracing, if it is not out of context

• power events, ptwrite, transaction start and abort

• instruction pointer associated with PSB packets

Note the q option does not specify what events will be synthesized e.g. the p option must be used also to show power events.

Repeating the q option (double-q i.e. qq) results in even faster decoding and even less detail. The decoder decodes only extended PSB (PSB+) packets, getting the instruction pointer if there is a FUP packet within PSB+ (i.e. between PSB and PSBEND). Note PSB packets occur regularly in the trace based on the psb_period config term (refer config terms section). There will be a FUP packet if the PSB+ occurs while control flow is being traced.

What will not be decoded with the qq option:

• everything except instruction pointer associated with PSB packets

What will be decoded with the qq option:

• instruction pointer associated with PSB packets

The Z option is equivalent to having recorded a trace without TSC (i.e. config term tsc=0). It can be useful to avoid timestamp issues when decoding a trace of a virtual machine.

dump option

perf script has an option (-D) to "dump" the events i.e. display the binary data.

When -D is used, Intel PT packets are displayed. The packet decoder does not pay attention to PSB packets, but just decodes the bytes - so the packets seen by the actual decoder may not be identical in places where the data is corrupt. One example of that would be when the buffer-switching interrupt has been too slow, and the buffer has been filled completely. In that case, the last packet in the buffer might be truncated and immediately followed by a PSB as the trace continues in the next buffer.

To disable the display of Intel PT packets, combine the -D option with --no-itrace.

Source text at man7.org
