--callgrind-out-file=<file>
Write the profile data to file rather than to the default
output file, callgrind.out.<pid>. The %p and %q format
specifiers can be used to embed the process ID and/or the
contents of an environment variable in the name, as is the
case for the core option --log-file. When multiple dumps are
made, the file name is modified further; see below.
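As a rough illustration of what the format specifiers produce, the following Python sketch imitates the expansion. It is not Valgrind's implementation: the helper name expand_out_file is hypothetical, the %q{VAR} spelling is assumed to match the core --log-file option, and only %p and %q{VAR} are handled.

```python
import re


def expand_out_file(template, pid, env):
    """Imitate %p and %q{VAR} expansion in an output file name.

    Illustrative only: Valgrind performs this expansion itself.
    A variable missing from `env` raises KeyError in this sketch.
    """
    def substitute(match):
        if match.group(0) == "%p":
            return str(pid)
        # %q{VAR}: group 1 holds the environment variable name.
        return env[match.group(1)]

    return re.sub(r"%p|%q\{(\w+)\}", substitute, template)


# JOB is a made-up environment variable used only for this example.
print(expand_out_file("callgrind.%q{JOB}.%p.out", 4242, {"JOB": "nightly"}))
```

With the made-up inputs above, the sketch prints callgrind.nightly.4242.out.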
--dump-line=<no|yes> [default: yes]
This specifies that event counting should be performed at
source line granularity. This allows source annotation for
sources which are compiled with debug information (-g).
--dump-instr=<no|yes> [default: no]
This specifies that event counting should be performed at
per-instruction granularity. This allows for assembly code
annotation. Currently the results can only be displayed by
KCachegrind.
--compress-strings=<no|yes> [default: yes]
This option influences the output format of the profile data.
It specifies whether strings (file and function names) should
be identified by numbers. This shrinks the file, but makes it
more difficult for humans to read (which is not recommended
in any case).
--compress-pos=<no|yes> [default: yes]
This option influences the output format of the profile data.
It specifies whether numerical positions are always specified
as absolute values or are allowed to be relative to previous
numbers. This shrinks the file size.
--combine-dumps=<no|yes> [default: no]
When enabled and multiple profile data parts are generated,
these parts are appended to the same output file.
Not recommended.
--dump-every-bb=<count> [default: 0, never]
Dump profile data every count basic blocks. Whether a dump is
needed is only checked when Valgrind's internal scheduler is
run. Therefore, the minimum useful setting is about 100000.
The count is a 64-bit value to make long dump periods
possible.
--dump-before=<function>
Dump when entering function.
--zero-before=<function>
Zero all costs when entering function.
--dump-after=<function>
Dump when leaving function.
--instr-atstart=<yes|no> [default: yes]
Specify if you want Callgrind to start simulation and
profiling from the beginning of the program. When set to no,
Callgrind will not be able to collect any information,
including calls, but it will have at most a slowdown of
around 4, which is the minimum Valgrind overhead.
Instrumentation can be interactively enabled via
callgrind_control -i on.
Note that the resulting call graph will most probably not
contain main, but will contain all the functions executed
after instrumentation was enabled. Instrumentation can also
be programmatically enabled/disabled. See the Callgrind
include file callgrind.h for the macro you have to use in
your source code.
For cache simulation, results will be less accurate when
switching on instrumentation later in the program run, as the
simulator starts with an empty cache at that moment. To reduce
this error, switch on event collection only some time after
instrumentation has been enabled, so that the simulated cache
has had a chance to fill.
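The interactive workflow described above can be sketched as two command lines, assembled here as Python argument lists. The helper names and the program path ./myprog are hypothetical; the sketch only builds the argument lists and leaves running them (for example with subprocess.run) to the caller.

```python
def start_without_instrumentation(program_argv):
    """Command line that starts a program under Callgrind with
    instrumentation switched off (--instr-atstart=no)."""
    return ["valgrind", "--tool=callgrind", "--instr-atstart=no", *program_argv]


def enable_instrumentation():
    """Command line that switches instrumentation on interactively."""
    return ["callgrind_control", "-i", "on"]


# ./myprog is a placeholder for the program being profiled.
print(" ".join(start_without_instrumentation(["./myprog", "--input", "data.txt"])))
print(" ".join(enable_instrumentation()))
```

The second command would be issued from another terminal once the program has reached the phase of interest.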
--collect-atstart=<yes|no> [default: yes]
Specify whether event collection is enabled at the beginning
of the profile run.
To only look at parts of your program, you have two
possibilities:
1. Zero event counters before entering the program part you
want to profile, and dump the event counters to a file
after leaving that program part.
2. Switch on/off collection state as needed to only see
event counters happening while inside of the program part
you want to profile.
The second option can be used if the program part you want to
profile is called many times. Option 1, i.e. creating a lot
of dumps, is not practical here.
Collection state can be toggled at entry and exit of a given
function with the option --toggle-collect. If you use this
option, collection state should be disabled at the beginning.
Note that the specification of --toggle-collect implicitly
sets --collect-atstart=no.
Collection state can also be toggled by inserting the client
request CALLGRIND_TOGGLE_COLLECT; at the needed code
positions.
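The two possibilities above map onto different option combinations. The following Python sketch assembles a command line for each; the helper name profile_part_argv, the function name hot_loop and the program path ./myprog are made up for illustration.

```python
def profile_part_argv(function, program_argv, called_many_times):
    """Build a Callgrind command line that restricts profiling to `function`.

    called_many_times=False: zero the counters on entry and dump on exit
                             (possibility 1, one dump per call).
    called_many_times=True:  toggle the collection state on entry and exit
                             (possibility 2, no extra dumps).
    """
    base = ["valgrind", "--tool=callgrind"]
    if called_many_times:
        region = [f"--toggle-collect={function}"]
    else:
        region = [f"--zero-before={function}", f"--dump-after={function}"]
    return base + region + list(program_argv)


# hot_loop and ./myprog are placeholders.
print(" ".join(profile_part_argv("hot_loop", ["./myprog"], called_many_times=True)))
```

Passing called_many_times=False yields the --zero-before/--dump-after pair instead.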
--toggle-collect=<function>
Toggle collection on entry/exit of function.
--collect-jumps=<no|yes> [default: no]
This specifies whether information for (conditional) jumps
should be collected. As above, callgrind_annotate currently
is not able to show you the data. You have to use KCachegrind
to get jump arrows in the annotated code.
--collect-systime=<no|yes|msec|usec|nsec> [default: no]
This specifies whether information for system call times
should be collected.
The value no means that no system call information is recorded.
The other values record the number of system calls done
(sysCount event) and the elapsed time (sysTime event) spent in
system calls. The --collect-systime value gives the unit used
for sysTime: milliseconds, microseconds or nanoseconds. With
the value nsec, Callgrind also records the CPU time spent
during system calls (sysCpuTime). The value yes is a synonym
of msec. The value nsec is not supported on Darwin.
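Because the unit of sysTime depends on the option value, a reported count has to be scaled accordingly when comparing runs. A minimal Python sketch of that conversion follows; the helper name systime_to_seconds and the sample values are hypothetical.

```python
# Units of sysTime per second for each --collect-systime value.
# "yes" is a synonym of "msec".
UNITS_PER_SECOND = {
    "yes": 1_000,
    "msec": 1_000,
    "usec": 1_000_000,
    "nsec": 1_000_000_000,
}


def systime_to_seconds(sys_time, collect_systime):
    """Convert a sysTime event count to seconds for the given option value."""
    if collect_systime not in UNITS_PER_SECOND:
        # Covers "no" (nothing is recorded) and unknown values.
        raise ValueError(f"no sysTime unit for --collect-systime={collect_systime}")
    return sys_time / UNITS_PER_SECOND[collect_systime]


# Sample value chosen only for illustration.
print(systime_to_seconds(1500, "msec"))
```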
--collect-bus=<no|yes> [default: no]
This specifies whether the number of global bus events
executed should be collected. The event type "Ge" is used for
these events.
--cache-sim=<yes|no> [default: no]
Specify if you want to do full cache simulation. By default,
only instruction read accesses will be counted ("Ir"). With
cache simulation, further event counters are enabled: Cache
misses on instruction reads ("I1mr"/"ILmr"), data read
accesses ("Dr") and related cache misses ("D1mr"/"DLmr"),
data write accesses ("Dw") and related cache misses
("D1mw"/"DLmw"). For more information, see Cachegrind: a
cache and branch-prediction profiler.
--branch-sim=<yes|no> [default: no]
Specify if you want to do branch prediction simulation.
Further event counters are enabled: Number of executed
conditional branches and related predictor misses
("Bc"/"Bcm"), executed indirect jumps and related misses of
the jump address predictor ("Bi"/"Bim").
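The branch counters are often easier to read as ratios. The following Python sketch derives misprediction rates from the four event totals; the rates are plain quotients, not something Callgrind reports itself, and the helper name and the counts are made up for illustration.

```python
def branch_miss_rates(events):
    """Derive misprediction rates from Bc/Bcm/Bi/Bim event totals.

    `events` maps event names to counts. A rate is 0.0 when no
    branches of that kind were executed.
    """
    def ratio(misses, total):
        return misses / total if total else 0.0

    bc, bcm = events.get("Bc", 0), events.get("Bcm", 0)
    bi, bim = events.get("Bi", 0), events.get("Bim", 0)
    return {
        "conditional": ratio(bcm, bc),
        "indirect": ratio(bim, bi),
        "overall": ratio(bcm + bim, bc + bi),
    }


# Hypothetical totals, not measured data.
print(branch_miss_rates({"Bc": 1000, "Bcm": 50, "Bi": 200, "Bim": 10}))
```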