GNU project C and C++ compiler
Options detail
Control Optimization - 1
These options control various sorts of optimizations.
Without any optimization option, the compiler's goal is to reduce
the cost of compilation and to make debugging produce the
expected results. Statements are independent: if you stop the
program with a breakpoint between statements, you can then assign
a new value to any variable or change the program counter to any
other statement in the function and get exactly the results you
expect from the source code.
Turning on optimization flags makes the compiler attempt to
improve the performance and/or code size at the expense of
compilation time and possibly the ability to debug the program.
The compiler performs optimization based on the knowledge it has
of the program. Compiling multiple files at once to a single
output file allows the compiler to use information gained
from all of the files when compiling each of them.
Not all optimizations are controlled directly by a flag. Only
optimizations that have a flag are listed in this section.
Most optimizations are completely disabled at -O0 or if an -O
level is not set on the command line, even if individual
optimization flags are specified. Similarly, -Og suppresses many
optimization passes.
Depending on the target and how GCC was configured, a slightly
different set of optimizations may be enabled at each -O level
than those listed here. You can invoke GCC with
-Q --help=optimizers to find out the exact set of optimizations
that are enabled at each level.
-O
-O1
Optimize. Optimizing compilation takes somewhat more time,
and a lot more memory for a large function.
With -O, the compiler tries to reduce code size and execution
time, without performing any optimizations that take a great
deal of compilation time.
-O turns on the following optimization flags:
-fauto-inc-dec -fbranch-count-reg -fcombine-stack-adjustments
-fcompare-elim -fcprop-registers -fdce -fdefer-pop
-fdelayed-branch -fdse -fforward-propagate
-fguess-branch-probability -fif-conversion -fif-conversion2
-finline-functions-called-once -fipa-profile -fipa-pure-const
-fipa-reference -fipa-reference-addressable -fmerge-constants
-fmove-loop-invariants -fomit-frame-pointer -freorder-blocks
-fshrink-wrap -fshrink-wrap-separate -fsplit-wide-types
-fssa-backprop -fssa-phiopt -ftree-bit-ccp -ftree-ccp
-ftree-ch -ftree-coalesce-vars -ftree-copy-prop -ftree-dce
-ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
-ftree-phiprop -ftree-pta -ftree-scev-cprop -ftree-sink
-ftree-slsr -ftree-sra -ftree-ter -funit-at-a-time
-O2
Optimize even more. GCC performs nearly all supported
optimizations that do not involve a space-speed tradeoff. As
compared to -O, this option increases both compilation time
and the performance of the generated code.
-O2 turns on all optimization flags specified by -O. It also
turns on the following optimization flags:
-falign-functions -falign-jumps -falign-labels
-falign-loops -fcaller-saves -fcode-hoisting -fcrossjumping
-fcse-follow-jumps -fcse-skip-blocks
-fdelete-null-pointer-checks -fdevirtualize
-fdevirtualize-speculatively -fexpensive-optimizations -fgcse
-fgcse-lm -fhoist-adjacent-loads -finline-small-functions
-findirect-inlining -fipa-bit-cp -fipa-cp -fipa-icf
-fipa-ra -fipa-sra -fipa-vrp
-fisolate-erroneous-paths-dereference -flra-remat
-foptimize-sibling-calls -foptimize-strlen -fpartial-inlining
-fpeephole2 -freorder-blocks-algorithm=stc
-freorder-blocks-and-partition -freorder-functions
-frerun-cse-after-loop -fschedule-insns -fschedule-insns2
-fsched-interblock -fsched-spec -fstore-merging
-fstrict-aliasing -fthread-jumps -ftree-builtin-call-dce
-ftree-pre -ftree-switch-conversion -ftree-tail-merge
-ftree-vrp
Please note the warning under -fgcse about invoking -O2 on
programs that use computed gotos.
-O3
Optimize yet more. -O3 turns on all optimizations specified
by -O2 and also turns on the following optimization flags:
-fgcse-after-reload -finline-functions -fipa-cp-clone
-floop-interchange -floop-unroll-and-jam -fpeel-loops
-fpredictive-commoning -fsplit-paths
-ftree-loop-distribute-patterns -ftree-loop-distribution
-ftree-loop-vectorize -ftree-partial-pre -ftree-slp-vectorize
-funswitch-loops -fvect-cost-model
-fversion-loops-for-strides
-O0
Reduce compilation time and make debugging produce the
expected results. This is the default.
-Os
Optimize for size. -Os enables all -O2 optimizations except
those that often increase code size:
-falign-functions -falign-jumps -falign-labels
-falign-loops -fprefetch-loop-arrays
-freorder-blocks-algorithm=stc
It also enables -finline-functions, causes the compiler to
tune for code size rather than execution speed, and performs
further optimizations designed to reduce code size.
-Ofast
Disregard strict standards compliance. -Ofast enables all
-O3 optimizations. It also enables optimizations that are
not valid for all standard-compliant programs. It turns on
-ffast-math and the Fortran-specific -fstack-arrays, unless
-fmax-stack-var-size is specified, and -fno-protect-parens.
-Og
Optimize debugging experience. -Og should be the
optimization level of choice for the standard
edit-compile-debug cycle, offering a reasonable level of
optimization while maintaining fast compilation and a good
debugging experience. It is a better choice than -O0 for
producing debuggable code because some compiler passes that
collect debug information are disabled at -O0.
Like -O0, -Og completely disables a number of optimization
passes so that individual options controlling them have no
effect. Otherwise -Og enables all -O1 optimization flags
except for those that may interfere with debugging:
-fbranch-count-reg -fdelayed-branch -fif-conversion
-fif-conversion2 -finline-functions-called-once
-fmove-loop-invariants -fssa-phiopt -ftree-bit-ccp
-ftree-pta -ftree-sra
If you use multiple -O options, with or without level numbers,
the last such option is the one that is effective.
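For instance, a command line containing both -O2 and -O0, in that order, compiles without optimization. One way to see which kind of level ended up in effect is to test the macros __OPTIMIZE__ and __OPTIMIZE_SIZE__, which the GNU C preprocessor predefines. The sketch below is only an illustration; the function name optimization_summary is made up for this example.

```c
/* Report how this translation unit was compiled, based on macros
   that GCC predefines. The function name is hypothetical. */
const char *optimization_summary(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "optimizing for size";   /* e.g. the effective level is -Os */
#elif defined(__OPTIMIZE__)
    return "optimizing";            /* some optimizing level is in effect */
#else
    return "not optimizing";        /* e.g. -O0, or no -O option at all */
#endif
}
```

Compiling the same file with different orderings of -O options and printing the returned string shows that only the last one counts.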
Options of the form -fflag specify machine-independent flags.
Most flags have both positive and negative forms; the negative
form of -ffoo is -fno-foo. In the table below, only one of the
forms is listed: the one you typically use. You can figure out
the other form by either removing no- or adding it.
The following options control specific optimizations. They are
either activated by -O options or are related to ones that are.
You can use the following flags in the rare cases when
"fine-tuning" of optimizations to be performed is desired.
-fno-defer-pop
For machines that must pop arguments after a function call,
always pop the arguments as soon as each function returns.
At levels -O1 and higher, -fdefer-pop is the default; this
allows the compiler to let arguments accumulate on the stack
for several function calls and pop them all at once.
-fforward-propagate
Perform a forward propagation pass on RTL. The pass tries to
combine two instructions and checks if the result can be
simplified. If loop unrolling is active, two passes are
performed and the second is scheduled after loop unrolling.
This option is enabled by default at optimization levels -O,
-O2, -O3, -Os.
-ffp-contract=style
-ffp-contract=off disables floating-point expression
contraction. -ffp-contract=fast enables floating-point
expression contraction, such as forming fused multiply-add
operations, if the target has native support for them.
-ffp-contract=on enables floating-point expression
contraction if allowed by the language standard. This is
currently not implemented and is treated the same as
-ffp-contract=off.
The default is -ffp-contract=fast.
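As an illustration, the expression below is the typical candidate for contraction. With -ffp-contract=fast on a target that has a fused multiply-add instruction, the compiler may evaluate it with one rounding step instead of two, so the low-order bits of the result can differ from the -ffp-contract=off result for some inputs. The function name mul_add is an assumption made for this sketch.

```c
/* A multiply followed by an add: the pattern that floating-point
   contraction may turn into a single fused multiply-add.
   Whether that happens depends on the target and on -ffp-contract. */
double mul_add(double a, double b, double c)
{
    return a * b + c;
}
```

For inputs whose product is exactly representable, such as small integers, the fused and unfused forms give the same value.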
-fomit-frame-pointer
Omit the frame pointer in functions that don't need one.
This avoids the instructions to save, set up and restore the
frame pointer; on many targets it also makes an extra
register available.
On some targets this flag has no effect because the standard
calling sequence always uses a frame pointer, so it cannot be
omitted.
Note that -fno-omit-frame-pointer doesn't guarantee the frame
pointer is used in all functions. Several targets always
omit the frame pointer in leaf functions.
Enabled by default at -O and higher.
-foptimize-sibling-calls
Optimize sibling and tail recursive calls.
Enabled at levels -O2, -O3, -Os.
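A minimal sketch of the kind of code this option targets follows. In sum_to the recursive call is the last action of the function, so the compiler may replace it with a jump and reuse the current stack frame; sum_first ends in a call whose result is returned unchanged, a candidate sibling call. Both function names are invented for this example, and whether the calls are actually converted depends on the target and the options in effect.

```c
/* Tail-recursive helper: nothing remains to be done after the
   recursive call returns, so it can become a jump. */
static unsigned long sum_to(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to(n - 1, acc + n);
}

/* The call below is in tail position as well: its result is
   returned as-is. */
unsigned long sum_first(unsigned long n)
{
    return sum_to(n, 0);
}
```

Without the optimization, each recursive step consumes stack space, so very large arguments can overflow the stack.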
-foptimize-strlen
Optimize various standard C string functions (e.g. "strlen",
"strchr" or "strcpy") and their "_FORTIFY_SOURCE"
counterparts into faster alternatives.
Enabled at levels -O2, -O3.
-fno-inline
Do not expand any functions inline apart from those marked
with the "always_inline" attribute. This is the default when
not optimizing.
Single functions can be exempted from inlining by marking
them with the "noinline" attribute.
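The two attributes mentioned above can be sketched as follows, using the GNU __attribute__ syntax. The function names are made up for this illustration.

```c
/* always_inline: expanded inline even under -fno-inline. */
static inline __attribute__((always_inline)) int square(int x)
{
    return x * x;
}

/* noinline: exempted from inlining at every optimization level. */
static __attribute__((noinline)) int cube(int x)
{
    return x * square(x);
}

int sum_of_powers(int x)
{
    return square(x) + cube(x);
}
```

Inspecting the assembler output (for example with -S) should show a real call to cube but no call to square.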
-finline-small-functions
Integrate functions into their callers when their body is
smaller than expected function call code (so overall size of
program gets smaller). The compiler heuristically decides
which functions are simple enough to be worth integrating in
this way. This inlining applies to all functions, even those
not declared inline.
Enabled at levels -O2, -O3, -Os.
-findirect-inlining
Inline also indirect calls that are discovered to be known at
compile time thanks to previous inlining. This option has
an effect only when inlining itself is turned on by the
-finline-functions or -finline-small-functions options.
Enabled at levels -O2, -O3, -Os.
-finline-functions
Consider all functions for inlining, even if they are not
declared inline. The compiler heuristically decides which
functions are worth integrating in this way.
If all calls to a given function are integrated, and the
function is declared "static", then the function is normally
not output as assembler code in its own right.
Enabled at levels -O3, -Os. Also enabled by -fprofile-use
and -fauto-profile.
-finline-functions-called-once
Consider all "static" functions called once for inlining into
their caller even if they are not marked "inline". If a call
to a given function is integrated, then the function is not
output as assembler code in its own right.
Enabled at levels -O1, -O2, -O3 and -Os, but not -Og.
-fearly-inlining
Inline functions marked by "always_inline" and functions
whose body seems smaller than the function call overhead
early, before doing -fprofile-generate instrumentation and
the real inlining pass. Doing so makes profiling significantly
cheaper and usually makes inlining faster on programs that
have large chains of nested wrapper functions.
Enabled by default.
-fipa-sra
Perform interprocedural scalar replacement of aggregates,
removal of unused parameters and replacement of parameters
passed by reference by parameters passed by value.
Enabled at levels -O2, -O3 and -Os.
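The following sketch shows a function shaped like the ones this pass looks for: a static function that receives an aggregate by reference but reads only one field, plus a parameter it never uses. The compiler may rewrite it internally to take just the needed value. The names struct point, scaled_x and demo_scaled_x are assumptions for this example, and whether the transformation fires depends on the compiler's heuristics.

```c
struct point { int x; int y; int z; };

/* Only p->x is read, and 'unused' plays no role in the result.
   An interprocedural pass may pass p->x by value and drop 'unused'. */
static int scaled_x(const struct point *p, int factor, int unused)
{
    (void)unused;
    return p->x * factor;
}

int demo_scaled_x(void)
{
    struct point p = { 4, 5, 6 };
    return scaled_x(&p, 3, 99);
}
```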
-finline-limit=n
By default, GCC limits the size of functions that can be
inlined. This flag allows coarse control of this limit. n
is the size of functions that can be inlined in number of
pseudo instructions.
Inlining is actually controlled by a number of parameters,
which may be specified individually by using
--param name=value. The -finline-limit=n option sets some of
these parameters as follows:
max-inline-insns-single
is set to n/2.
max-inline-insns-auto
is set to n/2.
See below for documentation of the individual parameters
controlling inlining and for the defaults of these
parameters.
Note: there may be no value to -finline-limit that results in
default behavior.
Note: a pseudo instruction represents, in this particular
context, an abstract measurement of a function's size. In no
way does it represent a count of assembly instructions and as
such its exact meaning might change from one release to
another.
-fno-keep-inline-dllexport
This is a more fine-grained version of
-fkeep-inline-functions, which applies only to functions that
are declared using the "dllexport" attribute or declspec.
-fkeep-inline-functions
In C, emit "static" functions that are declared "inline" into
the object file, even if the function has been inlined into
all of its callers. This switch does not affect functions
using the "extern inline" extension in GNU C90. In C++, emit
any and all inline functions into the object file.
-fkeep-static-functions
Emit "static" functions into the object file, even if the
function is never used.
-fkeep-static-consts
Emit variables declared "static const" when optimization
isn't turned on, even if the variables aren't referenced.
GCC enables this option by default. If you want to force the
compiler to check if a variable is referenced, regardless of
whether or not optimization is turned on, use the
-fno-keep-static-consts option.
-fmerge-constants
Attempt to merge identical constants (string constants and
floating-point constants) across compilation units.
This option is the default for optimized compilation if the
assembler and linker support it. Use -fno-merge-constants to
inhibit this behavior.
Enabled at levels -O, -O2, -O3, -Os.
-fmerge-all-constants
Attempt to merge identical constants and identical variables.
This option implies -fmerge-constants. In addition to
-fmerge-constants this considers e.g. even constant
initialized arrays or initialized constant variables with
integral or floating-point types. Languages like C or C++
require each variable, including multiple instances of the
same variable in recursive calls, to have distinct locations,
so using this option results in non-conforming behavior.
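To make the conformance issue concrete, here is a small sketch with two constant arrays that have identical contents. In conforming C they are distinct objects, so their addresses compare unequal; under -fmerge-all-constants the compiler is allowed to give them the same storage, and code that relies on the distinction is no longer guaranteed to work. The identifiers are invented for this example.

```c
static const int table_a[3] = { 1, 2, 3 };
static const int table_b[3] = { 1, 2, 3 };

/* Returns 1 when the two arrays occupy distinct storage, which is
   what the language standard requires for distinct objects. */
int tables_are_distinct(void)
{
    return (const void *)table_a != (const void *)table_b;
}
```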
-fmodulo-sched
Perform swing modulo scheduling immediately before the first
scheduling pass. This pass looks at innermost loops and
reorders their instructions by overlapping different
iterations.
-fmodulo-sched-allow-regmoves
Perform more aggressive SMS-based modulo scheduling with
register moves allowed. By setting this flag certain
anti-dependence edges are deleted, which triggers the
generation of reg-moves based on the life-range analysis.
This option is effective only with -fmodulo-sched enabled.
-fno-branch-count-reg
Disable the optimization pass that scans for opportunities to
use "decrement and branch" instructions on a count register
instead of instruction sequences that decrement a register,
compare it against zero, and then branch based upon the
result. This option is only meaningful on architectures that
support such instructions, which include x86, PowerPC, IA-64
and S/390. Note that the -fno-branch-count-reg option
doesn't remove the decrement and branch instructions from the
generated instruction stream introduced by other optimization
passes.
The default is -fbranch-count-reg at -O1 and higher, except
for -Og.
-fno-function-cse
Do not put function addresses in registers; make each
instruction that calls a constant function contain the
function's address explicitly.
This option results in less efficient code, but some strange
hacks that alter the assembler output may be confused by the
optimizations performed when this option is not used.
The default is -ffunction-cse.
-fno-zero-initialized-in-bss
If the target supports a BSS section, GCC by default puts
variables that are initialized to zero into BSS. This can
save space in the resulting code.
This option turns off this behavior because some programs
explicitly rely on variables going to the data section, e.g.,
so that the resulting executable can find the beginning of
that section and/or make assumptions based on that.
The default is -fzero-initialized-in-bss.
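As a sketch of the difference, the first array below is initialized to zero and is therefore a candidate for BSS by default, while the second has a nonzero initializer and has to be stored in the data section. With -fno-zero-initialized-in-bss both would go to the data section. The variable and function names are hypothetical; a tool such as size or objdump can be used to compare the section sizes of the resulting object files.

```c
/* Zero-initialized: by default eligible for BSS, which takes no
   space in the object file. */
int zero_buffer[1024] = { 0 };

/* Nonzero initializer: stored in the initialized data section. */
int nonzero_buffer[1024] = { 1 };

/* The observable values are the same wherever the arrays live. */
int buffers_start_as_expected(void)
{
    return zero_buffer[0] == 0 && zero_buffer[1023] == 0
        && nonzero_buffer[0] == 1 && nonzero_buffer[1] == 0;
}
```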
-fthread-jumps
Perform optimizations that check to see if a jump branches to
a location where another comparison subsumed by the first is
found. If so, the first branch is redirected to either the
destination of the second branch or a point immediately
following it, depending on whether the condition is known to
be true or false.
Enabled at levels -O2, -O3, -Os.
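A source-level illustration of the situation described above: when the first test succeeds, the outcome of the second is already known, so the branch taken after the first test can be redirected past the second comparison. The function name classify is made up for this sketch, and the real pass works on the compiler's internal representation rather than on C source.

```c
int classify(int x)
{
    int r = 0;
    if (x > 100)
        r += 1;
    /* On the path where x > 100 held, x > 50 is known to be true,
       so that path can jump straight to the addition below. */
    if (x > 50)
        r += 2;
    return r;
}
```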
-fsplit-wide-types
When using a type that occupies multiple registers, such as
"long long" on a 32-bit system, split the registers apart and
allocate them independently. This normally generates better
code for those types, but may make debugging more difficult.
Enabled at levels -O, -O2, -O3, -Os.
-fcse-follow-jumps
In common subexpression elimination (CSE), scan through jump
instructions when the target of the jump is not reached by
any other path. For example, when CSE encounters an "if"
statement with an "else" clause, CSE follows the jump when
the condition tested is false.
Enabled at levels -O2, -O3, -Os.
-fcse-skip-blocks
This is similar to -fcse-follow-jumps, but causes CSE to
follow jumps that conditionally skip over blocks. When CSE
encounters a simple "if" statement with no else clause,
-fcse-skip-blocks causes CSE to follow the jump around the
body of the "if".
Enabled at levels -O2, -O3, -Os.
-frerun-cse-after-loop
Re-run common subexpression elimination after loop
optimizations are performed.
Enabled at levels -O2, -O3, -Os.
-fgcse
Perform a global common subexpression elimination pass. This
pass also performs global constant and copy propagation.
Note: When compiling a program using computed gotos, a GCC
extension, you may get better run-time performance if you
disable the global common subexpression elimination pass by
adding -fno-gcse to the command line.
Enabled at levels -O2, -O3, -Os.
-fgcse-lm
When -fgcse-lm is enabled, global common subexpression
elimination attempts to move loads that are only killed by
stores into themselves. This allows a loop containing a
load/store sequence to be changed to a load outside the loop,
and a copy/store within the loop.
Enabled by default when -fgcse is enabled.
-fgcse-sm
When -fgcse-sm is enabled, a store motion pass is run after
global common subexpression elimination. This pass attempts
to move stores out of loops. When used in conjunction with
-fgcse-lm, loops containing a load/store sequence can be
changed to a load before the loop and a store after the loop.
Not enabled at any optimization level.
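The combined effect of load motion and store motion can be pictured at the source level. The first function below loads and stores *total on every iteration; the second is a hand-written equivalent with one load before the loop and one store after it. This is only a sketch: the function names are invented, the restrict qualifiers are an assumption added here so that the two forms are clearly equivalent, and the actual passes operate on GCC's internal representation.

```c
/* As written: *total is read and written in every iteration. */
void accumulate(int *restrict total, const int *restrict values, int n)
{
    for (int i = 0; i < n; i++)
        *total += values[i];
}

/* Hand-written form of the transformed loop: a single load before
   the loop and a single store after it. */
void accumulate_hoisted(int *restrict total, const int *restrict values, int n)
{
    int t = *total;
    for (int i = 0; i < n; i++)
        t += values[i];
    *total = t;
}
```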
-fgcse-las
When -fgcse-las is enabled, the global common subexpression
elimination pass eliminates redundant loads that come after
stores to the same memory location (both partial and full
redundancies).
Not enabled at any optimization level.
-fgcse-after-reload
When -fgcse-after-reload is enabled, a redundant load
elimination pass is performed after reload. The purpose of
this pass is to clean up redundant spilling.
Enabled by -fprofile-use and -fauto-profile.
-faggressive-loop-optimizations
This option tells the loop optimizer to use language
constraints to derive bounds for the number of iterations of
a loop. This assumes that loop code does not invoke
undefined behavior by for example causing signed integer
overflows or out-of-bound array accesses. The bounds for the
number of iterations of a loop are used to guide loop
unrolling and peeling and loop exit test optimizations. This
option is enabled by default.
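As a sketch of what deriving bounds from language constraints means: in the loop below, reading data[i] with i at or beyond the array size would be undefined behavior, so the optimizer may assume the loop body runs at most N times whatever value limit has. The names N, data and sum_prefix are assumptions for this example, and callers are expected to pass a limit no larger than N.

```c
#define N 8

static const int data[N] = { 1, 2, 3, 4, 5, 6, 7, 8 };

/* Callers must ensure limit <= N. Because an out-of-bounds access
   would be undefined, the compiler may conclude that the loop
   iterates at most N times and unroll or peel accordingly. */
int sum_prefix(int limit)
{
    int sum = 0;
    for (int i = 0; i < limit; i++)
        sum += data[i];
    return sum;
}
```

Code that does run past the end of such an array can behave unexpectedly under this option, since the derived bound no longer matches what the program does.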
-funconstrained-commons
This option tells the compiler that variables declared in
common blocks (e.g. Fortran) may later be overridden with
longer trailing arrays. This prevents certain optimizations
that depend on knowing the array bounds.
-fcrossjumping
Perform cross-jumping transformation. This transformation
unifies equivalent code and saves code size. The resulting
code may or may not perform better than without cross-
jumping.
Enabled at levels -O2, -O3, -Os.
-fauto-inc-dec
Combine increments or decrements of addresses with memory
accesses. This pass is always skipped on architectures that
do not have instructions to support this. Enabled by default
at -O and higher on architectures that support this.
-fdce
Perform dead code elimination (DCE) on RTL. Enabled by
default at -O and higher.
-fdse
Perform dead store elimination (DSE) on RTL. Enabled by
default at -O and higher.
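The two notions can be illustrated at the source level, keeping in mind that -fdce and -fdse themselves act on RTL, GCC's low-level representation, and that similar cleanups also happen in earlier passes. The function names below are hypothetical.

```c
/* Dead code: the multiplication contributes nothing to the result
   and can be removed. */
int compute(int x)
{
    int scratch = x * 17;
    (void)scratch;
    return x + 1;
}

/* Dead store: the first write to *p is overwritten before anything
   reads it, so it can be removed. */
void store_twice(int *p, int x)
{
    *p = 0;
    *p = x + 1;
}
```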
-fif-conversion
Attempt to transform conditional jumps into branch-less
equivalents. This includes use of conditional moves, min,
max, set flags and abs instructions, and some tricks doable
by standard arithmetic. The use of conditional execution on
chips where it is available is controlled by
-fif-conversion2.
Enabled at levels -O, -O2, -O3, -Os, but not with -Og.
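The functions below are typical candidates: each contains a short conditional that, on targets with suitable instructions, may be compiled to a conditional move, a max instruction or an abs instruction rather than a branch. The names are invented for this sketch, and the generated code depends on the target.

```c
/* May become a max instruction or a compare plus conditional move. */
int max_int(int a, int b)
{
    if (a > b)
        return a;
    return b;
}

/* May become an abs instruction or a branch-free negate sequence. */
int abs_diff(int a, int b)
{
    int d = a - b;
    if (d < 0)
        d = -d;
    return d;
}
```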
-fif-conversion2
Use conditional execution (where available) to transform
conditional jumps into branch-less equivalents.
Enabled at levels -O, -O2, -O3, -Os, but not with -Og.
-fdeclone-ctor-dtor
The C++ ABI requires multiple entry points for constructors
and destructors: one for a base subobject, one for a complete
object, and one for a virtual destructor that calls operator
delete afterwards. For a hierarchy with virtual bases, the
base and complete variants are clones, which means two copies
of the function. With this option, the base and complete
variants are changed to be thunks that call a common
implementation.
Enabled by -Os.
-fdelete-null-pointer-checks
Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
This option enables simple constant folding optimizations at
all optimization levels. In addition, other optimization
passes in GCC use this flag to control global dataflow
analyses that eliminate useless checks for null pointers;
these assume that a memory access to address zero always
results in a trap, so that if a pointer is checked after it
has already been dereferenced, it cannot be null.
Note however that in some environments this assumption is not
true. Use -fno-delete-null-pointer-checks to disable this
optimization for programs that depend on that behavior.
This option is enabled by default on most targets. On Nios
II ELF, it defaults to off. On AVR, CR16, and MSP430, this
option is completely disabled.
Passes that use the dataflow information are enabled
independently at different optimization levels.
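A minimal sketch of the pattern in question follows. In first_checked_late the pointer is dereferenced before it is tested, so under this option the compiler may treat the test as always false and drop it. first_checked_early tests before dereferencing and keeps its check. Both function names are made up for this example.

```c
/* Dereference first, check afterwards: the check may be deleted
   because p is assumed non-null once it has been dereferenced. */
int first_checked_late(const int *p)
{
    int value = *p;
    if (p == 0)
        return -1;
    return value;
}

/* Check first, dereference afterwards: the check is meaningful
   and is preserved. */
int first_checked_early(const int *p)
{
    if (p == 0)
        return -1;
    return *p;
}
```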
-fdevirtualize
Attempt to convert calls to virtual functions to direct
calls. This is done both within a procedure and
interprocedurally as part of indirect inlining
(-findirect-inlining) and interprocedural constant
propagation (-fipa-cp). Enabled at levels -O2, -O3, -Os.
-fdevirtualize-speculatively
Attempt to convert calls to virtual functions to speculative
direct calls. Based on the analysis of the type inheritance
graph, determine for a given call the set of likely targets.
If the set is small, preferably of size 1, change the call
into a conditional deciding between direct and indirect
calls. The speculative calls enable more optimizations, such
as inlining. When they seem useless after further
optimization, they are converted back into original form.
-fdevirtualize-at-ltrans
Stream extra information needed for aggressive
devirtualization when running the link-time optimizer in
local transformation mode. This option enables more
devirtualization but significantly increases the size of
streamed data. For this reason it is disabled by default.
-fexpensive-optimizations
Perform a number of minor optimizations that are relatively
expensive.
Enabled at levels -O2
, -O3
, -Os
.
-free
Attempt to remove redundant extension instructions. This is
especially helpful for the x86-64 architecture, which
implicitly zero-extends in 64-bit registers after writing to
their lower 32-bit half.
Enabled for Alpha, AArch64 and x86 at levels -O2, -O3, -Os.
-fno-lifetime-dse
In C++ the value of an object is only affected by changes
within its lifetime: when the constructor begins, the object
has an indeterminate value, and any changes during the
lifetime of the object are dead when the object is destroyed.
Normally dead store elimination will take advantage of this;
if your code relies on the value of the object storage
persisting beyond the lifetime of the object, you can use
this flag to disable this optimization. To preserve stores
before the constructor starts (e.g. because your operator new
clears the object storage) but still treat the object as dead
after the destructor, you can use -flifetime-dse=1. The
default behavior can be explicitly selected with
-flifetime-dse=2. -flifetime-dse=0 is equivalent to
-fno-lifetime-dse.
-flive-range-shrinkage
Attempt to decrease register pressure through register live
range shrinkage. This is helpful for fast processors with
small or moderate size register sets.
-fira-algorithm=algorithm
Use the specified coloring algorithm for the integrated
register allocator. The algorithm argument can be priority,
which specifies Chow's priority coloring, or CB, which
specifies Chaitin-Briggs coloring. Chaitin-Briggs coloring
is not implemented for all architectures, but for those
targets that do support it, it is the default because it
generates better code.
-fira-region=region
Use specified regions for the integrated register allocator.
The region argument should be one of the following:
all
Use all loops as register allocation regions. This can
give the best results for machines with a small and/or
irregular register set.
mixed
Use all loops except for loops with small register
pressure as the regions. This value usually gives the
best results in most cases and for most architectures,
and is enabled by default when compiling with
optimization for speed (-O, -O2, ...).
one
Use all functions as a single region. This typically
results in the smallest code size, and is enabled by
default for -Os or -O0.
-fira-hoist-pressure
Use IRA to evaluate register pressure in the code hoisting
pass for decisions to hoist expressions. This option usually
results in smaller code, but it can slow the compiler down.
This option is enabled at level -Os for all targets.
-fira-loop-pressure
Use IRA to evaluate register pressure in loops for decisions
to move loop invariants. This option usually results in
generation of faster and smaller code on machines with large
register files (>= 32 registers), but it can slow the
compiler down.
This option is enabled at level -O3 for some targets.
-fno-ira-share-save-slots
Disable sharing of stack slots used for saving call-used hard
registers living through a call. Each hard register gets a
separate stack slot, and as a result function stack frames
are larger.
-fno-ira-share-spill-slots
Disable sharing of stack slots allocated for pseudo-
registers. Each pseudo-register that does not get a hard
register gets a separate stack slot, and as a result function
stack frames are larger.
-flra-remat
Enable CFG-sensitive rematerialization in LRA. Instead of
loading values of spilled pseudos, LRA tries to rematerialize
(recalculate) values if it is profitable.
Enabled at levels -O2, -O3, -Os.
-fdelayed-branch
If supported for the target machine, attempt to reorder
instructions to exploit instruction slots available after
delayed branch instructions.
Enabled at levels -O, -O2, -O3, -Os, but not at -Og.
-fschedule-insns
If supported for the target machine, attempt to reorder
instructions to eliminate execution stalls due to required
data being unavailable. This helps machines that have slow
floating point or memory load instructions by allowing other
instructions to be issued until the result of the load or
floating-point instruction is required.
Enabled at levels -O2, -O3.
-fschedule-insns2
Similar to -fschedule-insns, but requests an additional pass
of instruction scheduling after register allocation has been
done. This is especially useful on machines with a
relatively small number of registers and where memory load
instructions take more than one cycle.
Enabled at levels -O2, -O3, -Os.
-fno-sched-interblock
Disable instruction scheduling across basic blocks, which is
normally enabled when scheduling before register allocation,
i.e. with -fschedule-insns or at -O2 or higher.
-fno-sched-spec
Disable speculative motion of non-load instructions, which is
normally enabled when scheduling before register allocation,
i.e. with -fschedule-insns or at -O2 or higher.
-fsched-pressure
Enable register-pressure-sensitive insn scheduling before
register allocation. This only makes sense when scheduling
before register allocation is enabled, i.e. with
-fschedule-insns or at -O2 or higher. Using this option
can improve the generated code and decrease its size by
preventing register pressure from rising above the number of
available hard registers, which would otherwise cause spills
in register allocation.
-fsched-spec-load
Allow speculative motion of some load instructions. This
only makes sense when scheduling before register allocation,
i.e. with -fschedule-insns or at -O2 or higher.
-fsched-spec-load-dangerous
Allow speculative motion of more load instructions. This
only makes sense when scheduling before register allocation,
i.e. with -fschedule-insns or at -O2 or higher.
-fsched-stalled-insns
-fsched-stalled-insns=n
Define how many insns (if any) can be moved prematurely from
the queue of stalled insns into the ready list during the
second scheduling pass. -fno-sched-stalled-insns means that
no insns are moved prematurely, -fsched-stalled-insns=0 means
there is no limit on how many queued insns can be moved
prematurely. -fsched-stalled-insns without a value is
equivalent to -fsched-stalled-insns=1.
-fsched-stalled-insns-dep
-fsched-stalled-insns-dep=n
Define how many insn groups (cycles) are examined for a
dependency on a stalled insn that is a candidate for
premature removal from the queue of stalled insns. This has
an effect only during the second scheduling pass, and only if
-fsched-stalled-insns is used. -fno-sched-stalled-insns-dep
is equivalent to -fsched-stalled-insns-dep=0.
-fsched-stalled-insns-dep without a value is equivalent to
-fsched-stalled-insns-dep=1.
-fsched2-use-superblocks
When scheduling after register allocation, use superblock
scheduling. This allows motion across basic block
boundaries, resulting in faster schedules. This option is
experimental, as not all machine descriptions used by GCC
model the CPU closely enough to avoid unreliable results from
the algorithm.
This only makes sense when scheduling after register
allocation, i.e. with -fschedule-insns2 or at -O2 or higher.
-fsched-group-heuristic
Enable the group heuristic in the scheduler. This heuristic
favors the instruction that belongs to a schedule group.
This is enabled by default when scheduling is enabled, i.e.
with -fschedule-insns or -fschedule-insns2 or at -O2 or
higher.
-fsched-critical-path-heuristic
Enable the critical-path heuristic in the scheduler. This
heuristic favors instructions on the critical path. This is
enabled by default when scheduling is enabled, i.e. with
-fschedule-insns or -fschedule-insns2 or at -O2 or higher.
-fsched-spec-insn-heuristic
Enable the speculative instruction heuristic in the
scheduler. This heuristic favors speculative instructions
with greater dependency weakness. This is enabled by default
when scheduling is enabled, i.e. with -fschedule-insns or
-fschedule-insns2 or at -O2 or higher.
-fsched-rank-heuristic
Enable the rank heuristic in the scheduler. This heuristic
favors the instruction belonging to a basic block with
greater size or frequency. This is enabled by default when
scheduling is enabled, i.e. with -fschedule-insns or
-fschedule-insns2 or at -O2 or higher.
-fsched-last-insn-heuristic
Enable the last-instruction heuristic in the scheduler. This
heuristic favors the instruction that is less dependent on
the last instruction scheduled. This is enabled by default
when scheduling is enabled, i.e. with -fschedule-insns or
-fschedule-insns2 or at -O2 or higher.
-fsched-dep-count-heuristic
Enable the dependent-count heuristic in the scheduler. This
heuristic favors the instruction that has more instructions
depending on it. This is enabled by default when scheduling
is enabled, i.e. with -fschedule-insns or -fschedule-insns2
or at -O2 or higher.
-freschedule-modulo-scheduled-loops
Modulo scheduling is performed before traditional scheduling.
If a loop is modulo scheduled, later scheduling passes may
change its schedule. Use this option to control that
behavior.
-fselective-scheduling
Schedule instructions using selective scheduling algorithm.
Selective scheduling runs instead of the first scheduler
pass.
-fselective-scheduling2
Schedule instructions using selective scheduling algorithm.
Selective scheduling runs instead of the second scheduler
pass.
-fsel-sched-pipelining
Enable software pipelining of innermost loops during
selective scheduling. This option has no effect unless one
of -fselective-scheduling or -fselective-scheduling2 is
turned on.
-fsel-sched-pipelining-outer-loops
When pipelining loops during selective scheduling, also
pipeline outer loops. This option has no effect unless
-fsel-sched-pipelining is turned on.
-fsemantic-interposition
Some object formats, like ELF, allow interposing of symbols
by the dynamic linker. This means that for symbols exported
from the DSO, the compiler cannot perform interprocedural
propagation, inlining and other optimizations in anticipation
that the function or variable in question may change. While
this feature is useful, for example, to replace memory
allocation functions with a debugging implementation, it is
expensive in terms of code quality. With
-fno-semantic-interposition the compiler assumes that if
interposition happens for functions, the overwriting function
will have precisely the same semantics (and side effects).
Similarly if interposition happens for variables, the
constructor of the variable will be the same. The flag has no
effect for functions explicitly declared inline (where it is
never allowed for interposition to change semantics) and for
symbols explicitly declared weak.
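As a minimal sketch of the situation described above, assume the two
hypothetical functions below are compiled into a shared library (for
example with -O2 -fPIC -shared). Because scale has default visibility,
another definition could be interposed at run time, so a compiler that
honors interposition has to keep the calls in scale_twice as real calls
to the exported symbol. With -fno-semantic-interposition it is allowed
to assume any replacement behaves identically and may inline the body.
The function names are illustrative only.

```c
/* Hypothetical library code; assume it is built as a DSO. */

/* Exported with default visibility, so the dynamic linker may
   substitute another definition of scale() at run time. */
int scale(int x)
{
    return x * 2;
}

/* When interposition is honored, these calls must go through the
   interposable symbol.  With -fno-semantic-interposition the
   compiler may inline scale() here, which can reduce the body to
   the equivalent of x * 4. */
int scale_twice(int x)
{
    return scale(scale(x));
}
```

Whether the call is actually inlined depends on the target and the
optimization level; inspecting the generated assembly with -S is one
way to compare the two settings.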
-fshrink-wrap
Emit function prologues only before parts of the function
that need it, rather than at the top of the function. This
flag is enabled by default at -O and higher.
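The kind of function that can benefit from shrink-wrapping might look
like the sketch below: an early-exit path that needs no stack frame,
followed by a slower path that does. The function count_spaces is a
hypothetical example, and whether the prologue is actually moved
depends on the target and on what the register allocator decides.

```c
#include <stddef.h>
#include <string.h>

size_t count_spaces(const char *s)
{
    /* Fast path: returns immediately, so it does not need any
       registers to be saved. */
    if (s == NULL)
        return 0;

    /* Slow path: values stay live across the call to strlen(),
       which is the part of the function that may need a prologue. */
    size_t len = strlen(s);
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (s[i] == ' ')
            n++;
    return n;
}
```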
-fshrink-wrap-separate
Shrink-wrap separate parts of the prologue and epilogue
separately, so that those parts are only executed when
needed. This option is on by default, but has no effect
unless -fshrink-wrap is also turned on and the target
supports this.
-fcaller-saves
Enable allocation of values to registers that are clobbered
by function calls, by emitting extra instructions to save and
restore the registers around such calls. Such allocation is
done only when it seems to result in better code.
This option is always enabled by default on certain machines,
usually those which have no call-preserved registers to use
instead.
Enabled at levels -O2, -O3, -Os.
-fcombine-stack-adjustments
Tracks stack adjustments (pushes and pops) and stack memory
references and then tries to find ways to combine them.
Enabled by default at -O1 and higher.
-fipa-ra
Use caller-saved registers for allocation if those registers
are not used by any called function. In that case it is not
necessary to save and restore them around calls. This is
only possible if the called functions are part of the same
compilation unit as the current function and they are
compiled before it.
Enabled at levels -O2, -O3, -Os; however, the option is
disabled if generated code will be instrumented for profiling
(-p or -pg) or if the callee's register usage cannot be known
exactly (this happens on targets that do not expose prologues
and epilogues in RTL).
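The condition above (callee in the same compilation unit and compiled
before its caller) can be illustrated with a small sketch. The
functions add3 and sum_with_bias are hypothetical, and the GCC noinline
attribute is used only so that a real call remains at -O2; without it
the callee would most likely be inlined and the question of saving
registers around the call would not arise.

```c
/* Defined before its caller and in the same translation unit, so
   its actual register usage can be known when the caller is
   compiled. */
__attribute__((noinline))
static int add3(int a, int b, int c)
{
    return a + b + c;
}

int sum_with_bias(int x, int y)
{
    int bias = x * y;        /* value that stays live across the call */
    int s = add3(x, y, 1);   /* if add3 leaves a call-clobbered register
                                untouched, bias may be kept there without
                                a save and restore around the call */
    return s + bias;
}
```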
-fconserve-stack
Attempt to minimize stack usage. The compiler attempts to
use less stack space, even if that makes the program slower.
This option implies setting the large-stack-frame parameter
to 100 and the large-stack-frame-growth parameter to 400.
-ftree-reassoc
Perform reassociation on trees. This flag is enabled by
default at -O and higher.
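As a rough illustration of what reassociation means, the hypothetical
function below groups its operands so that the two constants are not
adjacent. Regrouping the additions lets the constants be folded
together. Unsigned arithmetic is used here because it wraps, so any
regrouping gives the same result.

```c
unsigned sum_with_constants(unsigned a, unsigned b)
{
    /* Written as (a + 1) + (b + 2).  Reassociation may regroup the
       operands as (a + b) + (1 + 2), so the constants fold to 3. */
    return (a + 1u) + (b + 2u);
}
```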
-fcode-hoisting
Perform code hoisting. Code hoisting tries to move the
evaluation of expressions executed on all paths to the
function exit as early as possible. This is especially
useful as a code size optimization, but it often helps for
code speed as well. This flag is enabled by default at -O2
and higher.
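A minimal sketch of a candidate for code hoisting follows. The
hypothetical function select_scaled evaluates a * b on both branches,
so the multiplication can be moved in front of the branch and emitted
only once.

```c
int select_scaled(int flag, int a, int b)
{
    int r;
    if (flag)
        r = a * b + 1;   /* a * b is evaluated on this path ... */
    else
        r = a * b - 1;   /* ... and on this one, so it can be
                            computed once before the branch */
    return r;
}
```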
-ftree-pre
Perform partial redundancy elimination (PRE) on trees. This
flag is enabled by default at -O2 and -O3.
-ftree-partial-pre
Make partial redundancy elimination (PRE) more aggressive.
This flag is enabled by default at -O3.
-ftree-forwprop
Perform forward propagation on trees. This flag is enabled
by default at -O and higher.
-ftree-fre
Perform full redundancy elimination (FRE) on trees. The
difference between FRE and PRE is that FRE only considers
expressions that are computed on all paths leading to the
redundant computation. This analysis is faster than PRE,
though it exposes fewer redundancies. This flag is enabled
by default at -O and higher.
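The difference between a fully and a partially redundant expression
can be sketched with two hypothetical functions. In the first, a + b
has already been computed on every path that reaches the second use,
which is the case FRE handles. In the second, it has been computed on
only one of the incoming paths, which is the case that needs PRE.

```c
/* Fully redundant: a + b is available on every path leading to
   the second computation, so the first result can be reused. */
int fully_redundant(int a, int b)
{
    int x = a + b;
    int y = a + b;
    return x * y;
}

/* Partially redundant: a + b has been computed only when flag is
   nonzero.  PRE can insert the computation on the other path so
   that the later one becomes fully redundant. */
int partially_redundant(int flag, int a, int b)
{
    int x = 0;
    if (flag)
        x = a + b;
    int y = a + b;
    return x + y;
}
```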
-ftree-phiprop
Perform hoisting of loads from conditional pointers on trees.
This pass is enabled by default at -O and higher.
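A load through a conditionally chosen pointer might look like the
sketch below. The function pick is hypothetical. The idea is that the
load through p can be replaced by loads of the individual objects on
each branch, leaving the equivalent of flag ? a : b with no pointer
involved.

```c
int pick(int flag, int a, int b)
{
    /* p is set to one of two addresses depending on flag. */
    const int *p = flag ? &a : &b;

    /* A single load through the conditional pointer; hoisting the
       load into each branch allows the value itself to be selected
       instead of the address. */
    return *p;
}
```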
-fhoist-adjacent-loads
Speculatively hoist loads from both branches of an
if-then-else if the loads are from adjacent locations in the
same structure and the target architecture has a conditional
move instruction. This flag is enabled by default at -O2 and
higher.
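The pattern this option looks for can be sketched as follows, assuming
a hypothetical structure with two adjacent int fields. Each branch
loads one of the fields. If both loads are hoisted above the branch,
the selection can be done with a conditional move rather than a jump.

```c
/* lo and hi are adjacent members of the same structure. */
struct range {
    int lo;
    int hi;
};

int bound(const struct range *r, int want_hi)
{
    if (want_hi)
        return r->hi;   /* load from one field ... */
    else
        return r->lo;   /* ... or from its neighbor */
}
```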
-ftree-copy-prop
Perform copy propagation on trees. This pass eliminates
unnecessary copy operations. This flag is enabled by default
at -O and higher.
-fipa-pure-const
Discover which functions are pure or constant. Enabled by
default at -O and higher.
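As an illustration, the hypothetical helper square below has no side
effects and its result depends only on its argument, which is what the
GCC const function attribute describes. Once that property is
discovered, repeated calls with the same argument can be treated as
redundant even without an explicit attribute.

```c
/* No side effects and no memory reads: a candidate for being
   discovered as a "const" function. */
static int square(int x)
{
    return x * x;
}

int sum_of_squares(const int *v, int n, int k)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += v[i] + square(k);  /* square(k) does not change between
                                       iterations, so it may be
                                       evaluated only once */
    return total;
}
```

In practice a function this small is likely to be inlined anyway; the
discovery matters more for larger functions that remain out of line.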
-fipa-reference
Discover which static variables do not escape the compilation
unit. Enabled by default at -O and higher.
-fipa-reference-addressable
Discover read-only, write-only and non-addressable static
variables. Enabled by default at -O and higher.
-fipa-stack-alignment
Reduce stack alignment on call sites if possible. Enabled by
default.
-fipa-pta
Perform interprocedural pointer analysis and interprocedural
modification and reference analysis. This option can cause
excessive memory and compile-time usage on large compilation
units. It is not enabled by default at any optimization
level.
-fipa-profile
Perform interprocedural profile propagation. The functions
called only from cold functions are marked as cold. Also
functions executed once (such as "cold", "noreturn", static
constructors or destructors) are identified. Cold functions
and loopless parts of functions executed once are then
optimized for size. Enabled by default at -O and higher.
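The propagation described above might apply to code shaped like the
following sketch. All function names are hypothetical. report_fatal is
explicitly marked with the GCC cold and noreturn attributes, and
dump_state is called only from it, so dump_state is a candidate for
being marked cold as well.

```c
#include <stdio.h>
#include <stdlib.h>

/* Called only from a cold function, so it may be treated as cold
   too and optimized for size. */
static void dump_state(int code)
{
    fprintf(stderr, "fatal error %d\n", code);
}

/* Explicitly marked as unlikely to run and as never returning. */
__attribute__((cold, noreturn))
static void report_fatal(int code)
{
    dump_state(code);
    exit(EXIT_FAILURE);
}

int checked_div(int a, int b)
{
    if (b == 0)
        report_fatal(1);   /* error path, expected to be rare */
    return a / b;
}
```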