In the description that follows, the term “standard optimizations” is used to collectively refer to all optimization measures that can only be activated or deactivated globally, i.e. which cannot be controlled individually.
In contrast to the standard optimizations, some optimization steps such as loop expansion, and inlining can be individually controlled. Selective control is provided in these cases because, among other things, the benefits of such optimization (e.g. quicker execution times) may need to be weighed against other disadvantages (e.g. an increase in the size of the generated module or longer compilation times).
In some cases, setting the LOOP-UNROLLING and INLINING options may actually degrade performance. It is therefore advisable to determine the best settings for eachapplication by running some tests.
Standard optimizations
The C++ compiler performs the following optimizations:
evaluation of constant expressions at compile time
optimization of subscript computation in loops
elimination of superfluous assignments
propagation of constant expressions
elimination of redundant expressions
optimization of branches to unconditional branch instructions
The term “base block” is fundamental to the concept of optimization and refers to a maximum, unbranched command sequence. Such a command sequence has precisely one entry point and one exit.
Base blocks are the units via which most optimization steps with the C++ compiler are implemented.
Evaluation of constant expressions at compile time
By evaluating expressions for which operand values are known during compilation, the execution of commands is relocated from the program run to the compiler run. Consequently, the program executes faster.
Evaluations at compile time involve integer arithmetics and relational operations.Example
Before optimization
I = 1 + 2;
I = 2 * 4;
1 <= 5;
After optimization
I = 3;
I = 8;
<TRUE>
Elimination of superfluous assignments
If an assignment to a variable v is followed by a change in the value of v without the original value ever being used, this assignment is eliminated. Assignments to variables whose values are of no further consequence to the program run are also eliminated.
Example
Before optimization
i = 5;
i = 3;
After optimization
i = 3;
Propagation of constant expressions
If a variable whose value is already known at the time of compilation is used in an expression, this variable will be replaced by the appropriate value.
Example
Before optimization
a = 3;
i = a;
After optimization
a = 3;
i = 3;
This optimization also has a bearing on other optimization techniques. After the successful propagation of a variable, the original assignment may be deleted, or a new constant expression that has already been evaluated at compile time may appear.
Elimination of redundant expressions
If the value of an expression occurring within a base block is already known at the time of compilation as a result of an earlier calculation, then this expression is redundant. To avoid a repeat calculation, the expression is assigned to a new variable and replaced by this new variable wherever it occurs.
Example
Before optimization
a = b * c + 20;
e = b * c - 10;
After optimization
h = b * c;
a = h + 20;
e = h - 10;
Optimization of subscript computation in loops
If an array element is subscripted in a loop via an iteration variable, the multiplication required in order to compute the address of the array element is reduced to additions.As a rule, the address of the array element is calculated as
base address + index * length of an array element
Before entering a loop, the optimizer supplies an address variable with the address of the array element that was referenced during the first iteration. With every following iteration, this address variable is then incremented with a fixed length of one array element.
This optimization technique is especially rewarding in the case of multidimensional arrays, since one multiplication step can be eliminated per dimension under optimum conditions.
Optimization of branches to unconditional branch instructions
In branch instructions that have an unconditional branch as their destination, the branch address of the unconditional branch is substituted for the original branch address. This also helps eliminate superfluous code, i.e. code which specifies addresses that cannot be accessed.
Example
Before optimization
goto lab1; ... lab1: goto lab2; ... lab2:
After optimization
goto lab2; ... lab1: goto lab2; ... lab2:
Inline substitution of user-defined functions
The inline substitution of a function eliminates the need to call the function at runtime, since the function code is integrated into the source code at the point of call. This reduces the administrative overhead and code required for function calls and returns (e.g. saving and restoring registers, allocating stacks, writing parameters to the transfer area, etc.) and can thus produce substantial savings in execution time. In addition, the inlining of functions enhances the effect of the standard optimizations due to the larger context.
Inlined static
functions are deleted.
Inline substitution does, however, increase the size of the generated modules. It may produce extremely large functions with an increasing number of variables as potential recipients of registers that cannot all be supplied due to the limited number of registers available. This fact must be weighed against the advantages of this method of optimization.
Inline substitution is particularly suitable for small functions where the overhead for the function call and return constitutes a substantial part of the actual function code.
Functions with the following attributes can never be expanded inline:
functions with a variable number of parameters (cf. va_... macros in <stdarg.h>)
functions containing setjmp calls
recursive functions
Expansion of loops
Loop expansion reduces the number of iterations in a loop by repeating, i.e. “expanding” the body of the loop (statement block) one or more times. Since each iteration of a loop requires loop control statements to test the value of the current iteration and to branch accordingly, a reduction in the number of iterations also improves execution time.
For example, if the body of a loop is doubled (expansion factor 2), the overhead for loop control statements is reduced by half. In general, the following rule applies: an expansion factor of n reduces the overhead for loop control statements to 1/n.
The size of the generated module is, however, increased by the repetition of code. By default, the optimizer uses an expansion factor of 4.
Expanding the body of a loop also creates the potential for new optimization. For example, the extended base blocks can be made more efficient by the propagation of constant expressions or the elimination of redundant expressions.
Example of a loop expansion with expansion factor 4
Before expansion:
i = 0; while(i < 80) { a[i] = b[i+1]; i++; }
After expansion:
i = 0; while(i < 80) { a[i] = b[i+1]; i++; if(!(i < 80)) goto end; a[i] = b[i+1]; i++; if(!(i < 80)) goto end; a[i] = b[i+1]; i++; if(!(i < 80)) goto end; a[i] = b[i+1]; i++; } end:
Optimizations in the body of the loop are also possible here.