Optimizing your code

Delphi performs several different types of code optimizations, ranging from constant folding and short-circuit Boolean expression evaluation, all the way up to smart linking. The following sections describe some of the types of optimizations performed and how you can benefit from them in your programs.

Constant folding

If the operand(s) of an operator are constants, Delphi evaluates the expression at compile time. For example,

X := 3 + 4 * 2

generates the same code as X := 11 and

S := 'In' + 'Out'

generates the same code as S := 'InOut'.

Likewise, if an operand of an Abs, Chr, Hi, Length, Lo, Odd, Ord, Pred, Ptr, Round, Succ, Swap, or Trunc function call is a constant, the function is evaluated at compile time.

If an array index expression is a constant, the address of the component is evaluated at compile time. For example, accessing Data[5, 5] is just as efficient as accessing a simple variable.
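As a small illustration of constant folding, the sketch below relies on the fact that a const declaration accepts only expressions the compiler can evaluate at compile time. The program name and all identifiers are made up for this example, and it assumes console output through WriteLn (on 16-bit Delphi, add WinCrt to a uses clause).

```pascal
program FoldDemo;

const
  Total     = 3 + 4 * 2;      { folded to 11 }
  Msg       = 'In' + 'Out';   { folded to 'InOut' }
  MsgLen    = Length(Msg);    { folded to 5 }
  NextCh    = Succ('A');      { folded to 'B' }
  Magnitude = Abs(-7);        { folded to 7 }

var
  Data: array[1..10, 1..10] of Integer;

begin
  { Both indexes are constants, so the component address
    is known at compile time. }
  Data[5, 5] := Total;
  WriteLn(Total, ' ', Msg, ' ', MsgLen, ' ', NextCh, ' ', Magnitude);
  WriteLn(Data[5, 5]);
end.
```

If any of these expressions could not be evaluated at compile time, the const section would not compile, which makes it a convenient way to check what the compiler folds.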

Constant merging

Using the same string constant two or more times in a statement part generates only one copy of the constant. For example, two or more Write('Done') statements in the same statement part reference the same copy of the string constant 'Done'.

Short-circuit evaluation

Delphi implements short-circuit Boolean evaluation, which means that evaluation of a Boolean expression stops as soon as the result of the entire expression becomes evident. This guarantees minimum execution time and usually minimum code size. Short-circuit evaluation also makes possible the evaluation of constructs that would not otherwise be legal. For example,

while (I <= Length(S)) and (S[I] <> ' ') do
  Inc(I);

while (P <> nil) and (P^.Value <> 5) do
  P := P^.Next;

In both cases, the second test isn't evaluated if the first test is False.

The opposite of short-circuit evaluation is complete evaluation, which is selected through a {$B+} compiler directive. In this state, every operand of a Boolean expression is guaranteed to be evaluated.
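The difference between the two states can be observed with an operand that has a side effect. The following is a minimal sketch, not taken from the original text: Touch, Flag, Calls, and the two test functions are hypothetical names, and console output through WriteLn is assumed (on 16-bit Delphi, add WinCrt to a uses clause).

```pascal
program BoolEvalDemo;

var
  Calls: Integer;
  Flag: Boolean;

{ Hypothetical helper: counts how often it is evaluated. }
function Touch: Boolean;
begin
  Inc(Calls);
  Touch := True;
end;

{$B-}
{ Short-circuit state: Flag is False, so Touch is skipped. }
function ShortCircuitCalls: Integer;
begin
  Calls := 0;
  if Flag and Touch then
    WriteLn('not reached');
  ShortCircuitCalls := Calls;
end;

{$B+}
{ Complete evaluation state: Touch is evaluated even though
  Flag already decides the result. }
function CompleteCalls: Integer;
begin
  Calls := 0;
  if Flag and Touch then
    WriteLn('not reached');
  CompleteCalls := Calls;
end;
{$B-}

begin
  Flag := False;   { a variable, so the test is not folded away }
  WriteLn('short-circuit: ', ShortCircuitCalls, ' call(s)');
  WriteLn('complete:      ', CompleteCalls, ' call(s)');
end.
```

The first function should report no calls to Touch and the second one call. Code that depends on such side effects always being performed is the main reason to select complete evaluation.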

Constant parameters

Whenever possible, you should use constant parameters instead of value parameters. Constant parameters are at least as efficient as value parameters and, in many cases, more efficient. In particular, constant parameters generate less code and execute faster than value parameters for structured and string types.

Constant parameters are more efficient than value parameters because the compiler doesn't have to generate copies of the actual parameters upon entry to procedures or functions. Value parameters have to be copied into local variables so that modifications made to the formal parameters won't modify the actual parameters. Because constant formal parameters can't be modified, the compiler has no need to generate copies of the actual parameters, so both code and stack space are saved. Read more about constant parameters on page 76.
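The two parameter kinds can be compared side by side. This is a sketch under assumptions: TBuffer and both function names are invented for the example, and console output through WriteLn is assumed (on 16-bit Delphi, add WinCrt to a uses clause).

```pascal
program ConstParamDemo;

type
  TBuffer = array[0..255] of Byte;

{ Value parameter: all 256 bytes are copied on entry, so the
  routine could modify B without affecting the caller. }
function SumByValue(B: TBuffer): Longint;
var
  I: Integer;
  Total: Longint;
begin
  Total := 0;
  for I := 0 to 255 do
    Total := Total + B[I];
  SumByValue := Total;
end;

{ Constant parameter: no copy is needed, and any assignment
  to B inside the routine is rejected by the compiler. }
function SumByConst(const B: TBuffer): Longint;
var
  I: Integer;
  Total: Longint;
begin
  Total := 0;
  for I := 0 to 255 do
    Total := Total + B[I];
  SumByConst := Total;
end;

var
  Data: TBuffer;
  I: Integer;

begin
  for I := 0 to 255 do
    Data[I] := I;
  WriteLn(SumByValue(Data));   { 32640 }
  WriteLn(SumByConst(Data));   { 32640 }
end.
```

Both functions return the same result; only the cost of entering the routine differs. Since neither routine modifies B, the const version is the natural choice.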

Redundant pointer-load elimination

In certain situations, Delphi’s code generator can eliminate redundant pointer-load instructions, shrinking the size of the code and allowing for faster execution. When the code generator can guarantee that a particular pointer remains constant over a stretch of linear code (code with no jumps into it), and when that pointer is already loaded into a register pair (such as ES:DI), the code generator eliminates the redundant pointer-load instructions in that block of code.

A pointer is considered constant if it's obtained from a variable parameter (variable parameters are always passed as pointers) or from the variable reference of a with statement. Because of this, using with statements is often more efficient (but never less efficient) than writing the fully qualified variable for each component reference.
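The two styles of component reference look like this in practice. The sketch is illustrative only: TVertex, Table, and the two procedures are hypothetical, and console output through WriteLn is assumed (on 16-bit Delphi, add WinCrt to a uses clause). Whether the generated code actually differs depends on the compiler version.

```pascal
program WithDemo;

type
  TVertex = record
    X, Y, Z: Integer;
  end;

var
  Table: array[1..100] of TVertex;

{ Fully qualified: every statement names Table[I] again. }
procedure ShiftQualified(I, D: Integer);
begin
  Table[I].X := Table[I].X + D;
  Table[I].Y := Table[I].Y + D;
  Table[I].Z := Table[I].Z + D;
end;

{ with statement: the reference to Table[I] is established once
  and is treated as constant inside the block. }
procedure ShiftWith(I, D: Integer);
begin
  with Table[I] do
  begin
    X := X + D;
    Y := Y + D;
    Z := Z + D;
  end;
end;

begin
  ShiftQualified(1, 5);
  ShiftWith(2, 5);
  WriteLn(Table[1].X, ' ', Table[2].X);   { both components hold 5 }
end.
```

Both procedures have the same effect on the record they touch; the with form simply gives the code generator a reference it can keep loaded.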

Constant set inlining

When the right operand of the in operator is a set constant, the compiler generates the inclusion test using inline CMP instructions. Such inlined tests are more efficient than the code that would be generated by a corresponding Boolean expression using relational operators. For example, the statement

if ((Ch >= 'A') and (Ch <= 'Z')) or
   ((Ch >= 'a') and (Ch <= 'z')) then ... ;

is less readable and also less efficient than

if Ch in ['A'..'Z', 'a'..'z'] then ... ;

Because constant folding applies to set constants as well as to constants of other types, it's possible to use const declarations without any loss of efficiency:

const
  Upper = ['A'..'Z'];
  Lower = ['a'..'z'];
  Alpha = Upper + Lower;

Given these declarations, this if statement generates the same code as the previous if statement:

if Ch in Alpha then ... ;
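Put into a complete routine, the set-constant test might be used as follows. CountLetters is a hypothetical function written for this illustration, and console output through WriteLn is assumed (on 16-bit Delphi, add WinCrt to a uses clause).

```pascal
program AlphaDemo;

const
  Upper = ['A'..'Z'];
  Lower = ['a'..'z'];
  Alpha = Upper + Lower;   { folded into one set constant }

{ Counts the letters in S using a single set membership test. }
function CountLetters(const S: string): Integer;
var
  I, N: Integer;
begin
  N := 0;
  for I := 1 to Length(S) do
    if S[I] in Alpha then
      Inc(N);
  CountLetters := N;
end;

begin
  WriteLn(CountLetters('Delphi 1.0'));   { 6 letters }
end.
```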

Small sets

The compiler generates very efficient code for operations on small sets. A small set is a set with a lower bound ordinal value in the range 0..7 and an upper bound ordinal value in the range 0..15. For example, the following TByteSet and TWordSet are both small sets.

type
  TByteSet = set of 0..7;
  TWordSet = set of 0..15;

Small set operations, such as union (+), difference (-), intersection (*), and inclusion tests (in) are generated inline using AND, OR, NOT, and TEST machine code instructions instead of calls to run-time library routines. Likewise, the Include and Exclude standard procedures generate inline code when applied to small sets.
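The operations listed above can be tried on a small set as in the following sketch. The variable names are arbitrary, and console output through WriteLn is assumed (on 16-bit Delphi, add WinCrt to a uses clause).

```pascal
program SmallSetDemo;

type
  TByteSet = set of 0..7;

var
  A, B, C: TByteSet;

begin
  A := [0, 1, 2];
  B := [2, 3];
  C := A + B;           { union:        [0, 1, 2, 3] }
  C := C - [0];         { difference:   [1, 2, 3] }
  C := C * [2, 3, 4];   { intersection: [2, 3] }
  Include(C, 7);        { [2, 3, 7] }
  Exclude(C, 2);        { [3, 7] }
  if (7 in C) and (C = [3, 7]) then
    WriteLn('C = [3, 7]');
end.
```

Because TByteSet fits in a single byte, each of these statements is a candidate for the inline code described above rather than a run-time library call.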

Order of evaluation

As permitted by the Pascal standards, operands of an expression are frequently evaluated in an order that differs from the left-to-right order in which they are written. For example, the statement

I := F(J) div G(J);

where F and G are functions of type Integer, causes G to be evaluated before F, because this enables the compiler to produce better code. For this reason, it's important that an expression never depend on any specific order of evaluation of the embedded functions. Referring to the previous example, if F must be called before G, use a temporary variable:

T := F(J);
I := T div G(J);

As an exception to this rule, when short-circuit evaluation is enabled (the {$B-} state), Boolean operands joined by the and operator or the or operator are always evaluated from left to right.
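To see which order a particular compiler chooses for the F and G example, the functions can record their calls. This is only a sketch: the bodies of F and G and the Log variable are invented for the illustration, and console output through WriteLn is assumed (on 16-bit Delphi, add WinCrt to a uses clause).

```pascal
program EvalOrderDemo;

var
  Log: string;
  I, T: Integer;

function F(J: Integer): Integer;
begin
  Log := Log + 'F';   { side effect: record the call }
  F := J * 6;
end;

function G(J: Integer): Integer;
begin
  Log := Log + 'G';   { side effect: record the call }
  G := J;
end;

begin
  { Order left to the compiler: Log may read 'FG' or 'GF'. }
  Log := '';
  I := F(2) div G(2);
  WriteLn('single expression: ', Log, ', I = ', I);

  { Order forced with a temporary: F is always called first. }
  Log := '';
  T := F(2);
  I := T div G(2);
  WriteLn('with temporary:    ', Log, ', I = ', I);
end.
```

The value of I is the same in both cases here; only the recorded call order can differ, which is exactly what code with real side effects must not rely on.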

Range checking

Assignment of a constant to a variable and use of a constant as a value parameter are range-checked at compile time; no run-time range-check code is generated. For example, X := 999, where X is of type Byte, causes a compile-time error.

Shift instead of multiply or divide

The operation X * C, where C is a constant and a power of 2, is coded using a SHL instruction. The operation X div C, where X is an unsigned integer (Byte or Word) and C is a constant and a power of 2, is coded using a SHR instruction.

Likewise, when the size of an array's components is a power of 2, a SHL instruction (not a MUL instruction) is used to scale the index expression.
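The equivalence the compiler exploits can be checked directly. In this sketch the explicit shl and shr forms are shown only for comparison, since the compiler already performs the substitution for the multiply and div forms. Console output through WriteLn is assumed (on 16-bit Delphi, add WinCrt to a uses clause).

```pascal
program ShiftDemo;

var
  X: Word;

begin
  X := 200;
  WriteLn(X * 8, ' ', X shl 3);     { both 1600: 8 = 2 to the power 3 }
  WriteLn(X div 4, ' ', X shr 2);   { both 50:   4 = 2 to the power 2 }
end.
```

Writing X * 8 and X div 4 keeps the intent readable without giving up the faster instructions.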

Automatic word alignment

By default, Delphi aligns all variables and typed constants larger than 1 byte on a machine-word boundary. On all 16-bit 80x86 CPUs, word alignment means faster execution, because word-sized items on even addresses are accessed faster than words on odd addresses.

Data alignment is controlled through the $A compiler directive. In the default {$A+} state, variables and typed constants are aligned as described above. In the {$A-} state, no alignment measures are taken.

Eliminating dead code

Statements that never execute don't generate any code. For example, these constructs don't generate code:

if False then
  statement

while False do
  statement
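A common use of this optimization is a constant that switches diagnostic output on or off. The sketch below assumes that a constant folding to False is treated like the literal False in the constructs above; Debug is a hypothetical constant, and console output through WriteLn is assumed (on 16-bit Delphi, add WinCrt to a uses clause).

```pascal
program DeadCodeDemo;

const
  Debug = False;   { hypothetical build-time switch }

var
  I, Total: Integer;

begin
  Total := 0;
  for I := 1 to 10 do
  begin
    Total := Total + I;
    { With Debug = False this statement never executes, so the
      compiler can leave it out of the generated code. }
    if Debug then
      WriteLn('I = ', I, ', Total = ', Total);
  end;
  WriteLn(Total);   { 55 }
end.
```

Changing Debug to True and recompiling brings the diagnostic output back without touching the rest of the program.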

Smart linking

Delphi’s built-in linker automatically removes unused code and data when building an .EXE file. Procedures, functions, variables, and typed constants that are part of the compilation, but are never referenced, are removed from the .EXE file. The removal of unused code takes place on a per-procedure basis; the removal of unused data takes place on a per-declaration-section basis.

Consider the following program:

program SmartLink;

const
  H: array [0..15] of Char = '0123456789ABCDEF';

var
  I, J: Integer;
  X, Y: Real;

var
  S: string [79];

var
  A: array [1..10000] of Integer;

procedure P1;
begin
  A[1] := 1;
end;

procedure P2;
begin
  I := 1;
end;

procedure P3;
begin
  S := 'Borland Pascal';
  P2;
end;

begin
  P3;
end.

The main program calls P3, which calls P2, so both P2 and P3 are included in the .EXE file. Because P2 references the first var declaration section, and P3 references the second var declaration section, I, J, X, Y, and S are also included in the .EXE file. No references are made to P1, however, and none of the included procedures reference H and A, so these objects are removed.

Smart linking is especially valuable for units that implement libraries of procedures and functions. An example of such a unit is the SysUtils standard unit: it contains a number of procedures and functions, which are seldom all used by the same program. If a program uses only one or two procedures from SysUtils, then only these procedures are included in the final .EXE file, and the remaining ones are removed, greatly reducing the size of the .EXE file.
