Compaq KAP Fortran/OpenMP
for Tru64 UNIX
User Guide

6.3.6 !*$* optimize (0--5)

The optimize directive sets the optimization level, ranging from 0 for minimum optimization to 5 for maximum optimization. You can set the optimization level globally by using the optimize=<integer> command switch. The following shows the meaning of each of the different optimization levels:

Value	Meaning
0	KAP does not perform loop optimizations.
1	KAP performs only simple analysis and loop optimizations.
2	DO loop interchanging techniques are applied. Lifetime analysis is performed to determine when last-value assignment of scalars is necessary. More powerful data dependence tests are used.
3	KAP distributes loops to optimize only a part of a loop. Special techniques are used to break data dependence cycles that otherwise prevent optimization. More loop interchanging is attempted, such as interchanging of triangular loops. Special-case data dependence tests are used. Special index sets, called wraparound variables, are recognized to uncover more opportunities for optimization.
4	Two versions of a loop are generated, if necessary, to break a data dependence arc. Loop interchanging around reductions is attempted. More exact data dependence tests are allowed.
5	Array expansion and loop fusion are enabled.

A higher optimization level results in more optimization, along with increased compilation time. Many programs that are written to be easily optimized do not need advanced transformations; with these programs, a lower optimization level will suffice.

6.3.7 !*$* roundoff (0--3)

The roundoff directive allows you to specify the amount of difference in roundoff error that is acceptable. Certain reductions are sensitive to the algorithms used to compute them. In particular, if an arithmetic reduction is accumulated in a different order than in the scalar program, the roundoff error accumulates differently, and the final result may differ from the original program's output. While the difference is usually insignificant, some restructuring transformations performed by KAP must be disabled to get precisely the same results as the scalar program.

KAP classifies its transformations by the amount of difference in roundoff error that can accumulate. You can decide what level of roundoff error differences is allowable. The roundoff directive value ranges from 0 to 3.

The meaning of each roundoff level is as follows:

Value	Meaning
0	Allow no roundoff-changing transformations.
1	Enable expression simplification and code floating. Allow loop interchanging around serial arithmetic reductions. Allow loop rerolling, if -scalaropt > 1.
2	Enable reciprocal substitution.
3	Enable recognition of REAL induction variables. Enable memory management, if -scalaropt=3. INTEGER division rotation is allowed.

The -roundoff command switch acts like a global !*$* roundoff directive.
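
The sensitivity of a reduction to its accumulation order can be seen in a minimal sketch such as the following. It is a hand-written illustration, not KAP output, and the routine name sum_two_orders is hypothetical. It sums the same REAL array once forward and once backward.

```fortran
! Sum the same array in forward order and in reverse order.
! Illustrative only: sum_two_orders is a hypothetical name.
subroutine sum_two_orders(n, x, fwd, rev)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: x(n)
  real, intent(out) :: fwd, rev
  integer :: i

  ! Accumulate from the first element to the last.
  fwd = 0.0
  do i = 1, n
     fwd = fwd + x(i)
  end do

  ! Accumulate from the last element to the first.
  rev = 0.0
  do i = n, 1, -1
     rev = rev + x(i)
  end do
end subroutine sum_two_orders
```

Assuming IEEE single-precision arithmetic without extended-precision intermediates, calling this routine with x = (/ 1.0e8, -1.0e8, 1.0 /) returns fwd = 1.0 and rev = 0.0, because the 1.0 is absorbed when it is added to -1.0e8 first. A transformation that reorders the accumulation can change the result in the same way.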

6.3.8 !*$* scalar optimize (0--3)

The !*$* scalar optimize directive sets the level of dusty-deck and other serial transformations performed. Unlike the -scalaropt command-line switch, the !*$* scalar optimize directive sets the level of loop-based optimizations (for example, loop fusion) only, and not straight-code optimizations (for example, dead-code elimination).

The meaning of each scalar optimize level is as follows:

Value	Meaning
0	No transformations are performed.
1	IF loops are changed into DO loops. Simple code floating out of loops is performed. Forward substitution of variables is performed.
2	The full set of loop-based serial transformations is enabled. These include induction variable recognition, loop rerolling, loop unrolling, loop fusion, and array expansion.
3	Memory management is enabled, if -roundoff=3.
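
To make the level 1 description concrete, the sketch below shows a dusty-deck IF loop and a DO loop that computes the same sum. Both routines are hand-written illustrations with hypothetical names; they are not KAP output, and the form KAP actually generates may differ.

```fortran
! A loop built from IF and GO TO, as often found in older codes.
subroutine sum_if_loop(n, x, s)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: x(n)
  real, intent(out) :: s
  integer :: i
  s = 0.0
  i = 1
10 if (i > n) go to 20
  s = s + x(i)
  i = i + 1
  go to 10
20 continue
end subroutine sum_if_loop

! The same computation expressed as a DO loop, which exposes the
! iteration count and index to later loop analysis.
subroutine sum_do_loop(n, x, s)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: x(n)
  real, intent(out) :: s
  integer :: i
  s = 0.0
  do i = 1, n
     s = s + x(i)
  end do
end subroutine sum_do_loop
```

Both routines visit the elements in the same order, so they return the same value of s for any input.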

6.3.9 !*$* unroll (<#it>[,<weight>])

The !*$* unroll directive tells KAP how to unroll, that is, replicate the text of, innermost loops. Outer loop unrolling is part of memory management.

The loops are unrolled according to a formula that counts the number of array references and arithmetic operations in the loop. KAP unrolls the loop until that value equals the <weight> parameter or the number of unrolled iterations reaches the <#it> parameter. The -unroll and -unroll2 command switches act like a global !*$* unroll directive.

Note

The -scalaropt level must be set at 2 or higher for this directive to be enabled.

The <#it> parameter is the maximum number of iterations to unroll. A value of 0 selects the default unrolling values, and a value of 1 means no unrolling. The <weight> parameter is the maximum weight allowed in an unrolled loop; the weight is estimated by counting operands and operators in the loop.

A scalar loop is unrolled until one of the limits is reached. See Chapter 5 and Chapter 9 for detailed examples.

The !*$* unroll directive is valid only for the loop before which it appears.
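
As a rough picture of what unrolling means, the sketch below pairs a loop carrying a !*$* unroll (4) request with a hand-unrolled equivalent. The second routine is an assumption about the general shape of a 4-way unrolled loop, not actual KAP output; the routine names are hypothetical.

```fortran
! Original loop, preceded by a request to unroll at most 4 iterations.
! To a compiler that does not recognize it, the directive is a comment.
subroutine scale_rolled(n, a, x, y)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: a
  real, intent(in) :: x(n)
  real, intent(out) :: y(n)
  integer :: i
!*$* unroll (4)
  do i = 1, n
     y(i) = a * x(i)
  end do
end subroutine scale_rolled

! Hand-written 4-way unrolled equivalent (illustrative, not KAP output).
subroutine scale_unrolled(n, a, x, y)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: a
  real, intent(in) :: x(n)
  real, intent(out) :: y(n)
  integer :: i, nmain
  ! Largest multiple of 4 that does not exceed n.
  nmain = n - mod(n, 4)
  ! Main loop: the body is replicated four times per trip.
  do i = 1, nmain, 4
     y(i)   = a * x(i)
     y(i+1) = a * x(i+1)
     y(i+2) = a * x(i+2)
     y(i+3) = a * x(i+3)
  end do
  ! Cleanup loop for the remaining 0 to 3 iterations.
  do i = nmain + 1, n
     y(i) = a * x(i)
  end do
end subroutine scale_unrolled
```

The unrolled form executes fewer loop-control operations per element, at the cost of a larger loop body and a cleanup loop.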

6.4 Parallel Processing Directives for Automatic Parallelization

The following sections explain directives available in KAP that affect KAP's automatic detection of candidate loops for parallelization.

6.4.1 !*$* [no]concurrentize

The !*$* concurrentize directive enables parallel execution of loops. The !*$* noconcurrentize directive disables parallel execution until the next !*$* concurrentize directive or until the beginning of the next program unit.

The !*$* [no]concurrentize directive overrides any -[no]concurrentize command-line switch. For example, if you specify the !*$* noconcurrentize directive, KAP disables parallel execution regardless of any -concurrentize command-line switch.

Two-version loops requiring conditional parallel execution may run more slowly than their scalar originals due to the evaluation of the condition. In these cases, you may prefer using either the -noconcurrentize command-line switch, if the program contains predominantly short loops, or the !*$* noconcurrentize directive for specific loops.

6.4.2 !*$* minconcurrent (0--999999)

The minconcurrent directive sets the parallel execution threshold for KAP. Each parallelizable DO loop with bounds unknown at compilation time becomes a two-version loop. At run time, KAP decides whether to execute the loop in parallel or in serial mode. The higher the minconcurrent value, the more iterations and/or statements the loop nest must have in order to run in parallel.
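
The idea of a two-version loop can be sketched by hand as follows. This is an assumption about the general shape only, not the code KAP generates: the routine name and the threshold argument are hypothetical, and the test here uses the iteration count alone, whereas KAP's threshold also takes the amount of work in the loop into account.

```fortran
! Hand-written sketch of a two-version loop (not KAP output).
! threshold is a hypothetical stand-in for the work limit below which
! parallel start-up cost is assumed to outweigh the benefit.
subroutine add_two_version(n, threshold, a, b, c)
  implicit none
  integer, intent(in) :: n, threshold
  real, intent(in) :: a(n), b(n)
  real, intent(out) :: c(n)
  integer :: i
  if (n >= threshold) then
     ! Parallel version: taken only when the loop is long enough.
!$omp parallel do
     do i = 1, n
        c(i) = a(i) + b(i)
     end do
!$omp end parallel do
  else
     ! Serial version: identical body, no parallel overhead.
     do i = 1, n
        c(i) = a(i) + b(i)
     end do
  end if
end subroutine add_two_version
```

Both branches compute the same result; only the execution mode differs. When the code is compiled without OpenMP support, the !$omp lines are treated as comments and both branches run serially.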

6.5 Inlining and IPA

The following sections explain the function of the inlining and IPA directives.

6.5.1 !*$* [no]inline [here|routine|global] [(name [,name...])]

See Section 6.5.2.

6.5.2 !*$* [no]ipa [here|routine|global] [(name [,name...])]

The !*$* inline and !*$* ipa directives allow you to manually select which call sites of which routines are to be inlined or analyzed, respectively. The NO forms select CALLs and function references that are not to be inlined/analyzed, regardless of any -inline or -ipa command switch.

These directives are ignored by default. They are enabled when you specify any inlining or IPA command switch, respectively, on the command line. The -inline_manual and -ipa_manual command switches are provided to enable these directives without activating the automatic inlining and analysis algorithms.

The optional scope parameter sets how much of the program the directive applies to. HERE means the next statement only, ROUTINE means the rest of the program unit, and GLOBAL means the entire source file. If none of these is given, the directive applies only to the next statement.

The optional names are names of the subroutines and functions to which the directive applies. If no list is provided, the directive applies to all subroutine CALLs and function references within the scope of the directive.

See Chapter 8 for more information.
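
The following sketch shows how the scope and name parameters might be combined. The directive spellings are assembled from the syntax given in the section headings and should be read as an assumption rather than verified KAP input; the routine names bump and caller are hypothetical. The directives take effect only if an inlining switch such as -inline_manual is given; otherwise they are comments.

```fortran
! A small routine that is a candidate for inlining.
subroutine bump(v)
  implicit none
  real, intent(inout) :: v
  v = v + 1.0
end subroutine bump

subroutine caller(a, b)
  implicit none
  real, intent(inout) :: a, b
  ! HERE scope: request inlining of bump at the next statement only.
!*$* inline here (bump)
  call bump(a)
  ! ROUTINE scope: no inlining of bump in the rest of this program unit.
!*$* noinline routine (bump)
  call bump(b)
end subroutine caller
```

Whether or not the calls are inlined, the program result is the same: each argument is incremented by 1.0.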

6.6 Assertions

The following section explains the function of the assertions directive.

6.6.1 !*$* [no]assertions

The !*$* assertions directive tells KAP to accept assertions. The !*$* no assertions directive tells KAP to ignore assertions. The !*$* no assertions directive disables assertions until the next !*$* assertions directive, or the end of the program unit.

Individual assertions are explained in Chapter 7.

6.7 Memory Management

The following sections explain the function of the memory management directive. This is an output directive that KAP uses to pass information on data layout to the compiler or to KAP itself, if the program is processed iteratively. If a program is processed by KAP multiple times, KAP will use the information in the directives it inserted in previous runs in its cache usage optimizations.

If the <var-list> is too long for a single line, it can be continued by putting [c]*$*& starting in column 1 of the continuation line.

Few users will need to insert or modify this directive by hand.

6.7.1 !*$* padding (var-list)

The padding directive identifies the listed arrays and scalar variables as objects that KAP created for the purpose of data alignment. (See the -aggressive command switch, Section 5.6.1.) This directive is for KAP to use when a program is being reprocessed; it will be ignored by the compiler.

The following rules govern the !*$* padding directive:

More than one padding directive may be used within a single program unit.
The !*$* padding directive(s) will be placed immediately after the PROGRAM statement or, if there is no PROGRAM statement, the directive(s) will be placed before the first statement of the program.
A padding object may be routine-local or in a COMMON block.
A padding object may not be in an EQUIVALENCE statement or a dummy argument to the procedure or function.

In the following example, the !*$* padding directive identifies arrays that KAP created to keep the arrays P, PI, PF, K, and Q from causing cache collisions:

      REAL FUNCTION EBREMS (ENRES)
!*$* padding ( DD4, DD3, DD2, DD1 )
      ...
      DOUBLE PRECISION DD1 (256), DD2 (251), DD3 (251), DD4 (251)
      ...
      COMMON /KINEM/PI, DD1, PF, DD2, P, DD3, K, DD4, Q

6.7.2 !*$* storage order (var-list)

The storage order directive specifies the relative order in which storage should be allocated for the listed routine-local variables and arrays. By appropriately positioning the arrays, cache collisions can be reduced. If the compiler does not interpret the storage order directive, a loss of performance results, but the program will nonetheless generate the correct results.

The rules governing the use of the !*$* storage order directive are the following:

More than one storage order directive may be used within a single program unit. Each directive may be interpreted separately.
The storage order directive(s) will be placed immediately after the PROGRAM statement or, if there is no PROGRAM statement, before the first statement of the program unit.
An object listed in a storage order must be local to the program unit.
An object listed in a storage order must not be:
- Mentioned in another storage order directive
- An element of a COMMON block
- A dummy argument to the procedure or function
Variables and arrays whose values are retained between procedure or function invocations (for instance, by being in SAVE statements) may not be included in the same storage order directive with variables and arrays that are not.
Only one object from an EQUIVALENCE class may be included in storage order directives.
KAP may generate as many !*$* storage order directives as it considers useful. Specifically, the number will not necessarily be limited to two (for SAVEd and non-SAVEd variables).
Specific implementations may place additional restrictions on the storage order directive.

To interpret a !*$* storage order directive, the compiler must place the named objects in memory in the order listed. This is the same order as they would be placed in a COMMON block. Thus, on a machine with 4 bytes per REAL variable:

!*$* storage order (A1,A2,A3)
      REAL A1(100), A2(3), A3(200)

A1 would be placed at some address (for example, address X)
A2 would be placed at X+100*4
A3 would be placed at X+100*4+3*4
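
The placement arithmetic above can be written out as a small helper. This is a sketch under the assumption that the objects are laid out contiguously, with no alignment padding between them; the routine name storage_offsets and its arguments are hypothetical.

```fortran
! Byte offset of each object from the start of the group, assuming the
! objects are placed contiguously in the listed order with no padding.
subroutine storage_offsets(nobj, nelem, bytes_per_elem, offset)
  implicit none
  integer, intent(in) :: nobj
  integer, intent(in) :: nelem(nobj)
  integer, intent(in) :: bytes_per_elem
  integer, intent(out) :: offset(nobj)
  integer :: k
  if (nobj < 1) return
  ! The first object sits at the base address X (offset 0).
  offset(1) = 0
  ! Each later object starts where the previous one ends.
  do k = 2, nobj
     offset(k) = offset(k-1) + nelem(k-1) * bytes_per_elem
  end do
end subroutine storage_offsets
```

For the element counts 100, 3, and 200 with 4 bytes per element, the offsets are 0, 400, and 412, matching X, X+100*4, and X+100*4+3*4.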

Both static and stack-based storage schemes are allowed, as long as all of the objects in a single storage order directive are placed in the same scheme.

Chapter 7
Assertions

KAP assertions enable the programmer to provide KAP with information about the program that would not normally be known at compilation time. Although many KAP users run the product without assertions, sometimes assertions can improve the optimization results. Use assertions only where speed is essential and you understand the application program well.

KAP does not guarantee that an assertion will have an effect. The information provided by the assertion will be noted, and if that information will help, it will be used.

To understand the process KAP uses in interpreting assertions, it is necessary to understand assumed dependencies. In the following loop, where X is an array, n and m are scalars, and nothing is known about the relationship between n and m, there are two types of dependencies:

      DO 10 i=1,n
   10 X(i) = X(i-1) + X(m)

Between X(i) and X(i-1) there is a FORWARD dependence, and the distance is known to be one. Between X(i) and X(m), KAP tries to find a relation, but cannot, because it does not know the value of m in relation to n. The second dependence is called an ASSUMED dependence, because it is assumed but cannot be proven to exist.
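
Why the assumed dependence matters can be shown with a hand-written sketch. The first routine is the loop from the text; the second reads X(m) once before the loop, which is safe only if the loop never overwrites X(m) before a later iteration reads it (for example, when m is greater than n). The routine names and the nmax argument are hypothetical, and the second routine is not a claim about what KAP generates.

```fortran
! The loop from the text: X(m) may or may not be rewritten by the loop.
subroutine recur_original(n, m, nmax, x)
  implicit none
  integer, intent(in) :: n, m, nmax
  real, intent(inout) :: x(0:nmax)
  integer :: i
  do i = 1, n
     x(i) = x(i-1) + x(m)
  end do
end subroutine recur_original

! Variant that treats X(m) as loop-invariant. It matches the original
! only when the assumed dependence does not actually occur.
subroutine recur_hoisted(n, m, nmax, x)
  implicit none
  integer, intent(in) :: n, m, nmax
  real, intent(inout) :: x(0:nmax)
  real :: xm
  integer :: i
  xm = x(m)
  do i = 1, n
     x(i) = x(i-1) + xm
  end do
end subroutine recur_hoisted
```

With n = 4 and every element initially 1.0, both routines leave x(4) = 5.0 when m = 6. When m = 2, the original loop gives x(4) = 9.0 and the hoisted variant gives 5.0, so the two are no longer equivalent. A relation such as m .gt. n is the kind of fact that a !*$* assert relation assertion can supply when it is known to hold.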

Assertions can be unsafe, because KAP cannot check the correctness of the information provided. If you specify an incorrect assertion, the KAP-generated code may give very different results from the original program. If unsafe assertions are the suspected cause of a misbehaving program, all assertions can be ignored (treated as comments) by using the -directives command switch without the a switch, or by using the !*$* no assertions directive.

As with directives, an assertion placed before any comments or statements in the program is treated as a global assertion. That is, it is treated as if it were repeated at the top of each program unit in the file. Some assertions, for example, !*$* assert relation or !*$* assert permutation, include variable names. If these are specified as global assertions, the assertion is used in a program only when those variable names appear in common blocks or are dummy argument names to the subprogram. Global assertions cannot be used to make relational or permutation assertions about variables that are local to a subprogram.

Many assertions, like directives, are active until the end of the program or until overridden by another assertion. Other assertions are in effect only for the DO loops before which they appear. This type of assertion would apply to the next DO loop, but not to its nested loops. Other assertions are active within a program unit, regardless of where they appear in that program unit.

You can apply assertions collectively to a series of loops and arrays by enclosing them in a directive block. Because KAP treats the directive block as one loop, the assertions you want to be active on the loops inside the block must immediately precede the !*$* beginblock directive. Assertions immediately preceding directive blocks override previously set directives and assertions for the duration of the block.

If you want to use different assertions for individual loops and arrays, enclose each loop or array in its own directive block.

7.1 KAP Assertions

Table 7-1 lists KAP assertions and their duration:

Table 7-1 KAP Assertions
Assertion	Section	Duration
!*$* assert [no]argument aliasing	Section 7.2.1	until reset
!*$* assert [no]bounds violations	Section 7.2.2	until reset
!*$* assert do (concurrent)	Section 7.3.2	next loop
!*$* assert do (concurrent call)	Section 7.3.3	next loop
!*$* assert do (serial)	Section 7.3.4	see text
!*$* assert do prefer (concurrent)	Section 7.3.5	next loop
!*$* assert do prefer (serial)	Section 7.3.6	next loop
!*$* assert [no]equivalence hazard	Section 7.2.3	until reset
!*$* assert [no]last value needed	Section 7.2.4	until reset
!*$* assert permutation ( <name> )	Section 7.2.5	next loop
!*$* assert no recurrence ( <name> )	Section 7.2.6	next loop
!*$* assert relation (<name>.xx.<variable/constant>)	Section 7.2.7	next loop
!*$* assert no sync	Section 7.2.8	next loop
!*$* assert [no] temporaries for constant arguments	Section 7.2.9	until reset

For the assertions listed with (name), (name .xx. variable), and (name .xx. constant), the following example illustrates the format of the information required:

!*$* assert permutation (ip)
!*$* assert relation (n .gt. m)
!*$* assert relation (n .gt. 0)

7.2 Descriptions

The following sections describe each of these assertions.

7.2.1 !*$* assert [no]argument aliasing

The !*$* assert no argument aliasing assertion allows KAP to make assumptions about subprogram arguments in a program. According to the Fortran 77 standard, multiple-aliasing of variables is allowed only if no aliases are modified. In the following code example, the subroutine violates the standard, because variable A is multiple-aliased in the subroutine through C and D, and variable X is multiple-aliased through X and E:

      COMMON X,Y
      REAL A,B
      CALL SUB ( A, A, X )
      ...
      SUBROUTINE SUB ( C, D, E )
      COMMON X, Y
      X = ...
      C = ...
      ...

If multiple-aliasing is used in a program, the !*$* assert argument aliasing assertion should be used. The command switch -assume=a acts like a global !*$* assert argument aliasing assertion. An argument aliasing assertion is active until reset, or until the end of the program unit.

7.2.2 !*$* assert [no]bounds violations

The !*$* assert bounds violations assertion indicates that array subscript bounds may be violated during execution. If the user has not violated array subscript bounds, this assertion should not be used. A bounds violations assertion is active until reset or until the end of the program. For formal parameters, KAP treats a declared last dimension of (1) the same as (*).

The -assume=b command switch acts like a global !*$* assert bounds violations assertion.

In the following example, the first loop nest is assumed to be standard-conforming, so the loops can both be optimized. The loops can be interchanged to improve memory referencing, because no A(I,J) will overwrite an A(I',J+1). In the second nest, the assertion warns KAP that the loop limit of the first array index (I) may violate the declared array bounds. KAP is cautious and optimizes only the right array index.

      DO 100 I = 1,M
      DO 100 J = 1,N
      A(I,J) = A(I,J) + B (I,J)
  100 CONTINUE
C
!*$*ASSERT BOUNDS VIOLATIONS
      DO 200 I = 1,M
      DO 200 J = 1,N
      A(I,J) = A(I,J) + B (I,J)
  200 CONTINUE

Becomes:

      DO 2 J=1,N
      DO 2 I=1,M
      A(I,J) = A(I,J) + B (I,J)
    2 CONTINUE
C
!*$* ASSERTBOUNDSVIOLATIONS
      DO 4 I=1,M
      DO 3 J=1,N
      A(I,J) = A(I,J) + B (I,J)
    3 CONTINUE
    4 CONTINUE

Note

KAP always assumes that array references stay within the array itself, so references that vary only in the rightmost index are always safe to transform.
