Compaq KAP Fortran/OpenMP
for Tru64 UNIX
User Guide

7.2.3 !*$* assert [no]equivalence hazard

The !*$* assert noequivalence hazard assertion tells KAP that equivalenced variables will not be used to refer to the same memory location inside one DO loop nest. Normally, equivalence statements allow different variable names to refer to the same storage location. The -assume=e command switch acts like a global !*$* assert equivalence hazard assertion. The equivalence hazard assertion is active until reset or until the end of the program.

In the following example, if arrays E and F are equivalenced, but you know that the overlapping sections will not be referenced in this loop, using !*$* assert noequivalence hazard allows KAP to optimize the loop:

    EQUIVALENCE ( E(1), F(101) )
    !*$* ASSERT NO EQUIVALENCE HAZARD
    DO 10 I = 1,N
       E(I+1) = B(I)
       C(I) = F(I)
    10 CONTINUE

7.2.4 !*$* assert [no]last value needed

Frequently when a scalar is assigned in a loop that is optimized, KAP uses a temporary variable within the optimized loop and assigns the last value to the original scalar if KAP believes that the scalar may be reused before it is assigned again. The !*$* assert nolast value needed assertion lets KAP assume that such last-value assignments are unnecessary. This assertion is active until reset or until the end of the program.

The -assume=l command switch acts like a global !*$* assert last value needed assertion.

At -optimize=2 and higher, KAP performs variable lifetime analysis to determine when last-value assignments are unnecessary. The !*$* assert nolast value needed assertion can be used on a loop-by-loop basis when you do not want to eliminate last-value assignments globally.
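As an illustration, the following minimal sketch shows a loop where the last value of a scalar is not needed. The program name, the array names, and the scalar T are hypothetical, and the directive spelling follows the form used in this section. A compiler that does not recognize the directive treats it as a comment.

```fortran
program last_value_demo
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n), t
  integer :: i

  b = (/ (real(i), i = 1, n) /)

  ! T is a scratch scalar: it is assigned in every iteration and never
  ! read after the loop, so its final value is not needed.
  !*$* ASSERT NOLAST VALUE NEEDED
  do i = 1, n
     t = 2.0 * b(i)
     a(i) = t + 1.0
  end do

  print *, a
end program last_value_demo
```

If T were read after the loop, the assertion would be incorrect for this code.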

7.2.5 !*$* assert permutation

The !*$* assert permutation assertion provides KAP with sufficient information to allow KAP to generate optimized code for certain types of indirect addressing (INTEGER arrays used in subscripts). This assertion requires -optimize=4 or higher. The following example shows when it is unsafe to optimize a DO loop:

    DO 100 I = 1,N
       A(IP(I)) = A(IP(I)) + B(I)
    100 CONTINUE

KAP cannot safely generate optimized code because it cannot tell whether any values in the index array IP are repeated. If all values in IP are distinct, the optimized code is correct; if some of the values are the same, the optimized code may be unsafe. You can tell KAP that the values in the index array IP are all distinct with the following assertion:

    !*$* ASSERT PERMUTATION ( IP )
    DO 100 I=1,N
       A(IP(I)) = A(IP(I)) + B(I)
    100 CONTINUE

With the addition of this assertion, KAP knows that it can safely generate optimized code for this loop. If at run time the values of IP are not distinct, the optimized code may generate incorrect results.
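Because KAP cannot verify the assertion, it can be worth checking the index array during testing. The following is a minimal sketch of such a check, not part of KAP. The helper ALL_DISTINCT and the sample data are hypothetical, and the helper additionally assumes that valid index values lie in the range 1 to N.

```fortran
program check_permutation
  implicit none
  integer, parameter :: n = 5
  integer :: ip(n), i
  real :: a(n), b(n)

  ip = (/ 3, 1, 5, 2, 4 /)
  a = 0.0
  b = 1.0

  ! Refuse to run the asserted loop if the index array has repeats.
  if (.not. all_distinct(ip, n)) then
     print *, 'IP has repeated values; do not assert PERMUTATION'
     stop
  end if

  !*$* ASSERT PERMUTATION ( IP )
  do i = 1, n
     a(ip(i)) = a(ip(i)) + b(i)
  end do
  print *, a

contains

  ! Hypothetical helper: .true. when every value in IDX(1:M) lies in
  ! 1..M and appears exactly once.
  logical function all_distinct(idx, m)
    integer, intent(in) :: m
    integer, intent(in) :: idx(m)
    logical :: seen(m)
    integer :: k
    seen = .false.
    all_distinct = .true.
    do k = 1, m
       if (idx(k) < 1 .or. idx(k) > m) then
          all_distinct = .false.
          return
       end if
       if (seen(idx(k))) then
          all_distinct = .false.
          return
       end if
       seen(idx(k)) = .true.
    end do
  end function all_distinct
end program check_permutation
```

The check costs one pass over IP, so it is probably best kept to debug builds.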

7.2.6 !*$* assert no recurrence

The !*$* assert no recurrence assertion asks KAP to ignore ALL data dependence conflicts due to the named variable in the following DO loop. KAP makes the final decision whether or not to ignore a data dependence conflict. The following example asks KAP to ignore all dependence arcs caused by the variable X in the loop:

    !*$* ASSERT NO RECURRENCE ( X )
    DO 10 I=1,M,5
    10 X(K) = X(K) + X(I)

KAP ignores not only the assumed dependence, but also the real dependence caused by X(K) appearing on both sides of the assignment.

This assertion only applies to the next DO loop. It cannot be specified as a global assertion.

7.2.7 !*$* assert relation ( <name> .XX. <variable/constant> )

The !*$* assert relation (<name> .XX. <variable/constant>) assertion, where XX is a relational operator such as GT, LT, or LE, indicates the relationship between two variables or between a variable and a (possibly signed) constant. When attempting to optimize a loop, KAP occasionally asks about such relationships, as in the following example:

    DO 100 I = 4,N
       A(I) = A(I+M) + B(I)
    100 CONTINUE

KAP generates this question in the source code section of the listing file:

Is "M .GE. N" in this loop?

If M will always be greater than N at this point in the program, you can relay this information to KAP using an assertion.
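A sketch of how such an assertion might be placed follows. The directive text is an assumption that fills in the syntax from the section heading with the relation described above. The subroutine name SHIFT_ADD, the driver program, and the array sizes are hypothetical.

```fortran
program relation_demo
  implicit none
  integer, parameter :: n = 6, m = 7     ! M is greater than N, as asserted
  real :: a(n + m), b(n)
  integer :: i

  a = (/ (real(i), i = 1, n + m) /)
  b = 0.0
  call shift_add(a, b, n, m)
  print *, a(1:n)
end program relation_demo

subroutine shift_add(a, b, n, m)
  implicit none
  integer, intent(in) :: n, m
  real, intent(inout) :: a(n + m)
  real, intent(in) :: b(n)
  integer :: i

  ! Assumed directive form: states that M exceeds the loop upper bound,
  ! so A(I+M) never refers to an element that the loop writes.
  !*$* ASSERT RELATION ( M .GT. N )
  do 100 i = 4, n
     a(i) = a(i + m) + b(i)
100 continue
end subroutine shift_add
```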

Again, note that KAP cannot check the validity of assertions. With the assertion that M is greater than the loop upper bound, KAP can generate optimized code for this loop. If at run time M turned out to be less than N, the results produced by the optimized code would probably differ from the results produced by the original loop.

This assertion is in effect only for the DO loop before which it appears.

7.2.8 !*$* assert no sync

If KAP has optimized a loop and you see that it was overly cautious and added unnecessary synchronization code, as in the following example, the !*$* assert no sync assertion may improve run-time performance.

    DO 20 I = 1,N
       A(100 + I) = I
       C(100 + I) = A(I)
       D(100 + I) = C(I)
    20 CONTINUE

KAP standard optimization synchronizes the loop by breaking it up to ensure that the assignments occur in a valid order. However, this increases DO loop overhead, as follows:

    DO 2 I=1,N
       A(I+100) = I
    2 CONTINUE
    DO 3 I=1,N
       C(I+100) = A(I)
    3 CONTINUE
    DO 4 I=1,N
       D(I+100) = C(I)
    4 CONTINUE

If the value of N will always be less than or equal to 100, you can eliminate the synchronization overhead by adding !*$* assert no sync as follows:

    !*$* ASSERT NO SYNC
    DO 20 I=1,N
       A(I+100) = I
       C(I+100) = A(I)
       D(I+100) = C(I)
    20 CONTINUE
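The condition on N can be checked with a small experiment. The sketch below is an illustration only, not KAP output. It runs the loop body forward and backward, using the reversed order as a stand-in for an arbitrary parallel ordering, and compares the results. The helper ORDER_INDEPENDENT and the initial array values are hypothetical.

```fortran
program no_sync_check
  implicit none
  print *, order_independent(100)   ! N <= 100
  print *, order_independent(150)   ! N > 100
contains
  ! Hypothetical helper: runs the loop body in forward and reverse
  ! iteration order and reports whether both orders give the same arrays.
  logical function order_independent(n)
    integer, intent(in) :: n
    real :: af(100+n), cf(100+n), df(100+n)   ! forward-order results
    real :: ab(100+n), cb(100+n), db(100+n)   ! reverse-order results
    integer :: i

    af = -1.0; cf = -2.0; df = -3.0
    ab = af;   cb = cf;   db = df

    do i = 1, n
       af(i+100) = i
       cf(i+100) = af(i)
       df(i+100) = cf(i)
    end do

    do i = n, 1, -1
       ab(i+100) = i
       cb(i+100) = ab(i)
       db(i+100) = cb(i)
    end do

    order_independent = all(af == ab) .and. all(cf == cb) .and. all(df == db)
  end function order_independent
end program no_sync_check
```

With N = 100 the elements read (1 through 100) and the elements written (101 through 200) do not overlap, so the two orders agree and the first line prints T. With N = 150 an iteration can read an element that another iteration writes, so the second line prints F.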

7.2.9 !*$* assert [no] temporaries for constant arguments

Sometimes, KAP transformations are disabled when KAP is not sure about their effect on the rest of the program. For example, one possible transformation would turn:

    SUBROUTINE X(I,N)
    IF (I .LT. N) I = N
    END

Into:

    SUBROUTINE X(I,N)
    I = MAX(I,N)
    END

But, if the actual parameter for I were a constant, CALL X(1,N), it would appear that the value of the constant 1 was being reassigned. (In some older versions of Fortran, the values of constants could be changed in this way.) Without additional information, KAP is cautious and performs no argument-changing transformations within the subroutine.

Most compilers automatically put constant actual arguments into temporary variables to protect against this case. The assertion !*$* assert temporaries for constant arguments, or the command switch -assume=c (the default), informs KAP that constant parameters are protected. The assertion !*$* assert no temporaries for constant arguments directs KAP to avoid transformations that might change the values of constant parameters.
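The protection described above amounts to the following hand-written equivalent, shown as a minimal sketch. The variable ITMP and the driver program are hypothetical; a compiler that creates such temporaries does so without any source change.

```fortran
program const_arg_demo
  implicit none
  integer :: n, itmp

  n = 5
  ! Copy the constant into a temporary so that the callee modifies ITMP
  ! rather than the storage holding the literal 1.
  itmp = 1
  call x(itmp, n)
  print *, itmp
end program const_arg_demo

subroutine x(i, n)
  implicit none
  integer :: i, n
  if (i .lt. n) i = n
end subroutine x
```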

7.3 Parallel Processing Assertions that Guide Automatic Parallelization

The following sections describe assertions available in the multiprocessor version of KAP.

7.3.1 !*$* assert concurrent call

The !*$* assert concurrent call assertion tells KAP the subroutine calls and function references in the loop immediately following this assertion can execute in parallel. It causes KAP to ignore all potential data dependencies due to subroutine arguments. This assertion does not apply to nested or surrounding loops.

The following code example shows how !*$* assert concurrent call works:

    !*$* ASSERT CONCURRENT CALL
    DO 10 I=..
       ...
       CALL S1
       ...
    10 CONTINUE

You can use !*$* assert concurrent call to override default preprocessor action. In the following code example, !*$* assert concurrent call tells the compiler it is safe to make a parallel call to TEST. Another assertion, !*$* assert do prefer (concurrent), tells the compiler to "prefer" a reordered loop nesting that causes parallel execution over the J loop.

    !*$* ASSERT CONCURRENT CALL
    !*$* ASSERT DO PREFER (CONCURRENT)
    DO J = 1,N,64
       DO K = 1,N,64
          DO I = 1,N,64
             CALL TEST('N','N', &
                MIN0(64,N-I+1),MIN0(64,N-J+1),MIN0(64,N-K+1), &
                1.0D0, &
                A(I,K),N+1, &
                B(K,J),N+1, &
                1.0D0, &
                C(I,J),N+1 &
                )
          ENDDO
       ENDDO
    ENDDO

Caution

The !*$* assert concurrent call assertion tells KAP to assume that all external subroutine/function calls are thread-reentrant. This will override KAP default behavior, which is to assume that all external subroutines/functions are NOT thread-reentrant or thread-safe. If the external subroutine/functions are not thread-safe, and you use !*$* assert concurrent call, your program may not execute correctly. For example, local variables in subroutines are thread-safe only if they are stored as thread-specific data. See the Guide to DECthreads for further information on thread-safe subroutines.

For further information about !*$* assert do prefer (concurrent), see Section 7.3.5.

KAP does not generate parallel code if you use the -noconcurrent command switch.

7.3.2 !*$* assert do (concurrent)

The !*$* assert do (concurrent) assertion tells KAP to prefer to ignore assumed dependencies and to execute the DO loop immediately following this assertion in parallel.

This assertion says nothing about the concurrency threshold for the loop, and it does not override real dependencies: KAP continues to honor the dependencies it finds. For example, in the following loop KAP ignores !*$* assert do (concurrent) because there is a dependence on A:

    !*$* ASSERT DO (CONCURRENT)
    DO 100 I = 4, N
       A(I) = A(I-4) + B(I)
    100 CONTINUE

Strictly speaking, the loop could be parallelized by having KAP put the entire loop body inside a critical section. However, the parallelized loop would run more slowly than the original serial version.

The following code example shows how !*$* assert do (concurrent) tells KAP to ignore an assumed dependence, the I+M array index, and to make the stride-1 loop concurrent:

    PROGRAM TEST
    REAL A(100,100)
    !*$* ASSERT DO (CONCURRENT)
    DO 11 I = 1,N
       DO 12 J = 1,N
          A(I,J) = A(I+M,J)
    12 CONTINUE
    11 CONTINUE
    END

Using the assertion and processing with -unroll=1 and -conc, KAP generates the following code:

    PROGRAM TEST
    REAL A(100,100)
    SAVE M, N
    EXTERNAL PKTEST0
    INTEGER II5, II4
    PARAMETER (II5 = 1, II4 = 3)
    !*$* ASSERT DO( CONCURRENT )
    CALL mppfrk (PKTEST0,II4,N,A,M)
    CALL mppend
    END

    SUBROUTINE PKTEST0 (MPPID, MPPNPR, N, A, M )
    AUTOMATIC II3, II2, II1, J, I
    INTEGER II3, II2, II1, J, I, MPPNPR, MPPID, M, N
    REAL A(100,100)
    INTEGER II5, II4
    PARAMETER (II5 = 1, II4 = 3)
    II3 = (N - 1) / MPPNPR + II5
    II1 = (MPPID * II3) + 1
    II2 = MIN0 (N, II1 + II3 - II5)
    DO 2 I=II1,II2,II5
       DO 2 J=1,N
          A(I,J) = A(I+M,J)
    2 CONTINUE
    END

Another example of when to use !*$* assert do (concurrent) follows. The DO loop can execute safely in parallel, provided that the value of M is greater than the value of N. Preceding the loop with !*$* assert do (concurrent) allows KAP to execute it in parallel.

    !*$* ASSERT DO (CONCURRENT)
    DO I = 1,N
       X(I) = X(I+M)
    ENDDO

KAP does not generate parallel code if you use the -noconcurrent command-line switch.

7.3.3 !*$* assert do (concurrent call)

The !*$* assert do (concurrent call) assertion tells KAP to apply both the !*$* assert do (concurrent) and the !*$* assert concurrent call assertions to the immediately following loop.

7.3.4 !*$* assert do (serial)

The !*$* assert do (serial) assertion forces the loop immediately following this assertion to be serial. Additionally, it restricts optimization by forcing all enclosing loops to be serial. Inner loops and other loops inside the same enclosing loop nest, but not enclosing the serial loop, may be optimized. The following code example shows how !*$* assert do (serial) works:

    DO 100 I = 1, N
       DO 100 J = 1, N
    !*$* ASSERT DO (SERIAL)
          DO 200 K = 1, N
             X(I,J,K) = X(I,J,K) * Y(I,J)
    200   CONTINUE
          DO 300 K = 1, N
             X(I,J,K) = X(I,J,K) + Z(I,K)
    300   CONTINUE
    100 CONTINUE

The assertion forces the DO 100 I loop, the DO 100 J loop, and the DO 200 K loop to be serial. The DO 300 K loop can still be parallelized. In this case, KAP does NOT distribute the I or J loops to try to get a larger optimizable loop.

The following code shows an example of when to use !*$* assert do (serial). The loop must not be executed in parallel because X and Y are equivalenced and overlap in memory. Using !*$* assert do (serial) prevents parallel execution of the loop.

    !*$* ASSERT DO (SERIAL)
    DO I = 1,N
       X(I) = Y(I)
    ENDDO
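To see why the iteration order matters here, consider the following minimal sketch. The particular EQUIVALENCE offset, the array size, and the initial values are assumptions chosen for illustration; the text above does not say how X and Y overlap.

```fortran
program equiv_overlap
  implicit none
  integer, parameter :: n = 5
  real :: x(n), y(n)
  integer :: i
  ! Assumed overlap: Y(I) shares storage with X(I-1) for I = 2..N.
  equivalence (x(1), y(2))

  y(1) = 9.0
  x = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)

  ! Each iteration reads the element that the previous iteration wrote,
  ! so the loop carries a true dependence from one iteration to the next.
  !*$* ASSERT DO (SERIAL)
  do i = 1, n
     x(i) = y(i)
  end do
  print *, x
end program equiv_overlap
```

Executed serially, the loop propagates the value of Y(1) through every element of X. If the iterations ran independently on the original values, only X(1) would receive that value.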

See also the !*$* assert do prefer (concurrent) assertion, Section 7.3.5, and !*$* assert do prefer (serial), Section 7.3.6.

7.3.5 !*$* assert do prefer (concurrent)

The !*$* assert do prefer (concurrent) assertion tells KAP to prefer parallel ordering for the DO loop immediately following this assertion. The following code example shows how the assertion works:

    !*$* ASSERT DO PREFER (CONCURRENT)
    DO 10 I = 1, M
       DO 10 J = 1, N
    10 X(I,J) = X(I,J) + Y(I,J)

The assertion tells KAP to prefer any ordering where the DO 10 I loop is parallel.

The !*$* assert do prefer (concurrent) assertion says nothing about assumed dependencies. If KAP finds dependencies, it does not perform the optimization.

This assertion is valid only for the DO loop it precedes. The -noconcurrent switch disables the generation of parallel code.

7.3.6 !*$* assert do prefer (serial)

The !*$* assert do prefer (serial) assertion tells KAP to prefer serial ordering for the DO loop immediately following this assertion. Unlike !*$* assert do (serial), assert do prefer (serial) does not inhibit optimization of outer loops.

The following code example shows how !*$* assert do prefer (serial) works:

    DO 100 I = 1, N
       DO 100 J = 1, N
    !*$* ASSERT DO PREFER (SERIAL)
          DO 200 K = 1, N
             X(I,J,K) = X(I,J,K) * Y(I,J)
    200   CONTINUE
          DO 300 K = 1, N
             X(I,J,K) = X(I,J,K) + Z(I,K)
    300   CONTINUE
    100 CONTINUE

The !*$* assert do prefer (serial) assertion allows optimization over the whole loop nest, while trying to keep the DO 200 K loop serial. Compare this example with the !*$* assert do (serial) example to see how the two assertions produce different results with identical loop structures.

Chapter 8
Inlining and IPA

This chapter presents additional information about the KAP command switches and inline directives used to inline subroutines and functions, or to perform Interprocedural Analysis (IPA).

Inlining is the process of replacing a subroutine CALL or function reference with the text of the routine. This eliminates the overhead of the call, and can assist other optimizations by making relationships between arguments, returned values, and the surrounding code easier to find.

IPA is the process of inspecting called subroutines and functions for information on relationships between arguments, returned values, and global data. IPA can provide many of the benefits of inlining, but without replacing the CALL or function reference.

The rest of this chapter covers the inlining and IPA command switches and directives, related command switches, examples of their use, and information about program constructs that inhibit inlining. Inlining and IPA are symmetrical from the command-line standpoint: there are parallel sets of commands and directives for them. In many places in this chapter the term "inlining" applies to both inlining and IPA.

8.1 Inlining and IPA Command Switches

There are two phases to inlining: defining the universe of inlinable routines, and selecting which routines in that universe to inline or analyze. The -inline_from... and -ipa_from... switches define the universe of inlinable routines. The -inline, -ipa, -..._looplevel, and -inline_depth switches select which of the available routines are to be inlined or analyzed. The -inline_create and -ipa_create switches set up collections of routines for inclusion in later KAP runs.

All of the inlining and IPA command switches are listed in the following sections. The short forms of their names are in brackets.

Note

Many of these switches have arguments that are lists of routine names or file names. The elements of these lists may be separated by either commas or colons. Multiple element lists must be enclosed in parentheses.

8.1.1 inline_from/ipa_from Switches

There are four switches, as follows:

    -inline_from_files=<list>      [-inff]
    -inline_from_libraries=<list>  [-infl]
    -ipa_from_files=<list>         [-ipaff]
    -ipa_from_libraries=<list>     [-ipafl]

Where <list> is one or more of the following, separated by commas: source file name, library file name, or directory. The default is the current source file.

You can distinguish different types of files by their extensions. For example, -inline_from_files=xj.f,yy.f,.../mrtn would look for routines in the Fortran 90 source files xj.f and yy.f, and in Fortran 90 source files in the directory .../mrtn. Including the directory .../mrtn in the -..._from_files switches can be thought of as shorthand for the notation .../mrtn/*.f. Do not use wildcard characters in a -inline_from_... list.

The -..._libraries versions of these switches take as their arguments lists of subprogram libraries and directories containing such libraries.

KAP recognizes the type of file from its extension, or lack of one, as follows:

    Extension    File type
    .f, .ftn     Fortran 90 source
    .klib        Library created by -inline_create or -ipa_create (see Section 8.1.2)
    other        Directory

If multiple -inline_from... [-ipa_from...] switches are given, their lists are concatenated to get a bigger universe.

Routine name references are resolved by searching the files in the order in which they appear in the -inline_from... or -ipa_from... switches on the command line. Libraries are searched in their original lexical order. Multiple -inline_from... or -ipa_from... lists are searched in the order in which they appear on the command line.

8.1.2 Library Creation

Use the following switches to create a preprocessed library:

    -inline_create=<library name>  [-incr]
    -ipa_create=<library name>     [-ipacr]

To specify an existing library file to inline from, use the -inline_from_libraries= or the -ipa_from_libraries= switch.

The default source for routines to put into the library is the current source file. If -inline_from... or -ipa_from... is specified, the routines in the listed files are the ones put into the library. This provides a method to combine or expand libraries: include the old library or libraries in an -inline_from_libraries or -ipa_from_libraries switch, along with an -inline_from_files or -ipa_from_files switch giving source files containing any new subroutines and functions.

Routines are included in libraries in the order in which they appear in the input file(s). This is to make sure that if multiple routines with the same name are in the same source file, the one chosen for inlining will be the one that you expect from the algorithm under the -inline_from... switch.

A library created with -inline_create will work for inlining or IPA, because it is just partially reduced source code, but a library made with -ipa_create may not appear in an -inline_from= list. If it does, it is flagged with a warning message.

If no library name is given, the name used is file.klib, where file is the input file name with any trailing .f, .for, or .ftn stripped off.

When creating a library, only one -inline_create (-ipa_create) switch may be given. That is, only one library may be created per KAP run. If the library file existed prior to running KAP, it is overwritten.

When -inline_create (-ipa_create) is specified on the command line, no transformed code file will be generated.

See the description of the -inline_from_libraries and -ipa_from_libraries switches for information about using libraries created with these switches.

If no -inline (-ipa) switch is given, the default will be to include all the routines from the inlining universe in the library, if possible. If -inline=<name list> or -ipa=<name list> is specified, only the named routines will be included in the library. See Section 8.5 for a list of conditions that can prevent a routine from being inlined.

An example of inlining from the library created previously is included in Section 8.2.
