Compaq KAP Fortran/OpenMP
for Tru64 UNIX
User Guide



5.1.8 -verbose, -v, (-nov)

This switch prints the passes as they execute, with their arguments and their input and output files. It also prints final resource usage in the C-shell time format.

5.2 General Optimization Switches for kapf90

The following sections explain the function of each general optimization switch.

5.2.1 -interchange, -nointerchange, (-interchange)

Use the -interchange switch to enable loop interchanging. KAP enables loop interchange when -interchange is specified and the -optimize switch level is at least 1 or the -scalaropt switch level is 3. If you specify -nointerchange, KAP disables loop interchange regardless of the -optimize or -scalaropt switch settings. Loop interchanging is enabled by default.
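
As a rough illustration of what loop interchanging means, the following hand-written sketch shows a loop nest before and after its two loops are swapped. It is not KAP output; the program name INTERCH and the arrays A and B are hypothetical, and the transformation KAP actually applies to a given nest may differ.

```fortran
      PROGRAM INTERCH
! Illustrative only: a hand-written before/after pair.
      INTEGER N
      PARAMETER (N = 4)
      REAL A(N,N), B(N,N)
      INTEGER I, J
! Original nest: the inner loop runs over J, so successive
! references to A(I,J) are N elements apart in memory.
      DO I = 1, N
         DO J = 1, N
            A(I,J) = REAL(I) + 10.0*REAL(J)
         END DO
      END DO
! Interchanged nest: the inner loop runs over I, so successive
! references to B(I,J) are adjacent (Fortran is column-major).
      DO J = 1, N
         DO I = 1, N
            B(I,J) = REAL(I) + 10.0*REAL(J)
         END DO
      END DO
! Both nests assign the same values; only the order differs.
      IF (ALL(A .EQ. B)) THEN
         PRINT *, 'same results'
      ELSE
         PRINT *, 'results differ'
      END IF
      END PROGRAM INTERCH
```

Because no iteration of this nest depends on another, the two orderings produce identical arrays; the interchanged form simply walks memory with unit stride.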

5.2.2 -namepartitioning, -namepart, -nnamepart, (-nonamepartitioning)

This switch tells KAP to look at distinct array names and limit the number of arrays that appear in a loop to avoid cache thrashing. That is, this switch breaks a loop containing, for example, references to arrays A and B into two loops. One loop references array A and the other loop references array B.
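
The split described above can be pictured with the following hand-written sketch. It is not KAP output; the program name NAMEPT and the copies A2 and B2 are hypothetical and exist only so that the two forms can be compared.

```fortran
      PROGRAM NAMEPT
! Illustrative only: hand-written loops.
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N), B(N), A2(N), B2(N)
      INTEGER I
! Original loop: references both A and B in one body.
      DO I = 1, N
         A(I) = 2.0*REAL(I)
         B(I) = 3.0*REAL(I)
      END DO
! Partitioned form: one loop per array name.
      DO I = 1, N
         A2(I) = 2.0*REAL(I)
      END DO
      DO I = 1, N
         B2(I) = 3.0*REAL(I)
      END DO
! The partitioned loops assign the same values as the original.
      IF (ALL(A .EQ. A2) .AND. ALL(B .EQ. B2)) THEN
         PRINT *, 'same results'
      ELSE
         PRINT *, 'results differ'
      END IF
      END PROGRAM NAMEPT
```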

Two arguments (i and j), used in a -namepartitioning=i,j switch, control name partitioning as follows:

If no arguments appear with the -namepartitioning switch, KAP uses its default values of 2 for the minimum and 8 for the maximum number of partitions.

Before KAP can perform name partitioning, you must specify the switch -scalaropt=n, where n is greater than or equal to 3.

The -nonamepartitioning switch explicitly prevents name partitioning.

5.2.3 -optimize, -o, (-optimize=5)

The -optimize switch sets the base optimization and code analysis level, ranging from 0 (no optimization) to 5 (maximum optimization). The optimization level can also be modified on a loop-by-loop basis by the !*$* optimize (<integer>) directive. Some of the code analysis techniques can be enabled with the -scalaropt switch.

The meaning of each optimization level is as follows. The levels are cumulative; for example, level 4 performs what is listed below for that level, in addition to what is listed for levels 0-3.

A higher optimization level results in more optimization, more analysis, and more ambitious transformations, along with increased compilation time.

5.2.4 -recursion, -rec, (-norecursion), -norec

The -recursion switch informs KAP that subroutines and functions in the source program may be called recursively (that is, a routine may call itself or call another routine that calls it). This affects storage allocation decisions and the interpretation of the -save option. The -recursion switch must be in force in each recursive routine that kapf90 processes, or unsafe transformations could result.

The -norecursion option tells KAP to assume that recursion is not used in the program being processed.

5.2.5 -roundoff, -r, (-roundoff=3)

The -roundoff switch allows you to specify the change from serial roundoff error that is tolerable. If an arithmetic reduction is accumulated in a different order than in the scalar program, the roundoff error is accumulated differently and the final result may differ from that of the original program. While the difference is usually insignificant, certain restructuring transformations performed by KAP must be disabled to obtain exactly the same results as the scalar program. These transformations are discussed further in Chapter 9.
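
The following sketch suggests why accumulation order matters. It is hand-written, not KAP output; the program name RNDOFF, the data, and the choice of four partial sums are assumptions made for illustration only.

```fortran
      PROGRAM RNDOFF
! Illustrative only: the same reduction in two orders.
      INTEGER N
      PARAMETER (N = 100000)
      REAL X(N), S, P(4), R
      INTEGER I, K
      DO I = 1, N
         X(I) = 1.0/REAL(I)
      END DO
! Serial order: one accumulator, elements added first to last.
      S = 0.0
      DO I = 1, N
         S = S + X(I)
      END DO
! Reordered: four partial sums combined at the end, as a
! restructured reduction might do.
      DO K = 1, 4
         P(K) = 0.0
      END DO
      DO I = 1, N, 4
         DO K = 1, 4
            P(K) = P(K) + X(I+K-1)
         END DO
      END DO
      R = (P(1) + P(2)) + (P(3) + P(4))
      PRINT *, 'serial sum    =', S
      PRINT *, 'reordered sum =', R
      PRINT *, 'difference    =', S - R
      END PROGRAM RNDOFF
```

In single precision the two sums usually differ in their low-order digits, although the exact values depend on the compiler and the hardware.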

KAP classifies its transformations by the amount of difference in roundoff error that can accumulate so you can decide what level of roundoff error differences is allowable. The -roundoff command switch has the values 0 to 3.

Roundoff levels are cumulative, performing what is listed for each level, as well as what is listed for the lower levels. The meaning of each roundoff level is as follows:

5.2.6 -scalaropt, -so, (-scalaropt=3)

The -scalaropt command-line switch sets the level to which dusty-deck and other scalar transformations are performed. Unlike the -scalaropt command-line switch, the !*$* scalar optimize directive sets the level of loop-based optimizations (for example, loop fusion) only, and not straight-code optimizations (for example, dead-code elimination).

The allowed values and their meanings are as follows:

5.2.7 -skip, -sk, -nsk, (-noskip)

Use the -skip switch following the -routine switch to stop KAP from processing specified routines. KAP writes out unchanged source code for the specified routines. See the description of the -routine switch in Section 5.6.20.

5.2.8 -tune, -tune, (-tune=<architecture>)

The KAP preprocessor determines whether the host architecture is ev4, ev5, or ev6 and then optimizes your program for that architecture by default. If you compile a program on one architecture but plan to run it on another, override the default by setting -tune to the architecture on which the program will run. For example, if you compile a program on an ev4 system but plan to run it on ev5, use -tune=ev5.

5.3 Parallel Processing Switches for kapf90

The following sections describe the switches you use to control how the multiprocessor version of KAP prepares programs for parallel execution.

5.3.1 -chunk


This switch modifies, and is used only with, the -scheduling switch. The -chunk switch determines the number of loop iterations that are in a group. Its default value is 1.

5.3.2 -concurrent, -conc, -noconc, (-noconcurrent)

The -concurrent switch directs KAP to restructure the source code for parallel processing.

Setting -noconcurrent disables parallel execution and allows all serial optimizations to take place. You can enable and disable parallel execution on a module-by-module basis using KAP directives or on a loop-by-loop basis using KAP assertions. For more information about parallel processing directives, see Section 6.4. Parallel processing assertions are described in Section 7.3.

Programs containing many loops that require synchronization, or programs that have loops with small iteration counts, may run slower when parallelized. In these cases, disable parallel execution.

Section 3.1.1 summarizes the two methods of parallelization, automatic and combined, that require the -conc switch. Several examples of the -conc switch are in the descriptions of these two methods.

5.3.3 -minconcurrent, -mc, (-minconcurrent=1000)

Executing a loop in parallel incurs overhead that varies with different systems. If a loop has little computational work, the overhead required to set up parallel execution may make the loop execute more slowly than it executes serially. The -minconcurrent switch sets the level of work in a loop above which KAP should execute the loop in parallel. Setting the -minconcurrent switch causes KAP to automatically set the -concurrent switch.

The range of values for -minconcurrent is all integers greater than or equal to 0. The higher the minconcurrent value, the more iterations and/or statements the loop body must have to run concurrently.

At compilation time, KAP estimates the amount of work inside a loop on the basis of loop computations and loop iterations. KAP multiplies the loop iteration count by the sum of the noindex operands/results and the nonassignment operators, and compares this estimate with the minconcurrent value. If the estimated amount of work is greater than the minconcurrent value, KAP generates parallel code for the loop. Otherwise, the loop executes serially. A loop for which KAP generates both a parallel and a serial version is called a two-version loop.

If the DO loop bounds are known at compilation time, KAP computes the exact iteration count. However, if the DO loop bounds are unknown, KAP generates a block IF around the parallel code. The block IF allows a run-time decision whether or not to execute the loop in parallel.

To disable the generation of two-version loops throughout the program, use the command-line switch -minconcurrent=0. To disable this action in specific DO loops, use the !*$* minconcurrent(0) directive.

The following loop illustrates this switch using the minconcurrent default of 1000:


      DO 10 I = 1,N
         A(I) = B(I) + C(I)
   10 CONTINUE

Becomes:


      IF (N .GE. 425) THEN
         CALL mppfrk (P$PLP10,0)
      ELSE
         DO 2 I=1,N-3,4
            A(I) = B(I) + C(I)
            A(I+1) = B(I+1) + C(I+1)
            A(I+2) = B(I+2) + C(I+2)
            A(I+3) = B(I+3) + C(I+3)
    2    CONTINUE
      ENDIF

At run time, if the iteration count N is greater than or equal to 425, the threshold that KAP derived from the minconcurrent value of 1000 and its estimate of the work in each iteration, the loop executes in parallel; otherwise, it executes serially.

When KAP restructures DO loops whose bounds are not known in a source program named MYPROG.F, it inserts calls to subroutine MPPFRK whose first parameter comes from the sequence PKMYPROG_, PKMYPROG_1, PKMYPROG_2, ...

5.3.4 -parallelio, -nopio, -pio, (-noparallelio)

The -parallelio switch allows parallel execution of loops with I/O. Use this switch when you know the I/O will not execute. An example is a test for an error condition that causes a message to be printed.

Its complement, -noparallelio (short name -nopio), prevents parallel execution of loops containing I/O statements. The default value is -noparallelio.
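
A loop of the kind described above might look like the following hand-written sketch, in which the only I/O statement is guarded by an error test that valid input never triggers. The program name PIOEX and the data are hypothetical.

```fortran
      PROGRAM PIOEX
! Illustrative only: a loop whose I/O is reached only on error.
      INTEGER N
      PARAMETER (N = 8)
      REAL A(N), B(N)
      INTEGER I
      DO I = 1, N
         B(I) = REAL(I)
      END DO
      DO I = 1, N
! The WRITE runs only for a negative input, which this data
! never contains, so the loop does no I/O in practice.
         IF (B(I) .LT. 0.0) THEN
            WRITE (*,*) 'negative input at I =', I
         END IF
         A(I) = SQRT(ABS(B(I)))
      END DO
      PRINT *, 'A(N) =', A(N)
      END PROGRAM PIOEX
```

Whether such a loop is actually run in parallel still depends on the other parallelization switches and on KAP's own analysis.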

5.3.5 -pdefault


This switch tells KAP how to process variables that are not listed in an OpenMP data environment directive. Furthermore, it is used only during directed parallelization. The values of this switch and their meanings are next.

5.3.6 -psyntax


This switch specifies the set of parallel directives that KAP recognizes. Its values are openmp (the default) and kap.

The setting -psyntax=kap is useful if you are migrating applications that contain KAP Parallel Computing Forum (PCF) directives. The KAP parallel runtime library (libkmp-osfp10.a) will be used to implement the multithreading.

The setting -psyntax=openmp is required if your applications use OpenMP directives. Usage of OpenMP directives implies one of the following conditions:

The compiler will be used to implement the multithreading.

5.3.7 -scheduling=<list>, -sched=<list>, (-scheduling=e)

The -scheduling switch tells KAP the kind of scheduling to use for loop iterations on a multiprocessor system. The -scheduling options are as follows:

5.4 Fortran Dialect Switches for kapf90

The following sections explain the function of each Fortran dialect switch.

5.4.1 -align_common, -align_common, (-align_common=8)

The -align_common switch aligns data elements in COMMON blocks. Its integer value represents the boundary size in bytes. The default is -align_common=8.

5.4.2 -align_struct, -align_struct, (-align_struct=4)

The -align_struct switch aligns subfields. Its integer value represents the boundary size in bytes. The default is -align_struct=4.

5.4.3 -assume, -a, (-assume=cel), -noassume, -na

The -assume switch tells KAP to make certain global assumptions about the program being processed. Most of these can also be controlled by various assertions (see Chapter 7). The -assume switch settings and the corresponding KAP assertions are as follows:

By default, KAP assumes that a program conforms to the Fortran 77 standard, that is, -assume=el. The default includes -assume=c to simplify some analysis and inlining.

To disable all the above assumptions, enter -noassume on the command line.

5.4.4 -datasave, -ds, (-datasave), -nodatasave, -nds

The -datasave switch tells KAP to treat local variables in a subroutine or function that appear in DATA statements as if they were also in SAVE statements. That is, their values will be retained between invocations of the subroutine or function. This is the practice of many commercial Fortran compilers. This choice affects certain optimizations performed by KAP.

The negative switch, -nodatasave, complies with the Fortran 77 standard.
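
The following hand-written sketch shows the kind of routine these switches affect. The names DSAVE, BUMP, and NCALLS are hypothetical. The call counter is correct only if NCALLS keeps its value between calls, which is the behavior that -datasave assumes.

```fortran
      PROGRAM DSAVE
! Illustrative only: the count relies on NCALLS keeping its
! value between calls to BUMP.
      INTEGER K
      DO K = 1, 3
         CALL BUMP
      END DO
      END PROGRAM DSAVE

      SUBROUTINE BUMP
      INTEGER NCALLS
! NCALLS is initialized by DATA but not named in a SAVE
! statement.
      DATA NCALLS /0/
      NCALLS = NCALLS + 1
      PRINT *, 'call number', NCALLS
      END SUBROUTINE BUMP
```

When NCALLS is treated as saved, the three calls report 1, 2, and 3.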

See also the -save command switch.

5.4.5 -dlines, -dl, (-nodlines), -ndl

The -dlines switch allows a D in column 1 to be treated like a character space. The rest of that line will then be parsed as a normal Fortran 90 statement. By default, KAP treats these lines as comments. This switch is useful for the inclusion or exclusion of debugging lines. Data dependence relationships may be different when the D lines are included.

In the following example, the -nodlines default would cause the WRITE statement to be treated as a comment:


      DO 10 I = 1,N
         A (I) = B (I)
D        WRITE (*,*) A (I)
   10 CONTINUE

But when -dlines is specified, KAP sees the WRITE statement and, rather than optimizing the loop as a whole, splits it into two loops:


      DO 2 I=1,N
         A(I) = B(I)
    2 CONTINUE
      DO 3 I=1,N
         WRITE (*, *) A(I)
    3 CONTINUE

5.4.6 -escape, -noescape, (-escape)

The -escape switch causes KAP to scan escape characters in input lines.

5.4.7 -freeformat, -ff, (-nofreeformat)


The -freeformat command-line switch removes the standard column restrictions for Fortran source code. For example, source lines can be up to 132 columns wide and use an ampersand (&) at the end of the line to indicate continuation. See the Fortran Language Reference manual for more information.

The -freeformat switch is off by default, and the usual Fortran 77 conventions apply. For example, lines are truncated after column 72 unless you specify the Compaq Fortran flag -extend_source. A character (except a zero or a blank) in column 6 indicates a continuation line.
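
For comparison, a minimal free-format source file might look like the following sketch. The program name freefm and the variable total are hypothetical; the statements start outside column 7 and the assignment is continued with a trailing ampersand.

```fortran
program freefm
  ! Illustrative only: free-form Fortran 90 source.
  implicit none
  real :: total
  ! This statement starts in column 3 and is continued with a
  ! trailing ampersand instead of a character in column 6.
  total = 1.0 + 2.0 + &
          3.0 + 4.0
  print *, 'total =', total
end program freefm
```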

5.4.8 -integer, -int, (-integer=4)

This switch specifies N, the default size in bytes of INTEGER variables. When N=2 or 4, KAP takes INTEGER*N as the default INTEGER type. When N=0, KAP uses the ordinary default length for INTEGER variables.

Running kf90 with the compiler switch -noi4 specified explicitly causes KAP to be called with the command switches -integer=2 and -logical=2.

5.4.9 -intlog, (-intlog)

The -intlog switch enables the mixing of integer and logical operands in expressions. When integer operands are used with logical operators, the operations are performed in a bitwise manner. When logical operands are used with arithmetic operators, the operands are treated as integers.

5.4.10 -kind, (-kind), (-kind=4)

The -kind switch establishes the value for the Fortran 90 KIND type parameter used when KIND has not been specified or KIND=0 is specified. -kind applies to all data types: logical, integer, real, and complex. The values for -kind are 4 or 8, with 4 being the default. The -kind switch allows you to change the underlying precision of computations without violating the Fortran 90 standard constraints that default logical, default integer, and default real occupy the same amount of storage and that default double precision and default complex occupy twice the storage of default real.

5.4.11 -logical, -log, (-logical=4)

This switch specifies N, the default size in bytes of LOGICAL variables. When N=1, 2, or 4, KAP takes LOGICAL*N as the default LOGICAL type. When N=0, KAP uses the ordinary default length for LOGICAL variables.

Running kf90 with the compiler switch -noi4 specified explicitly causes KAP to be called with the command switches -integer=2 and -logical=2.

5.4.12 -natural, -nat, -nonatural

This switch no longer exists. Its replacement is the pair of switches -align_common, described in Section 5.4.1, and -align_struct, described in Section 5.4.2.

