Compaq KAP Fortran/OpenMP
for Tru64 UNIX
User Guide



3.4 Combined Automatic and Directed Parallelization Using the kf90 Driver

Parallelization creates an executable file that runs as a multithreaded application on symmetric multiprocessor systems. The combined method is most useful for large programs in which you want to control the parallelization of some DO loops explicitly by inserting parallel directives, while letting Compaq KAP parallelize the remaining loops automatically. The combined method merges the automatic and directed methods.

The combined method applies to DO loops both with and without parallel directives. Consider a program with the following loops:


  C$OMP PARALLEL 
  DO WHILE (I ...               DO WHILE (J ... 
  ...                           ... 
  END DO                        END DO 
  C$OMP END PARALLEL 

The DO WHILE I loop is surrounded by parallel directives that have been inserted by the programmer. These directives will be passed on unmodified to the compiler.

The DO WHILE J loop is not surrounded by parallel directives. Compaq KAP first examines the J loop according to data dependency tests and the value of the -minconcurrent switch. If the J loop meets these requirements, Compaq KAP will parallelize the loop by inserting OpenMP directives which will then be passed on to the compiler. The appropriate command line to use to process this program is:

kf90 -fkapargs='-concurrent' my_prog.f
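
As a minimal sketch of such a program, written in free-form source: the first loop below carries OpenMP directives inserted by the programmer, and the second loop has none, so it is left for Compaq KAP to analyze. The program name, array names, and sizes are hypothetical, and counted DO loops stand in for the DO WHILE loops of the outline above.

```fortran
program combined_demo
  implicit none
  integer, parameter :: n = 1000   ! hypothetical problem size
  real :: a(n), b(n)
  integer :: i, j

  ! Loop I: explicitly parallelized by the programmer.
  ! These directives are passed on to the compiler as written.
  !$OMP PARALLEL DO PRIVATE(i) SHARED(a)
  do i = 1, n
     a(i) = 2.0 * real(i)
  end do
  !$OMP END PARALLEL DO

  ! Loop J: no directives. With -concurrent, this loop is a candidate
  ! for automatic parallelization if it passes the dependency tests.
  do j = 1, n
     b(j) = a(j) + 1.0
  end do

  print *, a(n), b(n)
end program combined_demo
```

Because the directives are comments to a compiler that is not processing OpenMP, the same source also builds and runs serially, which is convenient for checking results.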

3.4.1 Changing Source Programs

You insert OpenMP directives around those DO loops that you want to explicitly parallelize.

In addition, you can insert guiding assertions around loops that you want to help Compaq KAP parallelize automatically. Compaq KAP cannot automatically parallelize loops that have data dependencies between loop iterations or loops that call external routines. You can help Compaq KAP determine whether such loops can be parallelized by placing parallel processing assertions and parallel processing directives (each beginning with !*$*) in the source program. These assertions and directives are:


 !*$* assert concurrent call 
 !*$* assert do (concurrent) 
 !*$* assert do (concurrent call) 
 !*$* assert do (serial) 
 !*$* assert do prefer (concurrent) 
 !*$* assert do prefer (serial) 
 !*$* [no]concurrent 
 !*$* minconcurrent 
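
The following sketch shows one way an assertion might be used. The loop updates x through an index array, so independence of the iterations cannot be proven from the source alone; the programmer knows idx is a permutation and asserts that the iterations are independent. The program and variable names are hypothetical, and placing the assertion immediately before the loop is an assumption made for this illustration.

```fortran
program assert_demo
  implicit none
  integer, parameter :: n = 8      ! hypothetical size
  integer :: idx(n), i
  real :: x(n), y(n)

  ! Build a permutation: no two iterations below touch the same x element.
  do i = 1, n
     idx(i) = n + 1 - i
     y(i) = real(i)
  end do
  x = 0.0

  ! Assumed placement: the assertion directly precedes the loop it describes.
  !*$* assert do (concurrent)
  do i = 1, n
     x(idx(i)) = x(idx(i)) + y(i)
  end do

  print *, x
end program assert_demo
```

An assertion is a promise by the programmer. If idx contained repeated values, the asserted independence would be false and a parallelized loop could produce wrong results.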

3.4.2 Giving Command Line Switches

Command line switches you can give to Compaq KAP that affect its transformation of DO loops are:

3.4.3 Directing the Compilation and Linking Process

To construct a program for parallel execution via the combined method, you normally need to give only the -concurrent switch to the kf90 command as follows:

kf90 -fkapargs='-concurrent' my_prog.f

The -concurrent switch tells KAP to automatically parallelize appropriate DO loops. The -concurrent switch also sets the compiler and linker switches needed for parallelization. Because -psyntax=openmp is set by default, KAP inserts OpenMP directives around loops that it identifies as good candidates for parallelization. The actual parallelization is done by the compiler, which processes both the OpenMP directives inserted automatically by KAP and the OpenMP directives inserted by the programmer.

Finally, you may want to create a completely non-parallelized program so you can compare its execution time with the times of programs that are parallelized in various ways (such as the automatic method and the directed method). The following command does this:

kf90 -fkapargs='-noconc -directives=ak' -noomp myprog.f90

The -noconc switch prevents automatic parallelization of DO loops, and the absence of p from the string following -directives= prevents Compaq KAP from responding to any parallel directive statements. The -noomp switch prevents the Fortran 90 compiler from responding to any parallel directive statements in the transformed source file it receives.

3.5 Compiling a Program for Parallel Execution Using kapf90

Note

Normally, you use the kf90 command with the -conc switch to create an optimized and parallelized executable file. Compaq recommends this command because it sets the compiler and linker switches correctly.

To view these switches, include the -v switch with the kf90 command.

To compile a program for parallel execution using the kapf90 command on Tru64 UNIX, issue the following commands:


   kapf90 -conc -cmp=myprog_mp.f90 myprog.f90 
 
   f90 myprog_mp.f90 -fast -tune host -automatic \
       -omp -pthread 

The kapf90 command preprocesses myprog.f90 to produce a new source file, myprog_mp.f90, which contains OpenMP directives inserted by Compaq KAP for loops Compaq KAP has selected for automatic parallelization. The file myprog_mp.f90 is then processed by the compiler and linker to produce a parallelized executable, a.out. Further explanation of the switches used follows:

3.6 Running a Parallelized Program

To run a program parallelized with OpenMP directives (-psyntax=openmp), you must set the following environment variables:


OMP_SCHEDULE               (static,dynamic,guided,runtime) 
OMP_DYNAMIC                (true,false) default is false. 
OMP_NESTED                 (true,false) default is false. 
OMP_NUM_THREADS            (number) default value is the number of 
                           processors on the current system. 

For further information on environment variables read by the Fortran 90 compiler, see the DIGITAL Fortran 90 User's Guide.
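
To check the effect of OMP_NUM_THREADS, a small program can report how many threads the OpenMP run-time would use. The sketch below assumes that the compiler provides the standard OpenMP omp_lib module, which this guide does not confirm; the program name is hypothetical. Lines that begin with the !$ sentinel are compiled only when OpenMP processing is enabled, so the program reports 1 when built without it.

```fortran
program show_threads
  ! Assumption: the compiler supplies the OpenMP omp_lib module.
  !$ use omp_lib
  implicit none
  integer :: nthreads

  nthreads = 1
  ! Compiled only when OpenMP processing is enabled.
  !$ nthreads = omp_get_max_threads()

  print *, 'maximum threads available: ', nthreads
end program show_threads
```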

To run a program parallelized using the Compaq KAP parallel library (-psyntax=kap), you must set the following environment variables:

Note

A program optimized to run with multiple threads may run more slowly as a single-thread process. There are two ways you can build and run a program as a single-threaded process:
  • kf90 -fkapargs='-conc' myprog.f90
    setenv PARALLEL 1
  • kf90 -fkapargs='-noconc' myprog.f90


Either way may result in a performance increase or decrease of as much as 30% depending on your program.

3.7 Correcting KAP Parallel Processing Problems

The following are some problems you may encounter when using KAP and possible fixes and workarounds.

If you build your program on DIGITAL UNIX Versions 4.0 and above but run the executable with an earlier version of the parallel library, the following error message results:


DECthreads bugcheck (version X3.13-408P1), terminating execution. 
vpInit:  Unexpected error initializing kernel threads: 0x4 
Abort process (core dumped) 

Use the what command to determine the version of the parallel library, as follows:


what /usr/lib/libkmp_osfp10.a 
 
     Version BL30.2_posix10 

3.8 Parallel Programming Tips

3.9 Parallel Processing Options

Section 5.3 describes the command line options that control Compaq KAP parallel processing. They are:

3.10 Automatic Parallel Processing Directives

You can guide Compaq KAP automatic parallelization (-conc) using the directives described in Section 6.4.


Chapter 4
KAP and Fortran 90 Constructs

The following sections describe how KAP treats various Fortran 90 constructs.

4.1 Arrays


Compaq KAP Fortran/OpenMP preprocesses arrays by transforming them into Fortran 77 DO loop syntax. If you want to use KAP loop directives and assertions with arrays, you must enclose the arrays in directive blocks by using the KAP directives !*$* beginblock and !*$* endblock. See Section 6.3.2 for an explanation of these directives and a code example showing their usage with arrays.
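
As an illustration of what this transformation means, the sketch below pairs a Fortran 90 array assignment with a hand-written DO loop that computes the same result. The loop is only an equivalent form written for this example; the code that Compaq KAP actually generates may differ, and all names here are hypothetical.

```fortran
program array_vs_loop
  implicit none
  integer, parameter :: n = 5      ! hypothetical size
  real :: a(n), b(n), c(n), a_loop(n)
  integer :: i

  b = (/ (real(i), i = 1, n) /)
  c = 10.0

  ! Fortran 90 array syntax: one statement over whole arrays.
  a = b + c

  ! Hand-written element-by-element equivalent in DO loop form.
  do i = 1, n
     a_loop(i) = b(i) + c(i)
  end do

  ! Prints T when both forms give identical results.
  print *, all(a == a_loop)
end program array_vs_loop
```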

4.2 MODULE Variables


Each MODULE becomes a Fortran 77 BLOCK DATA whose name is the same as the MODULE.

All MODULE variables not declared as part of a COMMON block will be placed in a COMMON block with the name of the MODULE. Any program unit that has a USE of that MODULE will have all relevant COMMON declarations inserted.

MODULE variables have the MODULE name prepended with an intervening punctuation character.

4.3 MODULE Parameters


Named constants also have the MODULE name prepended. PARAMETER statements for each MODULE constant are placed in any program unit that has a USE of the MODULE.

4.4 MODULE Procedures


Procedures immediately internal to a MODULE will have names modified by prepending the MODULE name with an intervening punctuation character. Any program unit that USEs a MODULE will have declarations for USEd MODULE procedures inserted.

4.5 Generic Fortran 90 Interfaces


In references to procedures that have interfaces, named parameters are replaced with positional parameters and missing parameters are filled in with unused temporaries. For each optional parameter, a LOGICAL parameter is appended to the end and is set to TRUE or FALSE depending on whether the parameter is present or missing.

Generic references to procedures are replaced with the specific equivalents.

Operator interfaces are replaced with procedure calls. If temporaries are necessary, they are created.

Assignment interfaces are replaced with procedure calls.
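
The sketch below shows the kind of source these rules apply to: a generic name with two specific procedures, a keyword (named) argument, and an omitted optional argument. The module, procedure, and argument names are hypothetical. Only the Fortran 90 input is shown; the exact form of the transformed calls is not reproduced here.

```fortran
module scale_mod
  implicit none
  ! Generic name scale_by with two specific procedures.
  interface scale_by
     module procedure scale_real, scale_int
  end interface
contains
  subroutine scale_real(x, factor)
    real, intent(inout) :: x
    real, intent(in), optional :: factor
    if (present(factor)) then
       x = x * factor
    else
       x = x * 2.0          ! default when factor is omitted
    end if
  end subroutine scale_real

  subroutine scale_int(k, factor)
    integer, intent(inout) :: k
    integer, intent(in), optional :: factor
    if (present(factor)) then
       k = k * factor
    else
       k = k * 2            ! default when factor is omitted
    end if
  end subroutine scale_int
end module scale_mod

program interface_demo
  use scale_mod
  implicit none
  real :: r
  integer :: k

  r = 1.5
  k = 3
  ! Generic reference with a keyword argument; resolves to scale_real.
  call scale_by(r, factor=4.0)
  ! Generic reference with the optional argument omitted; resolves to scale_int.
  call scale_by(k)
  print *, r, k
end program interface_demo
```

In terms of the rules above, the first call is a case where a named parameter becomes positional, and the second is a case where a missing optional parameter must be filled in and flagged as absent.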

4.6 Internal Procedures


Internal procedures are converted to external form. The name of the host procedure is prepended to the internal procedure separated by a punctuation character. The local variables of the containing procedure are made addressable in the nested routines in the form of fields of a structure being imported by reference.

4.7 Host-Associated Variables


Variables local to a host procedure are placed in a structure, and a variable of that structure type is placed in the host procedure.

A COMMON block with a pointer to that structure type is created, and references to host-associated variables in internal procedures are changed to references to the pointed-to structure. On entry to the host procedure, the old value in the COMMON block pointer is saved to a local variable, and the COMMON pointer is pointed to the local structure in the procedure. On procedure exit, the old value of the pointer is restored.

SAVEd local variables are just placed in the same COMMON block that contains the pointer to the automatic local data.
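
For reference, the sketch below shows host association in its Fortran 90 source form: the internal procedure add_one updates tally, a local variable of its host. This is the kind of reference that the transformation described above redirects through the structure pointer. All names are hypothetical, and the transformed output is not shown.

```fortran
subroutine host(n, total)
  implicit none
  integer, intent(in) :: n
  integer, intent(out) :: total
  integer :: tally          ! local to host, visible inside add_one
  integer :: i

  tally = 0
  do i = 1, n
     call add_one()
  end do
  total = tally
contains
  subroutine add_one()
    ! Host-associated reference: tally belongs to the host procedure.
    tally = tally + 1
  end subroutine add_one
end subroutine host

program host_demo
  implicit none
  integer :: total
  call host(5, total)
  print *, total
end program host_demo
```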

4.8 Derived Types


Fortran 90 derived type declarations and their associated variable declarations are converted into equivalent DIGITAL Fortran 77 structures and records. The corresponding derived type references are converted into Compaq Fortran substructure references. Field names are not modified.

Uses of derived types in I/O lists are expanded into sequences of references to their components.

DIGITAL Fortran 77 aggregate assignments are used as needed.

4.9 Deferred Shape Objects


Fortran 90 array inquiry references, assumed shape objects, deferred shape objects, and Fortran 90 pointers are converted into descriptors. One component of these descriptors indirectly locates the contents of such an array.

The arrangement of the array elements is described by the other components of the descriptor. When an assumed shape array or an array found with a pointer is subscripted, the subscript calculation is rewritten as a function of the descriptor. This transform is needed to treat the non-default strides that arise in examples such as the following:


subroutine a( b ) 
integer, dimension(:,:) :: b 
 
 ... 
 
end subroutine a 
 
 ... 
 
 
integer d(10,20) 
call a( d(:10:2,20:10:-1) ) 
 
 ... 

For array pointer objects:


integer, pointer :: b(:,:) 
integer, target :: d(20,30) 
 
 ... 
 
b => d(20:1:-1,30:1:-1) 
 
 ... 
 

In each case an assignment to b must affect elements of d that are not found in traditional Fortran 77 element order.
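
A complete, runnable variant of both cases is sketched below with smaller, hypothetical dimensions. An internal subroutine receives a strided, reversed section as an assumed-shape argument, and a pointer is associated with a fully reversed section. In each case element (1,1) of the section maps to an element of d that is not d(1,1).

```fortran
program stride_demo
  implicit none
  integer, target :: d(4,6)        ! TARGET is required for pointer association
  integer, pointer :: b(:,:)

  d = 0

  ! Assumed-shape case: rows 1,3 and columns 6,5,4 of d (a 2x3 section).
  call fill(d(1:4:2, 6:4:-1))

  ! Pointer case: d with both dimensions reversed.
  b => d(4:1:-1, 6:1:-1)
  b(1,1) = 99                      ! stores into d(4,6)

  print *, d(1,6), d(4,6)
contains
  subroutine fill(s)
    integer, dimension(:,:), intent(inout) :: s
    s(1,1) = 7                     ! stores into d(1,6)
  end subroutine fill
end program stride_demo
```

The internal procedure gives fill the explicit interface that an assumed-shape dummy argument requires.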

