Compaq Fortran
Release Notes for Compaq Tru64 UNIX Systems



1.7.2 Version 5.2 New Features

The following new features are now supported:

1.7.3 Version 5.2 Important Information

Some important information to note about this release:

1.7.4 Version 5.2 Corrections

From version V5.1-594-3882K to FT1 T5.2-682-4289P, the following corrections have been made:

From version FT1 T5.2-682-4289P to FT2 T5.2-695-428AU, the following corrections have been made:

From version FT2 T5.2-695-428AU to V5.2-705-428BH, the following corrections have been made:

1.8 High Performance Fortran (HPF) Support in Version 5.2

Compaq Fortran (DIGITAL Fortran 90) Version 5.2 supports the entire High Performance Fortran (HPF) Version 2.0 specification with the following exceptions:

In addition, the compiler supports many HPF Version 2.0 approved extensions including:

1.8.1 Optimization

This section contains release notes relevant to increasing code performance. For more detail, see Chapter 7 of the DIGITAL High Performance Fortran 90 HPF and PSE Manual.

1.8.1.1 The -fast Compile-Time Option

To get optimal performance from the compiler, use the -fast option if possible.

Use of the -fast option is not permitted in certain cases, such as programs with zero-sized data objects or with very small nearest-neighbor arrays.

For More Information:

1.8.1.2 Non-Parallel Execution of Code

The following constructs are not handled in parallel:

If an expression contains a non-parallel construct, the entire statement containing that expression is executed non-parallel. Such constructs can degrade performance. Compaq recommends avoiding them in the computationally intensive kernel of a routine or program.

1.8.1.3 INDEPENDENT DO Loops Currently Parallelized

Not all INDEPENDENT DO loops are currently parallelized. Compile with the -show hpf or -show hpf_indep option, which issues a message whenever a loop marked INDEPENDENT is not parallelized.

Currently, a nest of INDEPENDENT DO loops is parallelized whenever the following conditions are met:

When the entire loop nest is encapsulated in an ON HOME RESIDENT region, only the first two restrictions apply.
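As an illustration, the following is a minimal sketch of a loop marked INDEPENDENT over BLOCK-distributed arrays. The program name, array names, and sizes are hypothetical, and the sketch does not claim to satisfy every condition the compiler checks; compile with -show hpf_indep to confirm whether the loop is actually parallelized.

```fortran
      PROGRAM indep_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 1000
      REAL, DIMENSION(n) :: a, b
      INTEGER :: i
!     Hypothetical mapping: a is BLOCK distributed and b is aligned with a.
      !HPF$ DISTRIBUTE a(BLOCK)
      !HPF$ ALIGN b(:) WITH a(:)

      b = 1.0

!     Each iteration writes only a(i) and reads only b(i), so no
!     iteration depends on any other iteration.
      !HPF$ INDEPENDENT
      DO i = 1, n
         a(i) = 2.0 * b(i) + REAL(i)
      ENDDO

      PRINT *, a(1), a(n)
      END PROGRAM indep_sketch
```

The subscripts here are the loop index itself, which is the simplest case of an affine subscript. A non-HPF Fortran 90 compiler treats the !HPF$ lines as comments and runs the loop serially.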

For More Information:

1.8.1.4 Nearest-Neighbor Optimization

An array assignment, FORALL statement, or INDEPENDENT DO loop must satisfy the following conditions to take advantage of the nearest-neighbor optimization:

Compile with the -show hpf or -show hpf_nearest option to see which lines are treated as nearest-neighbor.
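For orientation, the following is a minimal sketch of the kind of computation usually meant by "nearest-neighbor": a four-point stencil written as a FORALL over BLOCK-distributed arrays. All names and sizes are hypothetical, and the sketch is not guaranteed to meet every condition for the optimization; the -show hpf_nearest output is the authority on whether a given line qualifies.

```fortran
      PROGRAM stencil_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 64
      REAL, DIMENSION(n,n) :: u, unew
      INTEGER :: i, j
!     Hypothetical mapping: both arrays BLOCK distributed in each
!     dimension, with unew aligned to u.
      !HPF$ DISTRIBUTE u(BLOCK,BLOCK)
      !HPF$ ALIGN unew(:,:) WITH u(:,:)

!     Boundary condition: first row held at 1.0, everything else 0.0.
      u = 0.0
      u(1,:) = 1.0
      unew = u

!     Each interior point is the average of its four neighbors, so
!     every reference to u is offset by at most one element.
      FORALL (i = 2:n-1, j = 2:n-1)
         unew(i,j) = 0.25 * (u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1))
      END FORALL

      PRINT *, unew(2,2)
      END PROGRAM stencil_sketch
```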

Nearest-neighbor communications are not profiled by the pprof profiler. See the section about the pprof Profile Analysis Tool in the Parallel Software Environment (PSE) Version 1.6 release notes.

For More Information:

1.8.1.5 Widths Given with the SHADOW Directive Agree with Automatically Generated Widths

When compiler-determined shadow widths do not agree with the widths given with the SHADOW directive, the compiler usually generates less efficient code.

To avoid this problem, create a version of your program without the SHADOW directive, and compile with the -show hpf or -show hpf_nearest option. The compiler will generate messages that include the sizes of the compiler-determined shadow widths. Make sure that any widths you specify with the SHADOW directive match the compiler-generated widths.

1.8.1.6 Using EOSHIFT Intrinsic for Nearest Neighbor Calculations

The current compiler version does not always recognize nearest-neighbor calculations coded using EOSHIFT. Also, EOSHIFT is sometimes converted into a series of statements, only some of which may be eligible for the nearest-neighbor optimization.

To avoid these problems, Compaq recommends using CSHIFT or FORALL instead of EOSHIFT if these alternatives meet the needs of your program.
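The following minimal sketch shows one way a centered difference might be written with CSHIFT or FORALL rather than EOSHIFT. The names and sizes are hypothetical. Note that CSHIFT wraps around at the array ends while EOSHIFT fills with a boundary value, so the two are interchangeable only where the program does not depend on the end elements or treats them separately, as the FORALL version does here.

```fortran
      PROGRAM shift_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 8
      REAL, DIMENSION(n) :: x, d_cshift, d_forall
      INTEGER :: i
!     Hypothetical mapping: all three arrays share one BLOCK distribution.
      !HPF$ DISTRIBUTE x(BLOCK)
      !HPF$ ALIGN d_cshift(:) WITH x(:)
      !HPF$ ALIGN d_forall(:) WITH x(:)

      x = (/ (REAL(i*i), i = 1, n) /)

!     Centered difference with CSHIFT (periodic at the two ends).
      d_cshift = 0.5 * (CSHIFT(x, 1) - CSHIFT(x, -1))

!     Same interior values with FORALL; the end elements are left at 0.0.
      d_forall = 0.0
      FORALL (i = 2:n-1) d_forall(i) = 0.5 * (x(i+1) - x(i-1))

!     The two forms agree on the interior points.
      PRINT *, MAXVAL(ABS(d_cshift(2:n-1) - d_forall(2:n-1)))
      END PROGRAM shift_sketch
```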

1.8.2 New Features

This section describes the new HPF features in this release of Compaq Fortran.

1.8.2.1 RANDOM_NUMBER Executes in Parallel

The RANDOM_NUMBER intrinsic subroutine now executes in parallel for mapped data. The result is a significant decrease in execution time.
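As a minimal sketch of what "mapped data" means here, the following fills a BLOCK-distributed array with a single RANDOM_NUMBER call. The program name, array name, and size are hypothetical.

```fortran
      PROGRAM random_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 100000
      REAL, DIMENSION(n) :: r
!     Hypothetical mapping: r is distributed BLOCK across processors.
      !HPF$ DISTRIBUTE r(BLOCK)

!     One call fills the whole distributed array.
      CALL RANDOM_NUMBER(r)

!     Values lie in the range 0.0 <= r < 1.0.
      PRINT *, MINVAL(r), MAXVAL(r), SUM(r) / REAL(n)
      END PROGRAM random_sketch
```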

1.8.2.2 Improved Performance of TRANSPOSE Intrinsic

The TRANSPOSE intrinsic will execute faster for most arrays that are mapped either * or BLOCK in all dimensions.
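The following minimal sketch shows a TRANSPOSE whose source and result are both mapped BLOCK in all dimensions, one of the mappings described above. The names and sizes are hypothetical.

```fortran
      PROGRAM transpose_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: n = 4
      REAL, DIMENSION(n,n) :: a, b
      INTEGER :: i, j
!     Hypothetical mapping: both arrays BLOCK in every dimension.
      !HPF$ DISTRIBUTE a(BLOCK,BLOCK)
      !HPF$ DISTRIBUTE b(BLOCK,BLOCK)

!     Give each element a value that encodes its row and column.
      FORALL (i = 1:n, j = 1:n) a(i,j) = REAL(10*i + j)

      b = TRANSPOSE(a)

!     b(1,2) holds the value that was stored in a(2,1).
      PRINT *, b(1,2), a(2,1)
      END PROGRAM transpose_sketch
```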

1.8.2.3 Improved Performance of DO Loops Marked as INDEPENDENT

Certain induction variables are now recognized as affine functions of the INDEPENDENT DO loop indices, thus meeting the requirements listed in Section 1.8.1.3. As a result, the compiler can now parallelize array references that use such variables as subscripts. The following example illustrates this:


!     The compiler can now parallelize the INDEPENDENT loop nest because
!        it recognizes that the induction variable k1 equals k+1.
      PROGRAM gauss 
      INTEGER, PARAMETER    :: n = 1024 
      REAL, DIMENSION (n,n) :: A 
      !HPF$ DISTRIBUTE A(*,CYCLIC) 
 
      DO k = 1, n-1 
         k1 = k+1 
         !HPF$ INDEPENDENT, NEW(i) 
         DO j = k1, n 
            DO i = k1, n 
               A(i,j) = A(i,j) - A(i,k) * A(k,j) 
            ENDDO 
         ENDDO 
      ENDDO 
      END PROGRAM gauss 

1.8.3 Corrections

This section lists problems in previous versions that have been fixed in this version.

