KAP for DEC Fortran for Digital UNIX
User Guide

Order Number: AA-PXEKG-TE

December 1996

This document provides information about how to run and use KAP for DEC Fortran on Digital UNIX systems.

Revision/Update Information: This is a revised document.
Operating System and Version: Digital UNIX, Versions 3.2 and 4.0b
Software Version: KAP for DEC Fortran, Version 3.1

Digital Equipment Corporation
Maynard, Massachusetts


First Printing, March 1993
Revised, December 1996

Possession, use, or copying of the software described in this publication is authorized only pursuant to a valid written license from Digital or an authorized sublicensor.

Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description.

© Digital Equipment Corporation 1993, 1996. All rights reserved.

© Kuck & Associates, Inc. 1993, 1996. All rights reserved.

The following are trademarks of Digital Equipment Corporation: AlphaGeneration, DECthreads, Digital, VAX DOCUMENT, and the DIGITAL logo.

KAP is a trademark of Kuck & Associates, Inc.

UNIX is a registered trademark in the United States and other countries licensed exclusively through X/Open Company Ltd.

All other trademarks and registered trademarks are the property of their respective holders.

This document is available on CD-ROM.

This document was produced using SDMLWEB.

Contents

Preface

1 Overview

2 How to Run KAP

2.1 General KAP Information

2.2 Installing KAP

2.3 Compiling a Program Using the kf77 Driver

2.4 Compiling a Program Containing C Preprocessor Directives Using kf77

2.5 Optimized Programs

2.6 KAP Command Switches Determined by Compiler Switches

2.7 Compiling a Program Using kapf

2.8 Compiling a Program Containing C Preprocessor Directives Using kapf

2.9 Using KAP Syntax

2.10 Using File Naming Conventions

2.11 Introducing the Five Minute KAP Guide

2.11.1 Optimizing Small Programs with KAP

2.11.2 Optimizing Large Programs with KAP

2.12 Improving and Customizing KAP Performance

2.13 General Optimization Tips

2.14 Using Additional Performance Improvement Techniques

2.15 Correcting KAP Problems

3 KAP Parallel Processing

3.1 Automatic Parallelization Using the kf77 Driver

3.1.1 Parallel Processing Options

3.2 Directed Parallelization Using the kf77 Driver and PCF Directives

3.2.1 PCF Directive Syntax and Lexical Rules

3.2.2 PARALLEL REGION Directive

3.2.3 PARALLEL DO Directive

3.2.4 DO Loop Example with PCF Directives

3.2.5 Program Example with PCF Directives

3.2.6 CRITICAL SECTION Directive

3.2.7 ONE PROCESSOR SECTION Directive

3.2.8 Comparison of KAP PCF and Cray Autotasking Directives

3.3 Combined Automatic and Directed Parallelization Using the kf77 Driver

3.4 Compiling a Program for Parallel Execution Using kapf

3.5 Building Applications with the DECthreads Archive Library on Digital UNIX Versions 4.0 and Higher

3.6 Running a Parallel Program

3.7 Parallel Run-Time Support Library Routines

3.8 Correcting KAP Parallel Processing Problems

3.9 Parallel Programming Tips

4 Command Switches

4.1 General Optimization Switches

4.1.1 -interchange, -nointerchange, (-interchange)

4.1.2 -namepartitioning, -namepart, -nnamepart, (-nonamepartitioning)

4.1.3 -optimize, -o, (-optimize=5)

4.1.4 -roundoff, -r, (-roundoff=3)

4.1.5 -scalaropt, -so, (-scalaropt=3)

4.1.6 -skip, -sk, -nsk, (-noskip)

4.1.7 -tune, -tune, (-tune=host)

4.2 Parallel Processing Switches

4.2.1 -concurrentize, -conc, -noconc, (-noconcurrentize)

4.2.2 -minconcurrent, -mc, (-minconcurrent=1000)

4.2.3 -parallelio, -nopio, -pio, (-noparallelio)

4.3 Fortran Dialect Switches

4.3.1 -assume, -a, (-assume=cel), -noassume, -na

4.3.2 -case, (-nocase), -ncase

4.3.3 -datasave, -ds, (-datasave), -nodatasave, -nds

4.3.4 -dlines, -dl, (-nodlines), -ndl

4.3.5 -escape, (-escape)

4.3.6 -integer, -int, (-integer=4)

4.3.7 -intlog, -intl, (-intlog), -nintl

4.3.8 -logical, -log, (-logical=4)

4.3.9 -natural, -nat, -nonatural

4.3.10 -onetrip, -1, (-noonetrip), -n1

4.3.11 -real, -rl, (-real=4)

4.3.12 -recursion, -rc, (-norecursion), -nrc

4.3.13 -save, -sv, (-save=manual_adjust)

4.3.14 -scan, (-scan=72)

4.3.15 -syntax, -sy, (off)

4.3.16 -type, -ty, (-notype), -nty

4.4 Inlining and Interprocedural Analysis Switches

4.4.1 -inline, -inl, (off) -noinline, -ninl, -ipa, -ipa, (off), -noipa, -nipa

4.4.2 -inline_and_copy, -inlc, (off)

4.4.3 -inline_create, -incr, (off), -ipa_create, -ipacr, (off)

4.4.4 -inline_depth, -ind, (-inline_depth=2), -ipa_depth, -ipad, (-ipa_depth=2)

4.4.5 -inline_from_files, -inff, (current source file)

4.4.6 -ipa_from_files, -ipaff, (current source file)

4.4.7 -inline_from_libraries, -infl, (off)

4.4.8 -ipa_from_libraries, -ipafl, (off)

4.4.9 -inline_looplevel, -inll, (-inline_looplevel=2), -ipa_looplevel, -ipall, (-ipa_looplevel=2)

4.4.10 -inline_manual, -inm, (off), -ipa_manual, -ipam, (off)

4.4.11 -inline_optimize, (-inline_optimize=0), -ipa_optimize, (-ipa_optimize=0)

4.5 Advanced Optimization Control

4.5.1 -aggressive, -ag, (-noaggressive), -nag

4.5.2 -arclimit, -arclm, -noarclimit, (-arclimit=5000)

4.5.3 -cacheline, -chl, (-cacheline=32,32)

4.5.4 -cache_prefetch_line_count, -cplc, (-cplc=0)

4.5.5 -cachesize, -chs, (-cachesize=8,0)

4.5.6 -dpregisters, -dpr, (-dpregisters=32)

4.5.7 -each_invariant_if_growth, -eiifg, (-eiifg=20)

4.5.8 -fpregisters, -fpr, (-fpregisters=32)

4.5.9 -fuse, -nfuse, (-nofuse)

4.5.10 -fuselevel, (-fuselevel=0)

4.5.11 -heaplimit, -heap, (-heaplimit=116)

4.5.12 -hoist_loop_invariants, -hli, (-hoist_loop_invariants=1)

4.5.13 -interleave, -intl, (-interleave)

4.5.14 -library_calls, -lc, (off)

4.5.15 -limit, -lm, (-limit=20000)

4.5.16 -machine, -ma, -noma, (-machine=s)

4.5.17 -max_invariant_if_growth, -miifg, (-miifg=500)

4.5.18 -routine, -rt, -nrt, (-noroutine)

4.5.19 -setassociativity, -sasc, (-setassociativity=1)

4.5.20 -srlcd, -nsrlcd, (-nosrlcd)

4.5.21 -unroll, -ur, (-unroll=4), -unroll2, -ur2, (-unroll2=160), -unroll3, -ur3, (-unroll3=20)

4.6 Directive Recognition Switches

4.6.1 -directives, -dr, (-directives=akpv), -nodirectives, -ndr

4.6.2 -ignoreoptions, -ig, (-noignoreoptions), -nig

4.7 Input-Output Switches

4.7.1 -cmp, (<file>.cmp.f), (<file>.cmp), -nocmp, -ncmp

4.7.2 -fortran, -f, (<file>.cmp.f), (<file>.cmp), -nofortran, -nf

4.7.3 -list, -l, (<file>.out), -nolist, -nl

4.8 Listing Switches

4.8.1 -cmpoptions, -cp, (-cmpoptions=n), -nocmpoptions, -ncp

4.8.2 -lines, -ln, (-lines=55)

4.8.3 -listingwidth, -lw, (-listingwidth=132)

4.8.4 -listoptions, -lo, (-listoptions=o)

4.8.5 -suppress, -su, (off)

4.9 C*$* options

5 Directives

5.1 Directive Usage and Syntax

5.2 KAP Directives

5.3 General Optimization Directives

5.3.1 C*$* arclimit (0-5000)

5.3.2 C*$* beginblock <directive block> C*$* endblock

5.3.3 C*$* each_invariant_if_growth (0-100)

5.3.4 C*$* limit (> 0)

5.3.5 C*$* max_invariant_if_growth (0-1000)

5.3.6 C*$* optimize (0-5)

5.3.7 C*$* roundoff (0-3)

5.3.8 C*$* scalar optimize (0-3)

5.3.9 C*$* unroll (<#it>[,<weight>])

5.4 Parallel Processing Directives

5.4.1 C*$* [no]concurrentize

5.4.2 C*$* minconcurrent (0-999999)

5.5 Inlining and IPA

5.5.1 C*$* [no]inline [here|routine|global] [(name [,name...])]

5.5.2 C*$* [no]ipa [here|routine|global] [(name [,name...])]

5.6 Assertions

5.6.1 C*$* [no]assertions

5.7 Memory Management

5.7.1 C*$* padding (var-list)

5.7.2 C*$* storage order (var-list)

6 Assertions

6.1 KAP Assertions

6.2 Descriptions

6.2.1 C*$* assert [no]argument aliasing

6.2.2 C*$* assert [no]bounds violations

6.2.3 C*$* assert [no]equivalence hazard

6.2.4 C*$* assert [no]interchange

6.2.5 C*$* assert [no]last value needed

6.2.6 C*$* assert permutation

6.2.7 C*$* assert no recurrence

6.2.8 C*$* assert relation ( <name> .XX. <variable/constant>)

6.2.9 C*$* assert no sync

6.2.10 C*$* assert [no] temporaries for constant arguments

6.3 Parallel Processing Assertions

6.3.1 C*$* assert concurrent call

6.3.2 C*$* assert do (concurrent)

6.3.3 C*$* assert do (concurrent call)

6.3.4 C*$* assert do (serial)

6.3.5 C*$* assert do prefer (concurrent)

6.3.6 C*$* assert do prefer (serial)

7 Inlining and IPA

7.1 Inlining and IPA Command Switches

7.1.1 inline_from/ipa_from Switches

7.1.2 Library Creation

7.1.3 Naming Specific Routines

7.1.4 DO Loop Level

7.1.5 Recursive Inlining

7.1.6 Manual Control

7.2 Inlining and IPA Directives

7.3 Listing File Support

7.3.1 -listoptions=c

7.4 Inlining/IPA Examples

7.4.1 Inlining Example - Same Source File

7.4.2 Inlining Example with a Library

7.4.3 IPA Example

7.4.4 Recursive Inlining Examples

7.4.5 Manual Inlining Example

7.4.6 Notes on Inlining and IPA

7.5 Conditions Inhibiting Inlining/IPA

8 Transformations

8.1 Memory Management

8.1.1 Command Switches

8.1.2 Memory Management Tactics

8.2 Serial Optimizations

8.2.1 Dead-Code Elimination

8.2.2 Induction Variable Recognition

8.2.3 Global Forward Substitution

8.2.4 Loop Peeling

8.2.5 Lifetime Analysis

8.2.6 Invariant-IF Restructuring

8.2.7 Reciprocal Substitution

8.3 Scalar (Dusty-Deck) IF Transformations

8.3.1 IF to Block IF

8.3.2 IF to DO Loop

8.3.3 Semantic IF Merging

8.3.4 Zero-Trip IF Removal

8.4 Loop Unrolling

8.5 Loop Rerolling

9 KAP Listing File

9.1 Listing Switches

9.1.1 Original Program Listing (O)

9.1.2 Calling Tree (C)

9.1.3 KAP Switches (K)

9.1.4 Loop Table (L)

9.1.5 Name (N)

9.1.6 Compilation Performance Statistics (P)

9.1.7 Summary Table (S)

9.1.8 Transformed Program Listing (T)

9.2 Listing Information

9.2.1 Line Numbers

9.2.2 DO Loop Markings

9.2.3 INCLUDE File Markings

9.2.4 Footnotes

9.2.5 Syntax Error/Warning Messages

9.2.6 Questions Generated by KAP

9.2.7 Action Summary

9.3 Loop Table Messages

9.4 KAP Listing Messages

A Data Dependence Analysis

A.1 Data Dependence Definitions

A.2 Varieties of Data Dependence

A.3 Input and Output Sets

A.4 Data Dependence Relations

A.5 Data Dependence Direction Vectors

A.6 Loop-Carried Dependence

A.7 Data Dependence Examples

B Listing File Messages

B.1 Classes of Messages

B.2 Messages

B.2.1 Data Dependence (DD)

B.2.2 Error (E)

B.2.3 Extension (EX)

B.2.4 Inlining/IPA (INL)

B.2.5 Informational (INF)

B.2.6 Inserted (I)

B.2.7 Loop Reordering (LR)

B.2.8 Warning (MIS)

B.2.9 Option Error (OW)

B.2.10 Not Optimized (NO)

B.2.11 Output Translation (OT)

B.2.12 Output Trans Fails (OTF)

B.2.13 Program Too Large (NO)

B.2.14 Question (Q)

B.2.15 Scalar Optimization (SO)

B.2.16 Standardized (STD)

B.2.17 Translator Error (TE)

B.2.18 Vector Enhanced (VE)

B.2.19 Warning (W)

C KAP and Incorrect Programs

Index

Tables

2-1 User Actions for Specific Goals

3-1 Comparison of KAP PCF and Cray Autotasking Directives

4-1 Command-Line Switches

5-1 KAP Directives

6-1 KAP Assertions

