Order Number: AA-PXEKG-TE
December 1996
This document provides information about how to run and use KAP for DEC Fortran on Digital UNIX systems.
Revision/Update Information: This is a revised document.
Operating System and Version: Digital UNIX, Versions 3.2 and 4.0b
Software Version: KAP for DEC Fortran, Version 3.1
Digital Equipment Corporation
Maynard, Massachusetts
Possession, use, or copying of the software described in this publication is authorized only pursuant to a valid written license from Digital or an authorized sublicensor.
Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description.
© Digital Equipment Corporation 1993, 1996. All rights reserved.
© Kuck & Associates, Inc. 1993, 1996. All rights reserved.
The following are trademarks of Digital Equipment Corporation: AlphaGeneration, DECthreads, Digital, VAX DOCUMENT, and the DIGITAL logo.
KAP is a trademark of Kuck & Associates, Inc.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Ltd.
All other trademarks and registered trademarks are the property of their respective holders.
This document is available on CD-ROM.
This document was produced using SDMLWEB.
1 Overview
2.2 Installing KAP
2.3 Compiling a Program Using the kf77 Driver
2.4 Compiling a Program Containing C Preprocessor Directives Using kf77
2.6 KAP Command Switches Determined by Compiler Switches
2.7 Compiling a Program Using kapf
2.8 Compiling a Program Containing C Preprocessor Directives Using kapf
2.9 Using KAP Syntax
2.10 Using File Naming Conventions
2.11 Introducing the Five Minute KAP Guide
2.11.1 Optimizing Small Programs with KAP
2.11.2 Optimizing Large Programs with KAP
2.12 Improving and Customizing KAP Performance
2.13 General Optimization Tips
2.14 Using Additional Performance Improvement Techniques
3.1 Automatic Parallelization Using the kf77 Driver
3.1.1 Parallel Processing Options
3.2 Directed Parallelization Using the kf77 Driver and PCF Directives
3.2.1 PCF Directive Syntax and Lexical Rules
3.2.2 PARALLEL REGION Directive
3.2.3 PARALLEL DO Directive
3.2.4 DO Loop Example with PCF Directives
3.2.5 Program Example with PCF Directives
3.2.6 CRITICAL SECTION Directive
3.2.7 ONE PROCESSOR SECTION Directive
3.2.8 Comparison of KAP PCF and Cray Autotasking Directives
3.3 Combined Automatic and Directed Parallelization Using the kf77 Driver
3.4 Compiling a Program for Parallel Execution Using kapf
3.5 Building Applications with the DECthreads Archive Library on Digital UNIX Versions 4.0 and Higher
3.6 Running a Parallel Program
3.7 Parallel Run-Time Support Library Routines
3.8 Correcting KAP Parallel Processing Problems
4.1 General Optimization Switches
4.1.1 -interchange, -nointerchange, (-interchange)
4.1.2 -namepartitioning, -namepart, -nnamepart, (-nonamepartitioning)
4.1.3 -optimize, -o, (-optimize=5)
4.1.4 -roundoff, -r, (-roundoff=3)
4.1.5 -scalaropt, -so, (-scalaropt=3)
4.1.6 -skip, -sk, -nsk, (-noskip)
4.1.7 -tune, -tune, (-tune=host)
4.2 Parallel Processing Switches
4.2.1 -concurrentize, -conc, -noconc, (-noconcurrentize)
4.2.2 -minconcurrent, -mc, (-minconcurrent=1000)
4.2.3 -parallelio, -nopio, -pio, (-noparallelio)
4.3.1 -assume, -a, (-assume=cel), -noassume, -na
4.3.2 -case, (-nocase), -ncase
4.3.3 -datasave, -ds, (-datasave), -nodatasave, -nds
4.3.4 -dlines, -dl, (-nodlines), -ndl
4.3.5 -escape, (-escape)
4.3.6 -integer, -int, (-integer=4)
4.3.7 -intlog, -intl, (-intlog), -nintl
4.3.8 -logical, -log, (-logical=4)
4.3.9 -natural, -nat, -nonatural
4.3.10 -onetrip, -1, (-noonetrip), -n1
4.3.11 -real, -rl, (-real=4)
4.3.12 -recursion, -rc, (-norecursion), -nrc
4.3.13 -save, -sv, (-save=manual_adjust)
4.3.14 -scan, (-scan=72)
4.3.15 -syntax, -sy, (off)
4.3.16 -type, -ty, (-notype), -nty
4.4 Inlining and Interprocedural Analysis Switches
4.4.1 -inline, -inl, (off) -noinline, -ninl, -ipa, -ipa, (off), -noipa, -nipa
4.4.2 -inline_and_copy, -inlc, (off)
4.4.3 -inline_create, -incr, (off), -ipa_create, -ipacr, (off)
4.4.4 -inline_depth, -ind, (-inline_depth=2), -ipa_depth, -ipad, (-ipa_depth=2)
4.4.5 -inline_from_files, -inff, (current source file)
4.4.6 -ipa_from_files, -ipaff, (current source file)
4.4.7 -inline_from_libraries, -infl, (off)
4.4.8 -ipa_from_libraries, -ipafl, (off)
4.4.9 -inline_looplevel, -inll, (-inline_looplevel=2), -ipa_looplevel, -ipall, (-ipa_looplevel=2)
4.4.10 -inline_manual, -inm, (off), -ipa_manual, -ipam, (off)
4.4.11 -inline_optimize, (-inline_optimize=0), -ipa_optimize, (-ipa_optimize=0)
4.5 Advanced Optimization Control
4.5.1 -aggressive, -ag, (-noaggressive), -nag
4.5.2 -arclimit, -arclm, -noarclimit, (-arclimit=5000)
4.5.3 -cacheline, -chl, (-cacheline=32,32)
4.5.4 -cache_prefetch_line_count, -cplc, (-cplc=0)
4.5.5 -cachesize, -chs, (-cachesize=8,0)
4.5.6 -dpregisters, -dpr, (-dpregisters=32)
4.5.7 -each_invariant_if_growth, -eiifg, (-eiifg=20)
4.5.8 -fpregisters, -fpr, (-fpregisters=32)
4.5.9 -fuse, -nfuse, (-nofuse)
4.5.10 -fuselevel, (-fuselevel=0)
4.5.11 -heaplimit, -heap, (-heaplimit=116)
4.5.12 -hoist_loop_invariants, -hli, (-hoist_loop_invariants=1)
4.5.13 -interleave, -intl, (-interleave)
4.5.14 -library_calls, -lc, (off)
4.5.15 -limit, -lm, (-limit=20000)
4.5.16 -machine, -ma, -noma, (-machine=s)
4.5.17 -max_invariant_if_growth, -miifg, (-miifg=500)
4.5.18 -routine, -rt, -nrt, (-noroutine)
4.5.19 -setassociativity, -sasc, (-setassociativity=1)
4.5.20 -srlcd, -nsrlcd, (-nosrlcd)
4.5.21 -unroll, -ur, (-unroll=4), -unroll2, -ur2, (-unroll2=160), -unroll3, -ur3, (-unroll3=20)
4.6 Directive Recognition Switches
4.6.1 -directives, -dr, (-directives=akpv), -nodirectives, -ndr
4.6.2 -ignoreoptions, -ig, (-noignoreoptions), -nig
4.7.1 -cmp, (<file>.cmp.f), (<file>.cmp), -nocmp, -ncmp
4.7.2 -fortran, -f, (<file>.cmp.f), (<file>.cmp), -nofortran, -nf
4.7.3 -list, -l, (<file>.out), -nolist, -nl
4.8 Listing Switches
4.8.1 -cmpoptions, -cp, (-cmpoptions=n), -nocmpoptions, -ncp
4.8.2 -lines, -ln, (-lines=55)
4.8.3 -listingwidth, -lw, (-listingwidth=132)
4.8.4 -listoptions, -lo, (-listoptions=o)
4.8.5 -suppress, -su, (off)
4.9 C*$* options
5.1 Directive Usage and Syntax
5.2 KAP Directives
5.3 General Optimization Directives
5.3.1 C*$* arclimit (0-5000)
5.3.2 C*$* beginblock <directive block> C*$* endblock
5.3.3 C*$* each_invariant_if_growth (0-100)
5.3.4 C*$* limit (> 0)
5.3.5 C*$* max_invariant_if_growth (0-1000)
5.3.6 C*$* optimize (0-5)
5.3.7 C*$* roundoff (0-3)
5.3.8 C*$* scalar optimize (0-3)
5.3.9 C*$* unroll( <#it>[,<weight>])
5.4 Parallel Processing Directives
5.4.1 C*$* [no]concurrentize
5.4.2 C*$* minconcurrent (0-999999)
5.5 Inlining and IPA
5.5.1 C*$* [no]inline [here|routine|global] [(name [,name...])]
5.5.2 C*$* [no]ipa [here|routine|global] [(name [,name...])]
5.6 Assertions
5.6.1 C*$* [no]assertions
5.7.1 C*$* padding (var-list)
5.7.2 C*$* storage order (var-list)
6.1 KAP Assertions
6.2 Descriptions
6.2.1 C*$* assert [no]argument aliasing
6.2.2 C*$* assert [no]bounds violations
6.2.3 C*$* assert [no]equivalence hazard
6.2.4 C*$* assert [no]interchange
6.2.5 C*$* assert [no]last value needed
6.2.6 C*$* assert permutation
6.2.7 C*$* assert no recurrence
6.2.8 C*$* assert relation ( <name> .XX. <variable/constant>)
6.2.9 C*$* assert no sync
6.2.10 C*$* assert [no] temporaries for constant arguments
6.3 Parallel Processing Assertions
6.3.1 C*$* assert concurrent call
6.3.2 C*$* assert do (concurrent)
6.3.3 C*$* assert do (concurrent call)
6.3.4 C*$* assert do (serial)
6.3.5 C*$* assert do prefer (concurrent)
6.3.6 C*$* assert do prefer (serial)
7.1 Inlining and IPA Command Switches
7.1.1 inline_from/ipa_from Switches
7.1.2 Library Creation
7.1.3 Naming Specific Routines
7.1.4 DO Loop Level
7.1.5 Recursive Inlining
7.1.6 Manual Control
7.2 Inlining and IPA Directives
7.3.1 -listoptions=c
7.4.1 Inlining Example - Same Source File
7.4.2 Inlining Example with a Library
7.4.3 IPA Example
7.4.4 Recursive Inlining Examples
7.4.5 Manual Inlining Example
7.4.6 Notes on Inlining and IPA
7.5 Conditions Inhibiting Inlining/IPA
8.1.1 Command Switches
8.1.2 Memory Management Tactics
8.2.1 Dead-Code Elimination
8.2.2 Induction Variable Recognition
8.2.3 Global Forward Substitution
8.2.4 Loop Peeling
8.2.5 Lifetime Analysis
8.2.6 Invariant-IF Restructuring
8.2.7 Reciprocal Substitution
8.3 Scalar (Dusty-Deck) IF Transformations
8.3.1 IF to Block IF
8.3.2 IF to DO Loop
8.3.3 Semantic IF Merging
8.3.4 Zero-Trip IF Removal
8.4 Loop Unrolling
8.5 Loop Rerolling
9.1 Listing Switches
9.1.1 Original Program Listing (O)
9.1.2 Calling Tree (C)
9.1.3 KAP Switches (K)
9.1.4 Loop Table (L)
9.1.5 Name (N)
9.1.6 Compilation Performance Statistics (P)
9.1.7 Summary Table (S)
9.1.8 Transformed Program Listing (T)
9.2.1 Line Numbers
9.2.2 DO Loop Markings
9.2.3 INCLUDE File Markings
9.2.4 Footnotes
9.2.5 Syntax Error/Warning Messages
9.2.6 Questions Generated by KAP
9.2.7 Action Summary
A.1 Data Dependence Definitions
A.2 Varieties of Data Dependence
A.5 Data Dependence Direction Vectors
B.2 Messages
B.2.1 Data Dependence (DD)
B.2.2 Error (E)
B.2.3 Extension (EX)
B.2.4 Inlining/IPA (INL)
B.2.5 Informational (INF)
B.2.6 Inserted (I)
B.2.7 Loop Reordering (LR)
B.2.8 Warning (MIS)
B.2.9 Option Error (OW)
B.2.10 Not Optimized (NO)
B.2.11 Output Translation (OT)
B.2.12 Output Trans Fails (OTF)
B.2.13 Program Too Large (NO)
B.2.14 Question (Q)
B.2.15 Scalar Optimization (SO)
B.2.16 Standardized (STD)
B.2.17 Translator Error (TE)
B.2.18 Vector Enhanced (VE)
B.2.19 Warning (W)
Tables
2-1 User Actions for Specific Goals
3-1 Comparison of KAP PCF and Cray Autotasking Directives
5-1 KAP Directives
6-1 KAP Assertions