After you have used the KAP protocol for either small or large programs, you can find ways to fine-tune KAP to fit your application.
This section helps you discover which KAP command-line switches, directives, or assertions may improve KAP performance for a particular application program. It lists goals and program situations that KAP users commonly encounter, with a suggested action for each.
Remember that KAP is a tool to optimize Fortran code. Like any tool, it performs best when you are familiar with the details of how it works and are able to use it correctly and advantageously.
Although KAP default switch settings will achieve performance improvement, you can often achieve greater improvement if you understand and use alternate switch settings. Moreover, you can often insert directives or assertions to achieve improved performance.
See Table 2-1 for user actions and specific goals.
Goal | User Action |
---|---|
Have a more informative listing to help answer your questions. | Use -lo=otkl or other listing switches under -listoptions command-line switch. |
Recognize more reductions. | Increase -roundoff switch setting. |
Answer a KAP generated question. | Use appropriate assertion. |
Eliminate unnecessary last-value assignment. | Use C*$* assert no last value needed or -assume without the l switch; or try -save=manual. |
Spend less time optimizing deeply nested loops. | Reduce -limit and -arclimit or their directives. |
Disable inner loop unrolling. | Use -unroll=1 or -scalaropt < 2. |
Disable outer loop unrolling. | Use -roundoff < 3 or -scalaropt < 3. |
Prevent a given loop from being optimized. | Use C*$* assert do (serial), C*$* assert do prefer (serial), C*$* noconcurrentize, or C*$* optimize (0). (Remember to turn optimization back on after the serial loop.) |
Disable some data dependence checking. | Use C*$* assert no recurrence for one loop nest. |
Expand (inline) subroutine calls within DO loops. | Use -inline, -inline_from_files, or -inline_create and -inline_from_libraries. Or, if the goal is to execute the subroutine body concurrently, try -ipa or C*$* assert concurrent call. |
Inline more routines. | Increase -inline_depth and -inline_looplevel. (See also the C*$* inline directive.) |
Turn off directives and assertions. | Use the -nodirectives switch. |
Process a program that uses intentional array bounds violation. | Use C*$* assert bounds violations. |
Use STATIC storage. | Insert SAVE statements or use -save=all. |
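
As an illustration of the "Prevent a given loop from being optimized" row, the following minimal sketch places a serial assertion in front of one loop. The directive spelling is taken from the table. The program name, array names, and sizes are hypothetical, and placing the assertion on the line immediately before the DO statement is an assumption about usual practice rather than something this section states.

```fortran
      PROGRAM KAPDEM
C     Hypothetical demo: names and sizes are illustrative only.
      INTEGER N
      PARAMETER (N = 100)
      REAL A(N), B(N)
      INTEGER I
C
C     Independent iterations: left to KAP's default handling.
      DO 10 I = 1, N
         B(I) = REAL(I)
   10 CONTINUE
C
      A(1) = B(1)
C     Assumption: the assertion applies to the DO loop that follows.
C     The line starts with C in column 1, so a compiler that does
C     not recognize KAP directives reads it as an ordinary comment.
C*$* assert do (serial)
      DO 20 I = 2, N
C        Running sum: each iteration reads the previous element.
         A(I) = A(I-1) + B(I)
   20 CONTINUE
C
      PRINT *, A(N)
      END
```

Because the assertion is written as a fixed-form comment line, the same source should still compile and run where KAP is not used. The -nodirectives switch listed in the table is the way to have KAP itself ignore such lines.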