For Fortran and F90, debugging is, as in every language, inevitable. We look at debugging tips for Fortran and F90 developers to show why and how to use a debugger for some typical bugs.
The F90 and Fortran write (or print) statement for debugging is wired into the brain of many developers, but it just doesn’t do the job. Using write or print is iterative: you guess where the problem is, add a statement, recompile, rerun, read the output, and then repeat with a better guess.
This is time consuming, and it's hard to understand why it still gets used so often.
With a debugger, finding the line of a Fortran or F90 crash is almost instantaneous. All debuggers save time, but my favorite debugger is Arm DDT, of course.
The debugger will run until the program crashes. At that point you see the line of the crash, with the source code alongside it.
With a debugger, the variables relevant to the current location are visible when the process crashes. You can look at the calling frames in the stack too. That lets you answer questions like: why did the program call this function, how did it get here, and what are the values of the variables in each stack frame? DDT is a graphical debugger, which makes it much easier to see all that information than in a command line debugger.
In contrast, with the write statement you first have to decide where to put it, and even then you have a lot more work to do if you want to know why the program crashed.
We all have a horror story about a legacy code. The source file as long as a novel. The code with no comments.
The one true way to know what code is doing is to watch it as it progresses.
Debuggers let you step line by line through interesting bits of the Fortran or F90, step into a function or subroutine, or just set a breakpoint and let the code run until it reaches a function or line that you care about. You now know for sure how your program is getting executed.
And whilst you’re doing that, you still see the variables that explain why things are happening, whether scalar integers and floating point values or large arrays, and you can stop as soon as something untoward happens. Again, not something your write statement is set up to do.
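As a small target for trying this out, here is a minimal sketch of a program worth stepping through. The program name, the `advance` subroutine and the threshold of 1000 are all hypothetical and chosen only for illustration. The idea is to set a breakpoint on the call, step into the subroutine, and watch `x` change on each pass.

```fortran
program step_through
  implicit none
  integer :: step, first_large
  real :: x

  x = 1.0
  first_large = 0
  do step = 1, 50
    call advance(x)   ! a debugger can step into this call
    ! Most debuggers can break on a condition such as "x > 1000.0".
    ! Without a debugger, the check has to be coded by hand like this:
    if (first_large == 0 .and. x > 1000.0) first_large = step
  end do

  print *, 'x first exceeded 1000 at step', first_large

contains

  ! Hypothetical update rule: doubles the value on every call.
  subroutine advance(v)
    real, intent(inout) :: v
    v = 2.0 * v
  end subroutine advance

end program step_through
```

Because `x` doubles on every call, it first exceeds 1000 at step 10 (2**10 = 1024). In a debugger you would see that happen without adding the hand-written check or recompiling.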
Inconsistent results and intermittent crashes often arise from accessing memory locations that either were not initialized, or worse, are fatal to touch.
When the i’th element is calculated from the (i-1)th or the (i+1)th element, be careful with the first and last iterations. Do not try to access elements outside of the array bounds.
Take a typical Fortran stencil code, or more precisely some F90 here. The code looks sensible but it's flawed.
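foo.f90:

    allocate(b(1000,1000))
    ...
    ...
    do i = 1, 1000
      do j = 1, 1000
        ...
        a(j, i) = ( b(j - 1, i) + b(j + 1, i) + b(j, i - 1) + b(j, i + 1) + b(j, i) ) * z / 5

It reads both before the start and past the end of the allocation. When j or i is 1, b(j - 1, i) and b(j, i - 1) use index 0. When j or i is 1000, b(j + 1, i) and b(j, i + 1) use index 1001. Neither index exists in b, so the values read are whatever happens to be in neighboring memory, or the read crashes outright.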
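One common way to remove the out-of-range reads is to update interior points only. The sketch below assumes that leaving the boundary rows and columns untouched is acceptable; the original code does not say how boundaries should be treated. The initial values of `b` and `z` are made up so that the result can be checked.

```fortran
program stencil_interior
  implicit none
  integer, parameter :: n = 1000
  real, allocatable :: a(:,:), b(:,:)
  real :: z
  integer :: i, j

  allocate(a(n,n), b(n,n))
  a = 0.0
  b = 1.0   ! illustrative data, not from the original code
  z = 2.0   ! illustrative scale factor

  ! Visit interior points only, so j-1, j+1, i-1 and i+1 stay within 1..n.
  do i = 2, n - 1
    do j = 2, n - 1
      a(j, i) = ( b(j - 1, i) + b(j + 1, i) + b(j, i - 1) + b(j, i + 1) + b(j, i) ) * z / 5
    end do
  end do

  ! With b = 1 everywhere, each interior value is (5 * 1.0) * 2.0 / 5 = 2.0.
  if (any(abs(a(2:n-1, 2:n-1) - 2.0) > 1.0e-6)) error stop 'unexpected interior value'
  print *, 'interior ok; boundary value left at', a(1, 1)
end program stencil_interior
```

If the boundary values do matter, the alternative is to allocate `b` with a halo, for example `allocate(b(0:n+1, 0:n+1))`, and fill the halo before the loop.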
Arm DDT shows the size of an array, which is helpful for knowing which indexes are and are not in range. More powerfully, DDT automatically detects these errors for allocatable arrays, for both reads and writes. It is faster than typical compiler-implemented bounds protection, and all that’s needed is to tick a box to enable memory debugging in the DDT user interface.
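The extent information a debugger displays is also available in the language itself through the `shape`, `lbound` and `ubound` intrinsics. The sketch below prints them for the array from the stencil example and uses a hypothetical helper, `in_range`, to test whether an index pair is valid. It illustrates the check only; it is not how DDT detects the error.

```fortran
program show_extents
  implicit none
  real, allocatable :: b(:,:)
  integer :: i, j

  allocate(b(1000,1000))
  print *, 'shape :', shape(b)
  print *, 'lower :', lbound(b)
  print *, 'upper :', ubound(b)

  ! First iteration of the stencil loop: j - 1 is 0, which is out of range.
  i = 1
  j = 1
  print *, 'b(j-1, i) in range? ', in_range(b, j - 1, i)   ! false
  print *, 'b(j,   i) in range? ', in_range(b, j, i)       ! true

contains

  ! Hypothetical helper. Note that an assumed-shape dummy always has lower
  ! bound 1, so this only matches the caller's indexing for arrays that were
  ! allocated with the default lower bound, as b is here.
  pure logical function in_range(arr, row, col)
    real, intent(in) :: arr(:,:)
    integer, intent(in) :: row, col
    in_range = row >= lbound(arr, 1) .and. row <= ubound(arr, 1) .and. &
               col >= lbound(arr, 2) .and. col <= ubound(arr, 2)
  end function in_range

end program show_extents
```

Compiler bounds protection does the equivalent check on every access. With gfortran, for example, it is enabled with `-fcheck=bounds`.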
This video shows us solving this kind of problem using DDT.
Ever had a code that only works when using Fortran compiler A and fails with compiler B? Or that crashes when you add (or remove) something completely harmless like a simple write statement?
Often it is a bug that becomes visible because two compilers place variables in a different order in memory: the bug exists already, and the compiler does not create it. We saw that reading the wrong array indexes has consequences, but worse things happen when writing to elements beyond the end of arrays, because those writes can overwrite other good values. As two compilers can choose different orders and layouts in memory, one compilation may create a layout where a variable is particularly vulnerable to a stray write, whereas another compiler's layout may be lucky and unaffected.
You can use DDT’s memory debugging if working with allocatable arrays to protect against stray writes. However, for the more general case, debuggers have neat support for “hardware watchpoints”. These let you track when a change happens to a given memory location, instantly. This uses a hardware feature present in most modern processors that allows a small handful of memory locations to be watched. On change, the processor instantly alerts the operating system.
To add a watchpoint in DDT, just select a variable or memory location and right-click to select "Add Watchpoint".
The magic then happens. The process runs without interruption until the change.
Here, we can see that c(1,10) is about to be changed - at line 80 of serial.f90 - and we even see the old and new values.
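To see why a watchpoint helps, consider a case where the variable's name never appears at the place that changes it. The sketch below is a hypothetical example, not the serial.f90 from the screenshot: a pointer aliases one row of `c`, and a subroutine that only knows about its dummy argument ends up modifying `c(1,10)`.

```fortran
program watch_alias
  implicit none
  real, target  :: c(10,10)
  real, pointer :: row(:)

  c = 0.0
  row => c(1, :)    ! row(k) is the same storage as c(1, k)
  call fill(row)    ! never mentions c, yet changes c(1,10)

  print *, 'c(1,10) is now', c(1, 10)

contains

  ! Hypothetical routine that writes to whatever array it is handed.
  subroutine fill(v)
    real, intent(inout) :: v(:)
    integer :: k
    do k = 1, size(v)
      v(k) = real(k)   ! at k = 10 this writes to c(1,10)
    end do
  end subroutine fill

end program watch_alias
```

Searching the source for assignments to `c` would not find this write. A watchpoint on `c(1,10)` would stop inside `fill` when `k` reaches 10, as the value changes from 0.0 to 10.0.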
Debugging is rarely fun - so it's important to take advantage of all the magic tools at your disposal! We've seen four good reasons to use a debugger to do things that cannot be done without one: discovering the exact crash location and context, inspection of actual executed code paths, detection of stray memory accesses and detection of when variables change.
Read more about Arm DDT: https://www.arm.com/products/development-tools/hpc-tools/cross-platform/forge/ddt