Advanced Memory Debugger and Memory Leak Detection for C++, C and F90 Applications

February 11, 2016


Advanced Memory Debugger and Memory Leak tool for Linux C++, C and F90

The memory debugger in Arm DDT helps to fix a number of common memory usage errors in C, C++ and Fortran codes on Linux. It goes far beyond what can be observed with command-line debuggers or print statements alone.

Memory debugging is enabled by ticking a checkbox in DDT or adding the "--mem-debug" flag to its command line (an additional linking step is required for Cray XC or XK platforms).

The questions it can answer and problems it can solve include:

How much memory am I using?
Which parts of my code are allocating the most memory?
Are there memory leaks - and where am I failing to deallocate?
Is a pointer being used after it is deallocated, or after its memory has been re-used, and crashing my program?
For a given pointer, is it still valid, where was it allocated and how large is the block of memory?
Is my program deallocating or freeing invalid pointers?
Am I reading or writing beyond the end of an allocation and overwriting memory? If so, where?

Answering these questions resolves many unexplained crashes. Ensuring code is free of the types of issue listed also improves software quality.

The Heap

The region of memory that the memory debugging mode helps with is known as the heap. The heap is the area managed by malloc, free and similar functions in C, the new and delete operators in C++, and the allocate and deallocate statements in F90 and later Fortran standards.

DDT intercepts these functions to provide error detection, to record information and to measure how much memory is being used.
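As an illustration only (this is not how DDT is implemented), the idea of recording information at every allocation call can be sketched in C with a pair of wrapper functions. The names tracked_malloc, tracked_free and tracked_bytes_in_use are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every block. The union keeps the pointer
   handed back to the caller aligned for any object type. */
typedef union {
    size_t size;
    max_align_t align;
} header_t;

static size_t bytes_in_use = 0;

/* Allocate size bytes and add them to the running total. */
void *tracked_malloc(size_t size) {
    header_t *h = malloc(sizeof(header_t) + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    bytes_in_use += size;
    return h + 1; /* the caller's block starts just after the header */
}

/* Release a block from tracked_malloc and subtract it from the total. */
void tracked_free(void *ptr) {
    if (ptr == NULL) {
        return;
    }
    header_t *h = (header_t *)ptr - 1;
    assert(bytes_in_use >= h->size);
    bytes_in_use -= h->size;
    free(h);
}

/* Current number of bytes allocated through tracked_malloc. */
size_t tracked_bytes_in_use(void) {
    return bytes_in_use;
}
```

A real tool intercepts the standard functions themselves, so the application source does not need to change; the wrappers here only show what kind of bookkeeping becomes possible once every allocation passes through one place.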

The level of checking is chosen in the Memory Debugging settings dialog, from basic to full checks. Full checks can slow down codes that perform very large numbers of allocations, whereas basic checks usually have a near-zero time cost.

Memory Usage

The total memory usage is an important number to watch. Allocating too much memory will result in the operating system killing your process.

The running total for a process is shown by the Tools / Overall Memory Stats menu item. When debugging more than one process, all processes (or the top N processes by usage) are shown in this view.

[Image: Overall Memory Stats view]

This view is available whenever Memory Debugging is enabled: if the total creeps up, then this is an indication of a memory leak.

Arm Forge, the tool suite that includes DDT, also has memory profiling in its performance profiler, MAP. MAP graphically profiles the usage of memory as it varies over time - which helps to narrow down when and where memory usage is increasing.

Memory Leak Detection

If memory is being allocated but not deallocated then this eventually leads to memory exhaustion and abrupt termination.

Memory leaks are detected by using the Tools / Current Memory Usage menu item, which is available whenever Memory Debugging is enabled.

For each memory allocation, DDT records the stack trace and the requested allocation size. This allows it to know where in the code allocations are happening, and how much memory each location is using.

The Current Memory Usage dialog uses this information to plot the top calling locations in terms of bytes used. If a leak is happening, it will be apparent in this bar chart. Clicking through elements on the bar chart lists the pointers and selecting a pointer will then show the stack and the size of that allocation.

[Image: Current Memory Usage dialog]
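To make the kind of leak concrete, here is a minimal C sketch. The functions make_label, count_long_labels_leaky and count_long_labels are hypothetical names invented for this example. A tool that records the stack at each allocation would be expected to attribute the growing total to the malloc call in make_label, reached from the leaky loop.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated label that the caller
   is responsible for freeing. */
char *make_label(int id) {
    char *label = malloc(32);
    if (label != NULL) {
        snprintf(label, 32, "record-%d", id);
    }
    return label;
}

/* Leaky version: each iteration allocates 32 bytes that are never
   released, so usage grows with n. */
int count_long_labels_leaky(int n) {
    assert(n >= 0);
    int count = 0;
    for (int i = 0; i < n; i++) {
        char *label = make_label(i);
        if (label != NULL && strlen(label) > 8) {
            count++;
        }
        /* Missing free(label): the only pointer to the block is lost
           when the loop moves on. */
    }
    return count;
}

/* Fixed version: the label is released once it is no longer needed. */
int count_long_labels(int n) {
    assert(n >= 0);
    int count = 0;
    for (int i = 0; i < n; i++) {
        char *label = make_label(i);
        if (label != NULL && strlen(label) > 8) {
            count++;
        }
        free(label); /* free(NULL) is a harmless no-op */
    }
    return count;
}
```

Both versions return the same result, which is why this class of bug tends to go unnoticed until memory runs out.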

Custom Allocators and Class Constructors

Many codes channel allocations through a small number of entry points.

For example, a C++ class's constructor is often used to allocate memory, and the most widely used classes may be constructed throughout a program. In these cases it is more helpful to group allocations by the line of code that called the constructor, so that the calls to the constructor are not lumped together in one amorphous blob.

To group by calls to the constructor, right-click a block in this bar chart to add the function it represents as a "Custom Allocator". Calls to that function will then be grouped separately by call location.
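The same situation arises in C when a code routes its allocations through one wrapper. The sketch below is a hedged C analogue of the constructor case; xalloc, create_mesh and create_index are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Single entry point for allocations. Grouped by the function that
   calls the underlying allocator, every block in the program would be
   attributed to this one function. */
void *xalloc(size_t count, size_t elem_size) {
    assert(elem_size > 0);
    return calloc(count, elem_size); /* zero-initialized block */
}

/* Two unrelated callers. Treating xalloc as a custom allocator would
   let their allocations be reported separately. */
double *create_mesh(size_t n_points) {
    return xalloc(n_points * 3, sizeof(double)); /* x, y, z per point */
}

int *create_index(size_t n_entries) {
    return xalloc(n_entries, sizeof(int));
}
```

With grouping by the caller of xalloc, a leak in the mesh code and a leak in the index code would show up as two distinct locations rather than one combined total.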

Automating Leak Detection and Regression Testing

DDT has a non-interactive memory debugging mode which replicates the interactive information of the Current Memory Usage tool described above. This mode is often used during overnight tests or in continuous integration servers to measure memory usage and automatically ensure that leaks do not enter production code.

This mode creates an HTML file containing the memory allocations that remain after a process (or processes) terminates.

ddt --offline offline-log.html --mem-debug ... application.exe ....

This creates an annotated log file of the non-interactive debugging session which contains a leak report with the top leaks identified along with significant debugging events that are logged throughout the execution of the program.

[Image: offline memory debugging report]

Deallocating Invalid Pointers

Crashes can also occur when a program tries to deallocate memory that has already been deallocated, or passes a bogus address to the deallocator: one that was never allocated, or one that points part way into a legitimate block of heap memory.

This leads to either immediate termination, or heap corruption and crashing at some future point.

With DDT this kind of problem is caught early: deallocating an invalid pointer immediately triggers an error message and stops the process exactly where the error occurs.
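The three kinds of invalid deallocation described above look like this in C. This is a sketch with a hypothetical function name, average; the invalid calls are left as comments so that the code itself stays well-defined.

```c
#include <assert.h>
#include <stdlib.h>

/* Copies n values into a temporary heap buffer and returns their mean. */
double average(const double *src, size_t n) {
    assert(n > 0);
    double *copy = malloc(n * sizeof *copy);
    if (copy == NULL) {
        return 0.0;
    }
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        copy[i] = src[i];
        sum += copy[i];
    }

    /* Invalid alternatives, shown as comments only:
     *   free(copy + 1);  address part way through an allocation
     *   free(&sum);      address that was never heap-allocated
     */
    free(copy);        /* valid: the start of a live allocation */
    /* free(copy);        invalid: already deallocated (double free) */

    return sum / (double)n;
}
```

Any of the commented-out calls may appear to work in one run and corrupt the heap in another, which is why stopping at the faulty call itself is valuable.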

Dangling Pointers

Dangling pointers are pointers whose memory has been freed but which have not been set to null. It is often possible for subsequent code to keep on using the dangling pointer and to keep on getting valid-looking data – until suddenly that memory is repurposed for something else.

This leads to unpredictable behavior, silent corruption and program crashes.
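A minimal C sketch of how a dangling pointer arises, and of the common habit of resetting the pointer on release, is shown below. The series_t type and its functions are hypothetical and not taken from any particular code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    double *values;
    size_t count;
} series_t;

/* Allocates storage for count zero-initialized values.
   Returns 0 on success and -1 on failure. */
int series_init(series_t *s, size_t count) {
    assert(count > 0);
    s->values = calloc(count, sizeof *s->values);
    s->count = (s->values != NULL) ? count : 0;
    return (s->values != NULL) ? 0 : -1;
}

/* Frees the storage. Without the two assignments after free(),
   s->values would keep the old address (a dangling pointer) and later
   reads through it might still return plausible-looking data. */
void series_release(series_t *s) {
    free(s->values);
    s->values = NULL;
    s->count = 0;
}

/* Sums the stored values; a released series sums to zero. */
double series_sum(const series_t *s) {
    double sum = 0.0;
    for (size_t i = 0; i < s->count; i++) {
        sum += s->values[i];
    }
    return sum;
}
```

Resetting the pointer only protects this one copy of the address. Any other copy held elsewhere in the program still dangles, which is where tool support becomes useful.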

To enable detection of dangling pointers, the memory debugging settings must be set to one level higher than Fast. The term "free-protect" appears in the "Enabled Checks" window.

This level of memory debugging is usually all you need to find dangling pointer problems. When a dangling pointer is reused, DDT will stop your program at the exact line of code that reuses it with an error like this one:

[Image: dangling pointer error message]

The debugger will also show which pointer is dangling and exactly where it was originally allocated. Right-click on any pointers or dynamically-allocated arrays and choose "View Pointer Details" from the menu:

[Image: Pointer Details window]

Arm DDT tells us at once that this pointer is dangling (it says it points to an allocation that has already been freed) and shows us the full stack of function calls that led to its allocation.

Checking Pointer Information

As described in the Dangling Pointers section above, the Pointer Details window provides a wealth of information about any pointer:

Whether the pointer is valid, not yet allocated or dangling
What size of allocation is or was made
The exact position and stack in the code where the pointer was allocated – even if it is a dangling pointer that has since been freed
The exact position and stack in the code where the pointer was deallocated, if it is indeed a dangling pointer

In particular, in the image above we can see that it was allocated at hello.c:88 and simply clicking on this jumps us to that location in the source code viewer right away.

It also tells us where this memory was deallocated.

Reading or Writing Beyond Array Bounds or Allocation Ends

Reading a value from before or after the range of an array or other allocation is an error that can be hard to track down.

Much of the time, through good fortune, it goes unnoticed - but it can also cause intermittent crashes, where whether or not a crash happens depends on the tiniest of changes in the runtime environment.

Reading a value means polluting a calculation, as the resulting value or code path now depends on something unreliable and uncertain.
Writing to such a location can cause uncertain behavior in other areas of the code that then re-use this now corrupted location.
Both reading and writing can cause a crash if the address lies outside the pages mapped into the program (a page is usually 4096 bytes, but some systems use much larger pages).

DDT catches these kinds of error - it works with the operating system to create a page after, or before, each allocation and makes it read and write protected. As soon as the protected memory is touched for a read or write, the Linux operating system notifies the debugger. These pages are known as "Guard Pages" - or you may know them as "Red Zones".
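The general guard page technique can be sketched on Linux with mmap and mprotect. This is an assumption-laden illustration, not DDT's implementation: guarded_t, guarded_alloc and guarded_free are hypothetical names, and the sketch places a single protected page after the block.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict compiler modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    void *base;     /* start of the whole mapping */
    size_t map_len; /* length of the mapping, guard page included */
    void *user;     /* block handed to the caller */
} guarded_t;

/* Maps enough pages for size bytes plus one trailing guard page.
   The block is placed so that its last byte sits directly before the
   guard page. Note that the returned pointer is only as aligned as
   size allows. Returns 0 on success and -1 on failure. */
int guarded_alloc(size_t size, guarded_t *out) {
    assert(size > 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = (size + page - 1) / page;
    size_t map_len = (data_pages + 1) * page;

    char *base = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return -1;
    }
    char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) { /* no read, no write */
        munmap(base, map_len);
        return -1;
    }
    out->base = base;
    out->map_len = map_len;
    out->user = guard - size;
    return 0;
}

/* Unmaps the block together with its guard page. */
void guarded_free(guarded_t *g) {
    munmap(g->base, g->map_len);
}
```

With this layout, touching the first byte past the end of the block would fault immediately, and a debugger attached to the process would be notified at that instruction. The cost is at least two pages per allocation, however small the request.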

For codes with a relatively small number of large allocations, like most scientific codes and most F90 codes, the number of pages used as guard pages is small.

C++ codes often make very large numbers of small allocations - and can exhaust process limits. For such codes DDT offers an alternative setting known as "fence checking" or "fence painting". This periodically validates a few bytes above and below an allocation to check for unexpected writes. Note that this mode only checks writes; it cannot detect erroneous reads - and hence Guard Pages mode is generally preferred where a code allows it.
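The fence idea can be sketched in C as follows. This is a generic illustration of the technique rather than DDT's own mechanism; fenced_malloc, fence_intact and fenced_free are hypothetical names, and the fence width and pattern are arbitrary choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Fence width: one maximally aligned unit, so the caller's block keeps
   the alignment that malloc provides. */
#define FENCE_BYTES sizeof(max_align_t)
#define FENCE_PATTERN 0xA5

/* Allocates size bytes with a painted fence below and above the block. */
void *fenced_malloc(size_t size) {
    unsigned char *raw = malloc(size + 2 * FENCE_BYTES);
    if (raw == NULL) {
        return NULL;
    }
    memset(raw, FENCE_PATTERN, FENCE_BYTES);
    memset(raw + FENCE_BYTES + size, FENCE_PATTERN, FENCE_BYTES);
    return raw + FENCE_BYTES;
}

/* Returns 1 if both fences still hold the pattern, 0 if something has
   written over either of them. size must match the original request. */
int fence_intact(const void *user, size_t size) {
    const unsigned char *raw = (const unsigned char *)user - FENCE_BYTES;
    for (size_t i = 0; i < FENCE_BYTES; i++) {
        if (raw[i] != FENCE_PATTERN ||
            raw[FENCE_BYTES + size + i] != FENCE_PATTERN) {
            return 0;
        }
    }
    return 1;
}

/* Releases a block obtained from fenced_malloc. */
void fenced_free(void *user) {
    if (user != NULL) {
        free((unsigned char *)user - FENCE_BYTES);
    }
}
```

Because the check only inspects the fence bytes after the fact, a stray read leaves no trace, and a stray write is only noticed the next time the fence is validated, not at the moment it happens.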
