ARMPL FFT: ASAN reports alloc-dealloc-mismatch (operator new vs free) on Linux

Hi all,

Running ASAN on an internal library that uses ARMPL's FFT functionality reports the following issue. Here's the relevant portion of the ASAN trace:

=================================================================
==933787==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new vs free) on 0xff9619457380
#0 0xff96614af0d8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
#1 0xff964acfb38c in std::array<sloejit::padded<std::vector<unsigned char, std::allocator<unsigned char> > >, 4ul>::~array() (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x817b38c)
#2 0xff964ad16dec in sloejit::block::iterate_input_output_set(sloejit::arch_traits const*, sloejit::function_options_t) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x8196dec)
#3 0xff964ad21108 in sloejit::function::finalise(sloejit::stack_frame_info const*) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x81a1108)
#4 0xff964ad0e000 in sloejit::function::emit_bin(std::vector<sloejit::reloc_info, std::allocator<sloejit::reloc_info> >*, std::vector<sloejit::note_info, std::allocator<sloejit::note_info> >*, sloejit::stack_frame_info const*) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x818e000)
#5 0xff964abd7414 in arm::fft1d::wfta::print_common(arm::fft1d::wfta::kernel_registry_entry<void>*, std::__cxx11::list<arm::fft1d::wfta::expr, std::allocator<arm::fft1d::wfta::expr> >, arm::fft1d::wfta::options_t const&, long, arm::fft1d::wfta::io_pointers const&, std::vector<long, std::allocator<long> > const&, std::vector<long, std::allocator<long> > const&, char, arm::fft1d::wfta::direction_kind, arm::fft1d::wfta::rtype, arm::fft1d::wfta::rtype, arm::fft1d::wfta::rtype, arm::fft1d::wfta::order_kind, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, arm::fft1d::wfta::out_mods, arm::fft1d::wfta::in_mods, arm::fft1d::wfta::dist_types, std::optional<int>) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x8057414)
#6 0xff964abdb230 in arm::fft1d::wfta::kernel_data arm::fft1d::wfta::print_algo<std::complex<float>, std::complex<float>, std::complex<float> >(arm::fft1d::wfta::kernel_registry_entry<void>*, std::__cxx11::list<arm::fft1d::wfta::expr, std::allocator<arm::fft1d::wfta::expr> >, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, arm::fft1d::wfta::io_pointers const&, std::vector<long, std::allocator<long> > const&, std::vector<long, std::allocator<long> > const&, char, arm::fft1d::wfta::direction_kind, arm::fft1d::wfta::order_kind, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, arm::fft1d::wfta::options_t const&, arm::fft1d::wfta::dist_types, std::optional<int>, arm::fft1d::wfta::out_mods, arm::fft1d::wfta::in_mods) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x805b230)
#7 0xff964abbc88c in void arm::fft1d::wfta::kernel_printer<std::complex<float>, std::complex<float>, std::complex<float> >::print_algo<void (std::complex<float> const*, std::complex<float>*, long, long, long, long, long)>(arm::fft1d::wfta::kernel_registry_entry<void (std::complex<float> const*, std::complex<float>*, long, long, long, long, long)>*, char, arm::fft1d::wfta::order_kind, arm::fft1d::wfta::out_mods, arm::fft1d::wfta::in_mods) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x803c88c)
#8 0xff964abbea18 in std::optional<arm::fft1d::kernel_data<std::complex<float>, std::complex<float>, void> > arm::fft1d::get_kernel_data<std::complex<float>, std::complex<float> >(long, long, long, long, long, arm::fft1d::direction, int, want_kernel_type) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x803ea18)
#9 0xff964ab689a8 in std::optional<arm::fft1d::level_data_info> arm::fft1d::make_level_data<(arm::fft1d::level_type)2, std::complex<float>, std::complex<float> >(arm::fft1d::unique_ptr<arm::fft1d::level_data_base<std::complex<float>, std::complex<float> > >*, long, long, long, long, long, long, long, long, long, arm::fft1d::direction, double, double, bool, bool, bool, bool) [clone .constprop.0] (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x7fe89a8)
#10 0xff964ab6b9a0 in std::pair<bool, std::optional<arm::fft1d::composition<std::complex<float>, std::complex<float> > > > arm::fft1d::composite_init_from_factors<std::complex<float>, std::complex<float> >(long, long, long, long, long, long, arm::fft1d::direction, arm::fft1d::pod_vector<long, arm::fft1d::reallocator> const&, double, double, bool, bool) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x7feb9a0)
#11 0xff964ab6bc70 in bool arm::fft1d::audition_perms<std::complex<float>, std::complex<float> >(arm::fft1d::pod_vector<long, arm::fft1d::reallocator>, long, std::complex<float> const*, std::complex<float>*, long, long, long, long, long, arm::fft1d::direction, double, double, arm::fft1d::composition<std::complex<float>, std::complex<float> >&, bool, bool) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x7febc70)
#12 0xff964ab6c414 in std::pair<bool, arm::fft1d::composition<std::complex<float>, std::complex<float> > > arm::fft1d::composite_init<std::complex<float>, std::complex<float> >(long, long, long, long, long, long, arm::fft1d::direction, double, double, bool) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x7fec414)
#13 0xff964ab3a190 in arm::fft1d::unique_ptr<arm::fft1d::fft_detail_plan> arm::fft1d::make_1d_plan<std::complex<float>, std::complex<float> >(long, std::complex<float> const*, std::complex<float>*, long, long, long, long, long, int, arm::fft1d::r2r_variant, bool, double, double) (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x7fba190)
#14 0xff964b88b3d4 in fftwf_plan_dft_1d (/opt/armpl/armpl_25.07.1_gcc/lib/libarmpl.so+0x8d0b3d4)
0xff9619457380 is located 0 bytes inside of 573-byte region [0xff9619457380,0xff96194575bd)

System info: Ubuntu 22.04, aarch64, Linux kernel v6.8.0-83-generic

Tested on ARMPL versions: 24.10, 25.04, 25.07, 25.07.1

I couldn't find this reported before. Has anyone seen this? Let me know if you need more information.

Thanks,

Andrew

  • Hi Andrew,

    Many thanks for reporting this. We have received a report of a similar error from another user, and this may be related. To help us investigate it further, please could you provide some additional information if possible:

    1) The ASAN trace should also include output describing where the relevant memory was allocated. It usually starts with something like "allocated by thread T0 here:". If you have this output, please could you provide it as well?

    2) Do you have a reproducer that demonstrates this behaviour? For example, is there a call to fftwf_plan_dft_1d() with certain parameters that causes this error to happen?

    Regards,

    Nick

  • Hi Nick, 

    Thanks for the response, and sorry for the delay; I was sick all weekend. Here is more information from the ASAN trace:

    0xff37f4c7c880 is located 0 bytes inside of 573-byte region [0xff37f4c7c880,0xff37f4c7cabd)
    allocated by thread T172 here:
        #0 0xff385bce0e0c in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
        #1 0xff3855df3a34 in std::vector<unsigned char, std::allocator<unsigned char> >::vector(std::vector<unsigned char, std::allocator<unsigned char> > const&) 
        #2 0xff383e6d6548 in sloejit::block::iterate_input_output_set(sloejit::arch_traits const*, sloejit::function_options_t) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0x1316548)
        #3 0xff383e6de04c in sloejit::function::finalise(sloejit::stack_frame_info const*) [clone .part.0] (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0x131e04c)
        #4 0xff383e6e10e8 in sloejit::function::emit_bin(std::vector<sloejit::reloc_info, std::allocator<sloejit::reloc_info> >*, sloejit::stack_frame_info const*) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0x13210e8)
        #5 0xff383e4eb5bc in armpl::wfta::print_common(armpl::wfta::kernel_registry_entry<void>*, std::__cxx11::list<armpl::wfta::expr, std::allocator<armpl::wfta::expr> >, armpl::wfta::target const&, long, armpl::wfta::io_pointers const&, std::vector<long, std::allocator<long> > const&, std::vector<long, std::allocator<long> > const&, char, armpl::wfta::direction_kind, armpl::wfta::rtype, armpl::wfta::rtype, armpl::wfta::rtype, armpl::wfta::order_kind, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, armpl::wfta::out_mods, armpl::wfta::in_mods, armpl::wfta::dist_types, std::optional<int>) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0x112b5bc)
        #6 0xff383e4ef310 in armpl::wfta::kernel_data armpl::wfta::print_algo<std::complex<float>, std::complex<float>, std::complex<float> >(armpl::wfta::kernel_registry_entry<void>*, std::__cxx11::list<armpl::wfta::expr, std::allocator<armpl::wfta::expr> >, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, armpl::wfta::io_pointers const&, std::vector<long, std::allocator<long> > const&, std::vector<long, std::allocator<long> > const&, char, armpl::wfta::direction_kind, armpl::wfta::order_kind, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, armpl::wfta::options_t const&, armpl::wfta::dist_types, std::optional<int>, armpl::wfta::out_mods, armpl::wfta::in_mods) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0x112f310)
        #7 0xff383e49a4ec in void armpl::wfta::kernel_printer<std::complex<float>, std::complex<float>, std::complex<float> >::print_algo<void (std::complex<float> const*, std::complex<float>*, long, long, long, long, long)>(armpl::wfta::kernel_registry_entry<void (std::complex<float> const*, std::complex<float>*, long, long, long, long, long)>*, char, armpl::wfta::order_kind, armpl::wfta::out_mods, armpl::wfta::in_mods) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0x10da4ec)
        #8 0xff383e49cb4c in std::optional<armpl::fft::kernel_data<std::complex<float>, std::complex<float>, void> > armpl::fft::get_kernel_data<std::complex<float>, std::complex<float> >(long, long, long, arm::fft1d::direction, int, want_kernel_type) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0x10dcb4c)
        #9 0xff383e24ef94 in std::optional<armpl::fft::level_data_info> armpl::fft::make_level_data<(armpl::fft::level_type)2, std::complex<float>, std::complex<float> >(std::unique_ptr<armpl::fft::level_data_base<std::complex<float>, std::complex<float> >, std::default_delete<armpl::fft::level_data_base<std::complex<float>, std::complex<float> > > >*, long, long, long, long, long, long, long, long, long, arm::fft1d::direction, double, double, bool, bool, bool, bool) [clone .constprop.0] (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0xe8ef94)
        #10 0xff383e2507ec in std::pair<bool, std::optional<armpl::fft::composition<std::complex<float>, std::complex<float> > > > composite_init_from_factors<std::complex<float>, std::complex<float> >(long, long, long, long, long, long, arm::fft1d::direction, std::vector<long, std::allocator<long> > const&, double, double, bool, bool) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0xe907ec)
        #11 0xff383e2519dc in bool audition_perms<std::complex<float>, std::complex<float> >(std::vector<long, std::allocator<long> >, long, std::complex<float> const*, std::complex<float>*, long, long, long, long, long, arm::fft1d::direction, double, double, armpl::fft::composition<std::complex<float>, std::complex<float> >&, bool, bool) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0xe919dc)
        #12 0xff383e252944 in std::pair<bool, armpl::fft::composition<std::complex<float>, std::complex<float> > > armpl::fft::composite_init<std::complex<float>, std::complex<float> >(long, long, long, long, long, long, arm::fft1d::direction, double, double, bool) (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0xe92944)
        #13 0xff383e35d604 in fftwf_plan_dft_1d (/opt/armpl/armpl_25.04_gcc/lib/libarmpl.so+0xf9d604)

    While I can't provide direct code to demonstrate this, here are the parameters we use. The call to `fftwf_plan_dft_1d()` appears to be normal. It takes an `int` as `nfft`, two `fftwf_complex*` as the `in` and `out` parameters, `FFTW_FORWARD` as the sign, and `FFTW_ESTIMATE | FFTW_UNALIGNED` as the flags.
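
    For illustration only, a call with these parameters might look like the C sketch below. It is not the original code: the helper name `plan_like_report`, the plain `malloc` buffers, and whatever `nfft` value is passed in are assumptions, and it has not been confirmed that this exact sketch triggers the ASAN report. The buffers use `fftwf_complex`, the element type of the single-precision `fftwf_` interface.

    ```c
    #include <stdlib.h>
    #include <fftw3.h>

    /* Hypothetical helper (not from the original code base): creates and
     * destroys one single-precision 1-D complex plan using the sign and
     * flags described above. Returns 0 if a plan was created, -1 otherwise. */
    int plan_like_report(int nfft)
    {
        /* Plain malloc buffers are an assumption; FFTW_UNALIGNED tells the
         * planner not to rely on any special alignment of in/out. */
        fftwf_complex *in  = malloc(sizeof(fftwf_complex) * (size_t)nfft);
        fftwf_complex *out = malloc(sizeof(fftwf_complex) * (size_t)nfft);
        if (in == NULL || out == NULL) {
            free(in);
            free(out);
            return -1;
        }

        /* The call at the bottom of the ASAN stack trace. */
        fftwf_plan plan = fftwf_plan_dft_1d(nfft, in, out, FFTW_FORWARD,
                                            FFTW_ESTIMATE | FFTW_UNALIGNED);
        int status = (plan != NULL) ? 0 : -1;

        if (plan != NULL) {
            fftwf_destroy_plan(plan);
        }
        free(in);
        free(out);
        return status;
    }
    ```

    Calling `plan_like_report` from a small `main`, building with `-fsanitize=address`, and linking against ArmPL would exercise the same `fftwf_plan_dft_1d` entry point that appears in the trace.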

    Thanks,

    Andrew

  • Hi Andrew,

    Many thanks -- that information has enabled us to reproduce your issue and we have confirmed that it is the same as one reported by another user. We plan to provide a fix in a future release of ArmPL.

    Regards,

    Nick

  • Thanks, Nick. Based on your previous release schedule, is it safe to assume that the next official release will be around mid-2026?

  • Hi Andrew,

    We are currently planning to make our next main release in January 2026.

    Regards,

    Nick