How can I analyze peak memory usage and MCU MIPS when running machine learning inference on a Fixed Virtual Platform (FVP) in Arm DS, for example a Cortex-M7 model?
Could I use a tool such as Streamline or Graphics Analyzer?
There is a bare-metal mode for Streamline that you could use: https://developer.arm.com/documentation/101815/0709/Profiling-with-the-bare-metal-agent
Alternatively, initialize the processor's DWT cycle counter (CYCCNT) register before the function of interest and read it again afterwards, as in the code example below. Note that cycle counts are approximate with Arm Fast Models (which the FVP is built from), so do not rely on the model for exact results.
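#include <stdint.h>

/* Debug Exception and Monitor Control Register; TRCENA enables the DWT unit */
#define CM_DEMCR (*((volatile uint32_t*)0xE000EDFC))
#define CM_TRCENA_BIT (1UL<<24)
/* DWT control register; CYCCNTENA enables the cycle counter */
#define CM_DWT_CONTROL (*((volatile uint32_t*)0xE0001000))
#define CM_DWT_CYCCNTENA_BIT (1UL<<0)
/* DWT cycle count register */
#define CM_DWT_CYCCNT (*((volatile uint32_t*)0xE0001004))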
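CM_DEMCR |= CM_TRCENA_BIT;                /* enable the DWT unit */
CM_DWT_CONTROL |= CM_DWT_CYCCNTENA_BIT;   /* start the cycle counter */
CM_DWT_CYCCNT = 0;                        /* reset the count */
/* ... call the function of interest here ... */
CM_DWT_CONTROL &= ~CM_DWT_CYCCNTENA_BIT;  /* stop the cycle counter */
uint32_t cycles = CM_DWT_CYCCNT;          /* read the elapsed cycle count */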
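If you would rather sample CYCCNT twice than reset it, or you want a rough execution time from the count, something like the following sketch may help. The helper names cycles_elapsed and cycles_to_us are hypothetical (they are not part of any Arm library), and the core clock frequency is an assumption you have to supply yourself, since the FVP does not model a real clock tree.

```c
#include <stdint.h>

/* Hypothetical helper: elapsed cycles between two CYCCNT samples.
 * CYCCNT is a 32-bit counter, so unsigned subtraction modulo 2^32
 * still gives the right answer if the counter wrapped once. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/* Hypothetical helper: convert a cycle count to microseconds for an
 * ASSUMED core clock frequency in Hz (must be non-zero). */
static uint64_t cycles_to_us(uint32_t cycles, uint32_t core_hz)
{
    return ((uint64_t)cycles * 1000000ULL) / core_hz;
}
```

On the target you would pass in two reads of CM_DWT_CYCCNT taken before and after the inference call. The same caveat applies as above: on an FVP the count is approximate, so treat the derived time as an estimate only.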