Problem with Compiling Pure C with Larger Arrays in Keil (goal: compiling CNN)


* CONTEXT: Hi! I wrote a CNN inference model (MobileNetV3-Small) in bare-metal C and verified its correctness (the outputs match PyTorch). I did this on my local machine in Visual Studio Code.
I am now trying to simulate this C program on an ARM Cortex-M4 with FPU, using Keil uVision to compile it. This is part of my thesis project, in which I will characterize performance and energy before and after adding a specialized custom hardware unit.

I am using my group's RTL simulation infrastructure to run the compiled program (obtained from Keil) on the ARM Cortex-M4 core and peripherals, and to view the cycle-by-cycle information in a waveform viewer for validation and debugging.

* PROBLEM: I am still learning Keil and the proper way to write code for embedded systems, and I have been having a lot of trouble running even a simple convolution with larger inputs. Whenever the input arrays get even a little large, the SRAM data inputs (as seen on the waveform) become undefined or stop updating, and the code runs into memory faults. The point where the program gets stuck in the memory-fault instructions can also be seen in the disassembly. Below, I show the problem with a simple array example, as well as a snippet of the actual code I am aiming to run (1 layer of the network). I would really appreciate any guidance on coding guidelines or environment setup I should pay attention to in order to get my target layer working. Thank you so much in advance!
*EXAMPLES:
*Project config, various settings:
*Simple array eg:
The code below works for in and out sizes of less than 30. Once the arrays have size 30, the mem write data becomes undefined.
With size 5, SRAM_DIN looks good (defined values).
With size 30 it breaks: see the red undefined xxx data being written to SRAM_DIN.
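
For readers without the screenshots, here is a minimal sketch of the kind of array test described above. It is not the exact code from the post: the names ARRAY_LEN, in_buf, out_buf, fill_and_scale, and run_simple_test are hypothetical, and the scale-by-2 operation is just a stand-in for the real computation. The arrays are kept at file scope, so their placement is decided by the linker settings and not by the configured stack size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size: the post reports that sizes below 30 work and 30 fails. */
#define ARRAY_LEN 30

/* File-scope arrays live in .bss/.data (placed by the linker), not on the stack. */
static float in_buf[ARRAY_LEN];
static float out_buf[ARRAY_LEN];

/* Fill `in` with 0, 1, 2, ... and write each value times `scale` to `out`. */
void fill_and_scale(float *in, float *out, size_t n, float scale)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        in[i] = (float)i;
        out[i] = in[i] * scale;
    }
}

/* Run the test on the static buffers and return the sum of the outputs,
   which gives one value to compare between the host and the simulation. */
float run_simple_test(void)
{
    float sum = 0.0f;
    fill_and_scale(in_buf, out_buf, ARRAY_LEN, 2.0f);
    for (size_t i = 0; i < ARRAY_LEN; ++i) {
        sum += out_buf[i];
    }
    return sum;
}
```

Calling run_simple_test() from main and changing ARRAY_LEN between 5 and 30 should be enough to compare the two waveform cases with the same source file.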


*Eg target layer from the network: 1 Conv layer:
- Note: header file "bneck_config.h" contains definitions:
float ifmap_buf[3072] = {0, 1, 2,..,3071};
const float conv0_kernels [432] = { 5.5238e-01,2.0333e-01, ..};
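
For scale, ifmap_buf takes 3072 * 4 = 12288 bytes and conv0_kernels takes 432 * 4 = 1728 bytes, before counting any output buffer. Below is a minimal sketch of a direct convolution over buffers like these. The shapes are assumptions that are not stated above: 3072 is read as a 3x32x32 input in channel-major (CHW) order and 432 as 16 filters of 3x3x3, with stride 2 and padding 1. The names conv_out_dim and conv2d_chw are hypothetical, and bias and activation are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Output size along one dimension for a given kernel size, stride, and padding. */
int conv_out_dim(int in_dim, int k, int stride, int pad)
{
    assert(stride > 0);
    return (in_dim + 2 * pad - k) / stride + 1;
}

/* Direct 2D convolution with zero padding.
   Assumed layouts: in [in_c][in_h][in_w], w [out_c][in_c][k][k],
   out [out_c][out_h][out_w], all stored as flat float arrays. */
void conv2d_chw(const float *in, const float *w, float *out,
                int in_c, int in_h, int in_w,
                int out_c, int k, int stride, int pad)
{
    assert(in != NULL && w != NULL && out != NULL);
    int out_h = conv_out_dim(in_h, k, stride, pad);
    int out_w = conv_out_dim(in_w, k, stride, pad);

    for (int oc = 0; oc < out_c; ++oc) {
        for (int oy = 0; oy < out_h; ++oy) {
            for (int ox = 0; ox < out_w; ++ox) {
                float acc = 0.0f;
                for (int ic = 0; ic < in_c; ++ic) {
                    for (int ky = 0; ky < k; ++ky) {
                        for (int kx = 0; kx < k; ++kx) {
                            int iy = oy * stride + ky - pad;
                            int ix = ox * stride + kx - pad;
                            /* Positions outside the input count as zero. */
                            if (iy < 0 || iy >= in_h || ix < 0 || ix >= in_w) {
                                continue;
                            }
                            acc += in[(ic * in_h + iy) * in_w + ix]
                                 * w[((oc * in_c + ic) * k + ky) * k + kx];
                        }
                    }
                }
                out[(oc * out_h + oy) * out_w + ox] = acc;
            }
        }
    }
}
```

Under those assumed shapes the call would be conv2d_chw(ifmap_buf, conv0_kernels, ofmap_buf, 3, 32, 32, 16, 3, 2, 1), where ofmap_buf is a hypothetical output buffer of 16 * 16 * 16 = 4096 floats (16384 bytes). Together with the input and kernels that is 30400 bytes of data, a figure worth comparing against the RAM size configured for the target.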
