I want to protect myself from stack overflow. From several articles I got the idea that I can place the stack at the bottom of RAM, before the .bss section. Since the stack grows down on Cortex-M, on stack overflow my code will attempt to write to non-existent memory and I'll get an exception. And an exception is much better than quiet corruption of static data.
I used a project for stm32f10x, so 0x08000000 is the beginning of the flash and 0x20000000 is the beginning of RAM.
Using www.keil.com/.../armlink_pge1362065977713.htm I was able to come up with a scatter file like this one (based on the generated file):
LR_IROM1 0x08000000 0x00020000 {    ; load region size_region
  ER_IROM1 0x08000000 0x00020000 {  ; load address = execution address
    *.o (RESET, +First)
    *(InRoot$$Sections)
    .ANY (+RO)
  }
  ARM_LIB_STACK 0x20000000 EMPTY 0x400 ; Stack region growing down
  { }
  RW_IRAM1 0x20000408 0x00005000-0x408 { ; RW data
    .ANY (+RW +ZI)
  }
}
Here I create the stack area of size 0x400 and the rest of the RAM is given to IRAM1 section. So far so good, it seems to work. However, there are three things that puzzle me:
1) When I tried to do "RW_IRAM1 0x20000400 0x00005000-0x400", I got a linker error about overlapping regions ARM_LIB_STACK and RW_IRAM1 (although I can't reproduce it right now, which is even more odd).
2) I presumed that this way the beginning of the stack would be exactly at the end of the stack region. However, when I look at the first entry of the vector table, I see a value of 0x20000410 there. And this value seems to change not according to the size of the stack region but according to the beginning of the RW_IRAM1 region.
3) I had to edit my startup.s file and set Stack_Size equal to 0; otherwise the initial stack pointer value in the vector table was the sum of 0x20000410 and Stack_Size.
So my question is: am I on the right track? Is this the correct way to split the .bss and stack regions? What is causing this discrepancy between the stack region and the initial stack pointer value in the vector table? My test global variable is placed at the beginning of IRAM1 and it is initialized correctly, so can I pretend that everything is just fine?
>You can add guard variables to protect stack frames:
And I will have to check them periodically. How exactly is this better than my suggestion?
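For comparison, here is roughly what the guard approach with a periodic check tends to look like. This is only a sketch: the names `STACK_FILL`, `stack_paint` and `stack_free_words` are made up for illustration, and it assumes the stack region is handed in as a plain array of words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* hypothetical guard pattern */

/* Fill the whole stack region with the guard pattern (done once, early). */
static void stack_paint(uint32_t *base, size_t words)
{
    assert(base != NULL);
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count untouched words from the lowest address upward.
   The stack grows down, so the words nearest `base` are overwritten last.
   A result of 0 means the stack reached (or passed) its lower limit. */
static size_t stack_free_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

The check only reports an overflow after it has happened, and only if something calls `stack_free_words` in time, which is the weakness compared with a fault raised at the moment of the bad write.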
>On top of that, you could use the Memory Protection Unit MPU, to prevent access to the area below the stack, so you will hit a memory fault if you go too far.
Unfortunately, I don't have an MPU.
> www.keil.com/.../armcc_chr1359124223721.htm
Your link actually suggests: "Use RTSM, and define a region of memory where access is not allowed directly below your stack in memory, with a map file. If the stack overflows into the forbidden region, a data abort occurs, which can be trapped by the debugger."
So I'm trying to achieve this effect (similar to an MPU, actually) by relocating the stack. Why does everybody keep discouraging me from that? :)
This is obviously an issue nobody has ever considered.
But seriously, it's not difficult to position the stack at an alternative location. What you're suggesting doesn't seem very advantageous to me though and I can't see it working as you think it might.
Without a memory management unit, you're likely to always have the danger of experiencing a stack overflow without the ability to 100% guarantee picking it up. I've always followed the approach of calculating worst expected case use and adding something extra for safety.
well, because you are obviously finding it a struggle to do that!
so just trying to offer (potentially) easier paths.
As J Roof says, you are taking a "path less trodden", so it would be interesting to hear your results in the end.
Good luck.
J Roof, let me explain again then: I suggest using this kind of memory map:
top of RAM
  <potentially empty space>
  .bss
  stack (growing down)
bottom of RAM
So when the stack overflows, it writes to the non-existent memory below the bottom of RAM and triggers a HardFault exception.
It won't work in three cases:
1) If there is memory below bottom of RAM (seems unlikely)
2) MMU does not trigger exception when nonexisting memory is accessed (seems very unlikely)
3) User code writes above initial top of the stack (i.e. negative array index access). Possible, but not very frequent.
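To make the idea concrete, here is a small host-side sketch that classifies where an overflowing write would land. The RAM base and size are the stm32f10x values from my scatter file; the names `hit_t` and `classify_write` are made up, and whether an access outside RAM really faults is an assumption that depends on the hardware (that is exactly case 2 above).

```c
#include <assert.h>
#include <stdint.h>

#define RAM_BASE 0x20000000u   /* start of RAM on stm32f10x */
#define RAM_SIZE 0x00005000u   /* 20 KB, taken from the scatter file */

typedef enum { HIT_STACK, HIT_OTHER_RAM, HIT_UNMAPPED } hit_t;

/* Classify a write address, given the stack region [stack_lo, stack_hi). */
static hit_t classify_write(uint32_t addr, uint32_t stack_lo, uint32_t stack_hi)
{
    assert(stack_lo < stack_hi);
    if (addr < RAM_BASE || addr >= RAM_BASE + RAM_SIZE) {
        return HIT_UNMAPPED;   /* assumed to fault; the hardware decides */
    }
    if (addr >= stack_lo && addr < stack_hi) {
        return HIT_STACK;      /* normal stack use */
    }
    return HIT_OTHER_RAM;      /* .bss/.data/heap: silent corruption */
}
```

With the stack at 0x20000000..0x20000400, a push one word past the limit goes to 0x1FFFFFFC and is classified as HIT_UNMAPPED. With the stack somewhere in the middle of RAM (for example 0x20001000..0x20001400, a hypothetical default layout), the same overflow lands in HIT_OTHER_RAM.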
>I've always followed the approach of calculating worst expected case use and adding something extra for safety.
That can be really hard when we use IRQs and virtual calls; I suspect that's why even gcc and clang don't do it statically.
There is a fourth:
4) A bad pointer happens to be pointing into the Stack somewhere
I think I found the source of my problem. I used the default startup.s file, where the stack is created like this:
Stack_Size      EQU     0x00000400

                AREA    STACK, NOINIT, READWRITE, ALIGN=3
Stack_Mem       SPACE   Stack_Size
__initial_sp
And that just allocated Stack_Mem in the IRAM1 section, that's all. When I set Stack_Size equal to zero, it was still allocated in the IRAM1 section (and for some reason __initial_sp was equal to IRAM1 start + 8, but never mind).
I edited my scatter file like this:
ARM_LIB_STACK 0x20000000 0x400 ; Stack region growing down
{
  *(STACK)
}
and now the Stack_Mem area is placed into this ARM_LIB_STACK section. And that's it. Now the initial stack pointer value is equal to the end of the ARM_LIB_STACK section.
Unfortunately, now I have to set the stack size in two places, in the scatter file and in the startup file, but I think that can be solved by using the preprocessor or just by bearing with it.
Andrew Neil, well, yeah, but 3 and 4 aren't actually stack overflows, so these are not the problems I'm trying to solve right now.
Agreed.
But are you sure that your problem actually is a Stack overflow?
If it's actually one of these, then all this work will be to no avail ...
It should be possible for the Linker to export symbols to be visible to the source code...
eg, www.keil.com/.../armlink_pge1362065951495.htm
Andrew Neil, I guess I should've stated my problem more clearly. My problem is as follows: I have an IRQ handler which is triggered sporadically by some external event. If I am unlucky and it gets called on top of a sufficiently deep function call, the stack overflows and the IRQ handler overwrites part of the .bss section. Quietly.
Then I get an assertion in a completely different place or (and that's even worse) I don't, and my program is broken without me knowing it.
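The scenario can be put into numbers with a rough budget like the one below. It is only a sketch: `worst_case_stack` is a made-up helper and the byte counts fed into it are hypothetical. The one hard fact used is that exception entry on Cortex-M3 stacks eight words (R0-R3, R12, LR, PC, xPSR), possibly with one extra word of alignment padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXC_FRAME_BYTES   (8u * 4u)  /* R0-R3, R12, LR, PC, xPSR */
#define EXC_ALIGN_PADDING 4u         /* possible 8-byte alignment padding */

/* Worst-case stack need: the deepest thread-mode call chain, plus one
   exception frame and one handler for every interrupt level that can nest. */
static uint32_t worst_case_stack(uint32_t thread_bytes,
                                 const uint32_t *handler_bytes,
                                 uint32_t nesting_levels)
{
    assert(handler_bytes != NULL || nesting_levels == 0);
    uint32_t total = thread_bytes;
    for (uint32_t i = 0; i < nesting_levels; i++) {
        total += EXC_FRAME_BYTES + EXC_ALIGN_PADDING + handler_bytes[i];
    }
    return total;
}
```

For example, a 900-byte call chain interrupted by a single handler that uses 200 bytes needs 900 + 36 + 200 = 1136 bytes, which does not fit in a 0x400 (1024) byte stack even though neither part overflows on its own.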
Andrew Neil, I found another way that doesn't require changing the startup file (I would prefer using the default one because there are a lot of them for different MCUs).
It turns out that linker can calculate the length of the section like this:
ARM_LIB_STACK RAM_BEGIN ; Stack region growing down
{
  *(STACK)
}
RW_IRAM1 ImageLimit(ARM_LIB_STACK) (RAM_SIZE_BYTES-ImageLength(ARM_LIB_STACK)) { ; RW data
  *(+RW +ZI)
}
So now RW_IRAM1 is placed just after ARM_LIB_STACK (which starts at RAM_BEGIN), and the size of the stack is (as it was before) set in the startup.s file.
I had to remove .ANY selector from RW_IRAM1 section but I believe that is fine (is it?)
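As a sanity check on the two linker expressions, the same arithmetic can be written out in C. This is just a sketch with made-up names (`region_t`, `rw_region_after_stack`); it assumes the stack region starts exactly at RAM_BEGIN, so that ImageLimit(ARM_LIB_STACK) is RAM_BEGIN plus the stack length.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t base;
    uint32_t size;
} region_t;

/* Mirror of the scatter-file expressions:
   base = ImageLimit(ARM_LIB_STACK)
   size = RAM_SIZE_BYTES - ImageLength(ARM_LIB_STACK) */
static region_t rw_region_after_stack(uint32_t ram_begin,
                                      uint32_t ram_size,
                                      uint32_t stack_len)
{
    assert(stack_len <= ram_size);
    region_t r;
    r.base = ram_begin + stack_len;  /* first byte after the stack region */
    r.size = ram_size - stack_len;   /* whatever RAM is left */
    return r;
}
```

With RAM_BEGIN = 0x20000000, RAM_SIZE_BYTES = 0x5000 and a 0x400 byte stack, this gives a base of 0x20000400 and a size of 0x4C00, so RW_IRAM1 ends exactly at the top of RAM.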
You state:
>2) MMU does not trigger exception when nonexisting memory is accessed (seems very unlikely)
When there is no MMU, there is certainly no guarantee that you'll get an exception simply because there is no memory physically present at that location.
As I said before, I can't see it working as you think it might.
J Roof, by MMU I meant the device that just maps software address to hardware address, with no ability to set permissions or isolate threads.
In my experience, when you try to read/write non-existent memory, you get an exception. I'm not actually sure what part of the processor triggers this exception; I supposed it was a memory mapping unit of a very simple kind. The same thing that triggers a HardFault on unaligned memory access.
But you are correct, I'm not certain that it will always happen, it's just my personal experience with several Cortex M MCUs; I couldn't google anything definitive.
>In my experience, when you try to read/write nonexisting memory, you get an exception
Sometimes you might and sometimes you might not. It will to a large part depend on the address region. Certainly on a number of processors I've coded for (including Cortex), I have not had an exception in many of these situations. With some processors, an access to a reserved location can just trigger a processor lockup. Good luck with trapping something like that!
Then, of course you should consider whether the memory is mirrored over a larger address space. That would potentially scupper the concept.
You might find yourself lucky, but you would be wrong to rely on it.
J Roof, fair enough, but it's still no worse than what I have by default. And sometimes, as on the MCUs I just tested (stm32f1x), it will be better.
With no fault, I suppose, reads from the overflowed stack frame would return garbage (or zeros), so the guilty function will likely fail by itself. A processor lock-up can be debugged with prints.
At least there won't be any corruption of non-related memory and delayed fail.
So I guess it's still worth a try.
In the scatter file:
ARM_LIB_HEAP +0 ALIGN 8 EMPTY 0x00000100 ; provide location if important to you
                                         ; which it is
{ }
ARM_LIB_STACK +0 ALIGN 8 EMPTY 0x00000400
{
If you want to change the size of the stack or heap or location of these, this (the scatter file) is the only place you need to change it.
extern int Image$$ARM_LIB_STACK$$ZI$$Base;   // These are available linker symbols
extern int Image$$ARM_LIB_STACK$$ZI$$Limit;
extern int Image$$ARM_LIB_HEAP$$ZI$$Base;
extern int Image$$ARM_LIB_HEAP$$ZI$$Limit;

vector_table[] =
    &Image$$ARM_LIB_STACK$$ZI$$Limit,  // 0x20000400 in your examples
                                       // The top of your stack defined in your scatter
                                       // file put in entry 0 of your vector table
                                       // No size of stack needed in startup file
                                       // only in Scatter file
                                       // Do not provide a __user_initial_stackheap function
    Reset_Handler,