Hi, I am just testing some STM32H7 code.
In a simple LED blink program, I used for loops as a simple delay:
for( int i= 0; i< 10000000; i++) iDelay++;
This code compiles to 6 assembly instructions, occupying roughly 10 to 20 bytes of code.
If the loop sits entirely inside one 32-byte flash page of the STM32H7, all is fine: it runs as expected, at 2.5 ns per instruction with the STM32H7 running at 400 MHz.
But if the loop code straddles a 32-byte flash page boundary, the complete for loop takes 5 to 20 times longer.
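To make the boundary condition concrete, here is a small host-side sketch (not target code) that checks whether a code span touches more than one 32-byte line and how many padding bytes would push it to the next boundary. The helper names crosses_line and pad_to_line, the macro FETCH_LINE_BYTES, and the example addresses are made up for illustration; the real start address and length of the loop would have to be read from the map file or the disassembly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FETCH_LINE_BYTES 32u  /* the 32-byte boundary described in the post */

/* True if the code span [start, start + len) touches more than one
   32-byte line. Hypothetical helper, for illustration only. */
bool crosses_line(uint32_t start, uint32_t len)
{
    assert(len > 0u);
    uint32_t first = start / FETCH_LINE_BYTES;
    uint32_t last  = (start + len - 1u) / FETCH_LINE_BYTES;
    return first != last;
}

/* Number of padding bytes needed in front of `start` so that the code
   begins on the next 32-byte boundary (0 if already aligned). */
uint32_t pad_to_line(uint32_t start)
{
    return (FETCH_LINE_BYTES - (start % FETCH_LINE_BYTES)) % FETCH_LINE_BYTES;
}
```

For example, a 20-byte loop starting at offset 16 within a line (say, the assumed address 0x08000410) ends at offset 35, so it crosses the boundary; 16 bytes of padding would move it to the start of the next line.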
This is quite annoying. I would like to use such delay loops at least in initialisation code; of course not for exact timing, but at least approximately correct.
I found an ALIGN directive in the Keil ARM assembler, but if I try to use it in an __asm{...} inline assembly block, the C compiler unfortunately gives the error "#3061: unrecognized instruction opcode". I also tried "#pragma align 32", but this does not work either.
Is there an easy way to convince the C compiler to pad with NOPs automatically for such an application?
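One possible direction, sketched here under assumptions: GCC and Arm Compiler 6 (armclang) accept an aligned attribute on functions, so the delay loop can be moved into its own small function whose entry point is placed on a 32-byte boundary. I am not certain the older Keil compiler that reports error #3061 accepts this attribute on functions, so treat that as untested. The names delay_loop and assert_delay_loop_aligned are invented for this sketch, and the type of iDelay is assumed to be a 32-bit counter. Note that aligning the function entry does not by itself guarantee that the loop body stays within one 32-byte line; that still has to be checked in the disassembly.

```c
#include <assert.h>
#include <stdint.h>

/* volatile so the compiler cannot optimise the busy loop away.
   The type is an assumption; the post does not show the declaration. */
volatile uint32_t iDelay;

/* Assumption: GCC or armclang. Places the function entry on a 32-byte
   boundary; noinline keeps the loop from being copied into the caller
   at some unaligned address. */
__attribute__((aligned(32), noinline))
void delay_loop(uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        iDelay++;
    }
}

/* Hypothetical debug check: verifies the entry address at run time.
   Bit 0 is cleared first because on Cortex-M a function pointer
   carries the Thumb bit. */
void assert_delay_loop_aligned(void)
{
    uintptr_t addr = (uintptr_t)&delay_loop & ~(uintptr_t)1;
    assert((addr % 32u) == 0u);
}
```

The blink code would then call delay_loop(10000000) instead of using the inline for loop.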
...
Meanwhile I found the "miracle command" to avoid these 32-byte flash page problems: if I call SCB_EnableICache() at the start of my main, everything runs at full speed, regardless of whether the for loops extend over 32-byte boundaries or not.
The only drawback is that the Keil disassembly window no longer works as nicely as before. It seems to have problems showing the disassembly when I click in the C code while the program is running. If the program is stopped at a breakpoint, it shows the code around the breakpoint correctly, but even then I think there are some problems when I scroll up or down in the disassembly.