Hi, I am just testing some STM32H7 code.
In a simple LED blink program, I used a for loop as a simple delay:
for (int i = 0; i < 10000000; i++) iDelay++;
This code compiles to 6 assembly instructions, taking roughly 10 to 20 bytes of code.
If the loop sits entirely inside one 32-byte flash page of the STM32H7, all is fine: it runs as expected, at 2.5 ns per instruction with the STM32H7 running at 400 MHz.
But if the loop code straddles a 32-byte flash page boundary, the complete for loop takes 5 to 20 times longer.
This is quite annoying. I would like to use such delay loops at least in initialisation code. Of course not for exact timing, but the delay should be at least approximately correct.
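To make the boundary condition concrete, here is a minimal sketch of the check, assuming the 32-byte page size mentioned above. The function names and the example addresses are hypothetical and only for illustration; the real loop address would have to be read from the map file or the disassembly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-byte flash page size, taken from the description above. */
#define FLASH_PAGE_BYTES 32u

/* Returns true if the code range [start, start + len) crosses a
 * 32-byte flash page boundary. */
bool straddles_flash_page(uint32_t start, uint32_t len)
{
    assert(len > 0u);
    uint32_t first_page = start / FLASH_PAGE_BYTES;
    uint32_t last_page  = (start + len - 1u) / FLASH_PAGE_BYTES;
    return first_page != last_page;
}

/* Number of padding bytes needed to push 'start' up to the next
 * 32-byte boundary (0 if it is already aligned). */
uint32_t padding_to_page(uint32_t start)
{
    return (FLASH_PAGE_BYTES - (start % FLASH_PAGE_BYTES)) % FLASH_PAGE_BYTES;
}
```

For example, a 20-byte loop starting at the (made-up) address 0x08000100 stays inside one page, while the same loop starting at 0x08000110 crosses into the next page and would need 16 bytes of padding in front of it to be aligned.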
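For reference, this is roughly the delay I would expect from the loop in the good case. It is only a rough estimate, assuming one instruction per clock cycle (the 2.5 ns per instruction figure above) and 6 instructions per iteration; the function name is made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Rough expected run time of a delay loop in nanoseconds.
 * Assumption: every instruction takes exactly one clock cycle,
 * which ignores branch penalties, dual issue and flash wait states. */
uint64_t expected_delay_ns(uint64_t iterations,
                           uint64_t instr_per_iter,
                           uint64_t clock_hz)
{
    assert(clock_hz > 0u);
    uint64_t cycles = iterations * instr_per_iter;
    return cycles * 1000000000ull / clock_hz;
}
```

With 10000000 iterations, 6 instructions per iteration and a 400 MHz clock this gives 150000000 ns, so about 150 ms. A loop that is 5 to 20 times slower would then take about 0.75 to 3 s.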
I found an ALIGN directive in the Keil Arm assembler, but if I try to use it in an __asm{...} inline assembly block, the C compiler unfortunately gives the error "#3061: unrecognized instruction opcode". I also tried "#pragma align 32", but this does not work either.
Is there an easy way to convince the C compiler to pad with NOPs automatically for a case like this?