Hi, I am just testing some STM32H7 code.
In a simple LED blink program, I used a for loop as a simple delay:
for( int i= 0; i< 10000000; i++) iDelay++;
This code compiles to 6 assembly instructions, taking 10...20 bytes of code.
If this loop lies entirely inside one 32-byte page of STM32H7 flash, all is fine: it runs as expected, at 2.5 ns per instruction with the STM32H7 running at 400 MHz.
But if the loop code straddles a 32-byte flash page border, the complete for loop takes 5...20 times longer.
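To put numbers on this, here is a rough back-of-the-envelope sketch. It assumes every instruction retires in exactly one core clock cycle (the 2.5 ns best case above); the function name loop_time_ms is only an illustration.

```c
#include <stdint.h>

/* Estimated run time in milliseconds of a counted loop.
   Assumption: each instruction takes exactly one core clock cycle. */
static uint32_t loop_time_ms(uint32_t iterations,
                             uint32_t instr_per_iter,
                             uint32_t core_mhz)
{
    uint64_t cycles = (uint64_t)iterations * instr_per_iter;
    uint64_t cycles_per_ms = (uint64_t)core_mhz * 1000u;
    return (uint32_t)(cycles / cycles_per_ms);
}
```

With 10000000 iterations, 6 instructions per iteration and 400 MHz this gives about 150 ms. A 5...20 times slowdown would stretch that to roughly 0.75...3 s.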
This is quite annoying. I would like to use such delay loops at least in initialisation code; of course not for exact timing, but the delay should be at least approximately correct.
I found an ALIGN directive in the Keil ARM assembler, but if I try to use it in an __asm{...} inline assembly block, the C compiler unfortunately gives the error "#3061: unrecognized instruction opcode". I also tried "#pragma align 32", but this does not work either.
Is there an easy way to convince the C compiler to pad with NOPs automatically for such an application?
...but I use C precisely to avoid assembler...
E.g. in the clock init code I have about 10 loops of the following sort:
#define IFLOOPEND_1ms 1000000
iDelay = 0;
while (!(PWR->D3CR & PWR_D3CR_VOSRDY)) {
    if (iDelay++ > IFLOOPEND_1ms) return 0;
}
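Since there are about 10 loops of this shape, one option might be to factor them into a single helper, so that only one loop body has to be placed carefully in flash. The following is only a sketch; wait_flag_set and its parameters are hypothetical names, not part of any ST library.

```c
#include <stdint.h>

/* Hypothetical helper: poll *reg until any bit in mask is set.
   Returns 1 if the flag was seen, 0 if max_polls polls passed without it. */
static int wait_flag_set(volatile const uint32_t *reg,
                         uint32_t mask,
                         uint32_t max_polls)
{
    for (uint32_t polls = 0; polls < max_polls; polls++) {
        if (*reg & mask) {
            return 1;   /* flag is set, stop waiting */
        }
    }
    return 0;           /* timed out */
}
```

The loop above would then read: if (!wait_flag_set(&PWR->D3CR, PWR_D3CR_VOSRDY, IFLOOPEND_1ms)) return 0; The timeout is still counted in polls, not in calibrated time, so it stays approximate.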
In the worst case I really would have to do this in assembler, but writing an assembly function just because of this sounds a bit stupid. It is at least an idea, though, and I will think about it.
But I think such small for loops are quite common in C code. Older processors did not use such large flash read buffers, and they were not so fast in execution, so on the STM32H7 this effect really gets quite annoying. If the C compiler supported something like #pragma align 32, that would be super-great.
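For comparison, here is a minimal sketch of what such an alignment request can look like on function level. It assumes a toolchain with GCC-style attributes (GCC, Clang, Arm Compiler 6); whether the older Keil compiler that reports #3061 accepts this attribute on functions has not been verified here. The name delay_loop is hypothetical.

```c
#include <stdint.h>

/* Assumption: GCC-style attributes are available.
   aligned(32) asks for the function to start on a 32-byte boundary;
   noinline keeps the loop from being copied into callers at arbitrary addresses. */
__attribute__((aligned(32), noinline))
void delay_loop(uint32_t count)
{
    volatile uint32_t i = count;   /* volatile keeps the loop from being optimized away */
    while (i != 0u) {
        i--;
    }
}
```

Aligning the function start only keeps the loop inside one 32-byte block if the whole function is shorter than 32 bytes, so the map file or disassembly should still be checked.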