The subroutine (C source code) is the same, but the compiled output differs:
The DEMO's disassembly:
PUSH {r4-r7}
...
POP {r4-r7}
BX lr
My project's disassembly:
PUSH {r4, lr}
...
POP {r4, pc}
The problem is that the subroutine clears all of RAM to 0. So after "POP {r4, pc}" executes, the program crashes, because the return address that was saved on the stack has been wiped. I don't know assembly language very well. What can I do to make ARMCC generate code that returns with "BX lr" so that my code works? I'm using STM32F103/MDK5/ARMCC5.06.
If your optimization level is 0, the compiler often (probably even always) will push and pop lr "just because", even with no function calls within the called function. I would be careful with any assumptions about what the compiler must do. I would still want my code to work at optimization level 0, so I would just write this in assembly language to be safe. (You can examine the C-to-assembly listing to get a good idea of how the assembly language should be structured.) I would execute it before __main is called.
I think ARM's got some 30+ years of experience building compilers/assemblers, and of knowing whether a subroutine has to branch to another or not, where it is going to stuff the literals, how to place near branches, etc. Optimization enabled or not.
Keil, with optimization disabled...
i.foo
foo
    0x08000922:    f2430039    C.9.    MOV      r0,#0x3039
    0x08000926:    4770        pG      BX       lr
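The C source for that listing was not posted, so the following is only a guess at what it looked like: a leaf function that calls nothing and just returns the constant 0x3039 (12345 decimal). The name foo comes from the listing; the signature is an assumption.

```c
/* Assumed source for the listing above: a leaf function with no calls
 * and nothing that needs the stack, so there is no reason to save lr. */
int foo(void)
{
    return 12345; /* 0x3039, the value loaded by MOV r0,#0x3039 */
}
```

With a function like this there is nothing to spill, which is why the epilogue can be a plain BX lr. Always confirm against the actual disassembly for your own build.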
My assumptions are based on some pretty long standing appreciation of what comes out of assorted tools, and the "trust but verify" approach to confirming them when needed.
The OP is building a function that calls other code, either known or unknown to him. A full disassembly of the routines in question would quickly dispel any doubt, but instead we argue back and forth about what's going on inside, based on a handful of prologue/epilogue instructions. So instead of a solution in minutes, it goes on for days.
I don't think you can stop the compiler outputting the push/pop of LR/PC unless you force it to in-line everything it might use or call otherwise, and I think that is unduly hopeful.
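To illustrate what "in-line everything" would mean in C, here is a minimal sketch, not a guaranteed fix. All names (ram_fill_words, ram_check_words, ram_clear_and_check) are hypothetical, and it assumes the fill and check loops are the only work the routine does. The volatile pointers are there so the compiler does not turn the loops into a call to a library routine such as memset.

```c
#include <stdint.h>

/* Hypothetical helper: write value to every word in [start, end). */
static inline void ram_fill_words(volatile uint32_t *start,
                                  volatile uint32_t *end,
                                  uint32_t value)
{
    while (start < end) {
        *start++ = value;
    }
}

/* Hypothetical helper: return 1 if every word in [start, end) equals
 * expected, otherwise 0. */
static inline int ram_check_words(const volatile uint32_t *start,
                                  const volatile uint32_t *end,
                                  uint32_t expected)
{
    while (start < end) {
        if (*start++ != expected) {
            return 0;
        }
    }
    return 1;
}

/* Clears 'words' words starting at base, then verifies them.
 * Whether the helpers really get inlined, and whether the result ends
 * in BX lr, depends on the compiler and its options: check the listing. */
int ram_clear_and_check(uint32_t *base, uint32_t words)
{
    ram_fill_words(base, base + words, 0u);
    return ram_check_words(base, base + words, 0u);
}
```

Even if this does compile to a leaf function, it only protects this routine's own return. If the cleared region includes the stack in use, the callers' saved state is still destroyed, which is the argument for running such code in assembly before __main.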
In things where testing RAM is supposedly critical, you'd hope the people implementing the critical code would have a better appreciation for what's actually critical.