Hello,
What is the purpose of the RSDIS (Return Stack DISable) bit in ACTLR?
What would be the consequence for code execution if it is set (return stack disabled)?
Can software write this bit?
Thanks for your help.
Hello syl,
In short, RSDIS=1 disables prediction of the return address when returning from a function.
Please refer to the following descriptions from "Cortex™-R4 and Cortex-R4F Technical Reference Manual Revision: r1p4 (ARM DDI 0363G)".
5.3. Return stack
The call-return stack predicts procedural returns that are program flow changes such as loads, and branch register. The dynamic branch predictor determines if conditional procedure returns are predicted as taken or not-taken. The return stack predicts the target address for unconditional procedure returns, and conditional procedure returns that have been predicted as taken by the branch predictor.
The return stack consists of a 4-entry circular buffer. When the PFU detects a taken procedure call instruction, the PFU pushes the return address onto the return stack. The instructions that the PFU recognizes as procedure calls are:
•For ARM and Thumb instructions:
◦BL immediate
◦BLX immediate
◦BLX Rm.
When the return stack detects a taken return instruction, the PFU issues an instruction fetch from the location at the top of the return stack, and pops the return stack. The instructions that the PFU recognizes as procedure returns are, in both the ARM and Thumb instruction sets:
•POP {..,pc}
•LDMIB Rn{!}, {..,pc}
•LDMDA Rn{!}, {..,pc}
•LDMDB Rn{!}, {..,pc}
•LDR pc, [sp], #4
•BX Rm.
Return stack mispredictions can exist when:
•The prediction that a conditional return passed or failed its condition code is not correct.
•The return address is not correct. The DPU resolves indirect branches that the return stack predicts at the Ret-stage of the pipeline, see Figure 1.3. A misprediction causes the PFU to flush the pipeline and fetch the correct instruction stream.
The return stack has no underflow or overflow detection. Either scenario is likely to cause a misprediction.
Note
The MOV PC, LR instruction is not decoded and is not predicted as a return.
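To illustrate why overflow leads to a misprediction, here is a minimal software model of a 4-entry circular return stack with no overflow or underflow detection, as the excerpt above describes. It is only a sketch: the type and function names (return_stack_t, rs_push, rs_pop, count_mispredictions) and the addresses are made up for illustration and do not reflect how the PFU is actually implemented.

```c
#include <stdio.h>

#define RS_ENTRIES 4u  /* the TRM excerpt describes a 4-entry circular buffer */

/* Hypothetical model of the return stack; not the real hardware structure. */
typedef struct {
    unsigned int entry[RS_ENTRIES];
    unsigned int top;  /* index of the next free slot, wraps modulo RS_ENTRIES */
} return_stack_t;

/* Push on a call: no overflow check, the oldest entry is silently overwritten. */
static void rs_push(return_stack_t *rs, unsigned int return_addr)
{
    rs->entry[rs->top] = return_addr;
    rs->top = (rs->top + 1u) % RS_ENTRIES;
}

/* Pop on a return: no underflow check, the pointer simply wraps. */
static unsigned int rs_pop(return_stack_t *rs)
{
    rs->top = (rs->top + RS_ENTRIES - 1u) % RS_ENTRIES;
    return rs->entry[rs->top];
}

/* Nest `depth` calls, then return from all of them.
 * Returns how many returns were predicted with the wrong address. */
static int count_mispredictions(int depth)
{
    return_stack_t rs = { {0u, 0u, 0u, 0u}, 0u };
    int misses = 0;

    for (int i = 1; i <= depth; i++) {
        rs_push(&rs, 0x1000u + 4u * (unsigned int)i);  /* made-up addresses */
    }
    for (int i = depth; i >= 1; i--) {
        unsigned int actual = 0x1000u + 4u * (unsigned int)i;
        if (rs_pop(&rs) != actual) {
            misses++;
        }
    }
    return misses;
}
```

In this model, count_mispredictions(4) returns 0, while a call depth of 5 overwrites the oldest entry, so the outermost return is predicted with a stale address and count_mispredictions(5) returns 1.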
5.4. Controlling instruction prefetch and program flow prediction
You can disable the return stack by setting RSDIS in the ACTLR. When disabled, pushes onto the stack caused by call instructions are disabled, but the stack pointer is not frozen.
Best regards,
Yasuhiko Koumoto.
Thanks Yasuhiko,
The answer is very clear.
Note there is a discrepancy with the text in the Cortex-R5 TRM (DDI 0460D) about the stack pointer:
"...When disabled, pushes onto the stack caused by call instructions are disabled and the stack pointer is frozen."
Which document is right?
Sylvain
Hi,
I checked the Cortex-R5 TRM (DDI 0460D) and found the same description in chapter 5.4: "and the stack pointer is frozen".
I cannot find any discrepancy.
The stack pointer in the sentence "and the stack pointer is frozen" is the pointer of the return stack, which holds the predicted return addresses.
I am afraid you are thinking of the normal stack pointer, which is R13.
R13 and the return stack pointer are different registers.
I was comparing the DDI 0460D text to your first answer, which quotes "...is not frozen" (extract from ARM DDI 0363G).
They are different: the first one says frozen, the second one says not frozen.
I don't think this makes any difference in processor behavior, but the fact remains that the documents are not consistent.
Oh!
I'm sorry, I had overlooked that.
I think that the return stack pointer is not frozen for Cortex-R4 and is frozen for Cortex-R5.
Anyway, it should not be a big issue because the return stack pointer is not visible to software.
There are discrepancies between the two documents, but in this case it is not an error. When RSDIS is set, the return address stack pointer behaves differently in Cortex-R4 and Cortex-R5. You may also notice that when the prediction is forced to a fixed direction (through the BP field in the ACTLR), the global history table also behaves differently.
Hi Sylvain,
RSDIS enables (RSDIS=0) or disables (RSDIS=1) the return address stack used in procedure call/return program flow prediction.
In Cortex-R4/R5, if program flow prediction is not entirely disabled (by also setting the branch prediction policy to always not-taken), the remaining branch prediction schemes are still in effect.
For programs that can exploit the procedure call/return program flow prediction, the enhancement in execution speed if the return address stack is ENABLED can be estimated through some data provided in Cortex-R4/R5 TRMs. Here are some excerpts from Cortex-R5 TRM Rev. r1p2 (ARM DDI 0460D) section B.9 Branches:
From page B-15, Table B-10 shows that for the two BX instruction formats, execution is 9x faster if return stack prediction is CORRECT.
From page B-17, Table B-13 shows LDR to PC instructions, execution can be up to 9x faster if return stack prediction is CORRECT.
From page B-21, Table B-18 shows LDMIAs with PC in register list, there are additional 8 cycles if return stack prediction is INCORRECT.
In most cases, I think a disabled return address stack is equivalent to incorrect return stack prediction in terms of the number of cycles needed in executing the type of instructions cited in the tables.
Generally, ACTLR (where RSDIS is contained) is a Read/Write register but accessible in privileged mode only.
For Cortex-A5, there are also additional caveats:
In Non-secure state,
Attempts to write to ACTLR in secure privileged modes when CP15SDISABLE is HIGH result in an Undefined instruction exception.
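As a rough sketch of what writing RSDIS from privileged code could look like, the C fragment below does a read-modify-write of the ACTLR through CP15 (c1, c0, opcode2 1). The helper names are hypothetical, and the bit position 17 is my reading of the Cortex-R4/R5 ACTLR layout, so please check it against the TRM for your exact core and revision. The CP15 accessors assume a GCC-compatible compiler targeting ARM and only build there; the bit-manipulation helper is plain C.

```c
#include <stdint.h>

/* Assumed position of RSDIS in the ACTLR; verify against your core's TRM. */
#define ACTLR_RSDIS_BIT   17u
#define ACTLR_RSDIS_MASK  (UINT32_C(1) << ACTLR_RSDIS_BIT)

/* Return `actlr` with RSDIS set (disable != 0) or cleared (disable == 0).
 * All other bits are left untouched. */
static inline uint32_t actlr_with_rsdis(uint32_t actlr, int disable)
{
    return disable ? (actlr | ACTLR_RSDIS_MASK)
                   : (actlr & ~ACTLR_RSDIS_MASK);
}

#if defined(__arm__) && defined(__GNUC__)
/* Must be executed in a privileged mode. */
static inline uint32_t actlr_read(void)
{
    uint32_t value;
    __asm__ volatile ("mrc p15, 0, %0, c1, c0, 1" : "=r"(value));
    return value;
}

static inline void actlr_write(uint32_t value)
{
    /* ISB so that following instructions see the new setting. */
    __asm__ volatile ("mcr p15, 0, %0, c1, c0, 1\n\tisb"
                      : : "r"(value) : "memory");
}

/* Hypothetical convenience wrapper: disable the return stack. */
static inline void return_stack_disable(void)
{
    actlr_write(actlr_with_rsdis(actlr_read(), 1));
}
#endif
```

The read-modify-write is deliberate: the ACTLR holds many unrelated control bits, so writing a constant would change more than the return stack setting.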
Regards,
Goodwin
For Table B-13 I stated execution can be up to 9x faster but I failed to include the Memory cycle. It should be either up to 5x faster if return stack prediction is CORRECT or there are additional 8 cycles if return stack prediction is INCORRECT.
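The arithmetic behind "9x" versus "5x" can be sketched as below. Note that the baseline cycle counts (1 cycle without the memory cycle, 2 cycles with it) are my assumption, inferred from the ratios quoted in this thread together with the 8-cycle penalty; they are not copied from the TRM tables, and speedup_x is just an illustrative helper name.

```c
/* Ratio between the mispredicted and correctly predicted cycle counts.
 * cycles_correct: assumed cycles when the return stack prediction is correct.
 * penalty:        additional cycles when the prediction is incorrect. */
static unsigned int speedup_x(unsigned int cycles_correct, unsigned int penalty)
{
    return (cycles_correct + penalty) / cycles_correct;
}
```

With these assumed numbers, speedup_x(1, 8) gives 9 and speedup_x(2, 8) gives 5, which matches the correction above: the 8-cycle penalty is the same, but adding the memory cycle to the baseline lowers the ratio.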