2 - The ARM starts executing the instruction and locks the destination registers (to prevent any other instruction from using the same registers as a source).
For example, with our previous MUL:
Rd is marked as locked until cycle #16 (#10 + Rd: E5 + 1, because the MUL takes 2 cycles, and destination stages are always given for the last cycle of a multi-cycle instruction).
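The arithmetic in this MUL example can be written as a tiny sketch. The function name and its parameters are made up for illustration and are not taken from the cycle counter itself; the only numbers that come from the post are the start cycle #10, the E5 stage for Rd and the one extra cycle of the 2-cycle MUL.

```python
def lock_release_cycle(start_cycle, result_stage, extra_cycles):
    """Cycle until which a destination register stays locked.

    start_cycle  -- cycle at which the instruction starts (#10 in the example)
    result_stage -- E stage where the result is written (5 for E5)
    extra_cycles -- cycles beyond the first one (1 for a 2-cycle MUL),
                    because the stage is given for the last cycle
    """
    return start_cycle + result_stage + extra_cycles


# MUL started at cycle #10, Rd written in E5, 2-cycle instruction
print(lock_release_cycle(10, 5, 1))  # 16
```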
I thought the branches to .else would always be mispredicted, but that was not the case. It would be very useful to know the prediction algorithm (but I assume it must be quite secret)!!!
For branches: I do not know anything about the first stage of the ARM pipeline. I don't know what you want to do.
But I think there is no way to know from the source code alone whether a (conditional) branch will be mispredicted or not.
1 - Before starting an instruction, the ARM checks that all the registers will be available when the instruction needs them.
For example: you want to execute a MUL Rd, Rm, Rs
Rm must be available at cycle #11 (#10 + 1, see the MUL cycle table http://infocenter.ar...ch16s02s03.html)
If at least 1 register is not available, then the ARM does not start the instruction and you have a stall cycle.
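The availability check in step 1 can be sketched roughly like this. It is only an illustration, not the real cycle counter: the function, the two dictionaries and the lock value 13 are invented, and I assume that a register "locked until cycle N" can be read from cycle N on. Only the "+1" offset for Rm comes from the MUL example.

```python
def first_start_cycle(cycle, needed_at, locks):
    """Return the first cycle >= `cycle` at which the instruction can start.

    needed_at -- {register: offset}, the register is read at start + offset
    locks     -- {register: cycle until which the register is locked}
    Assumption: a register locked until cycle N is usable from cycle N on.
    """
    while any(locks.get(reg, 0) > cycle + offset
              for reg, offset in needed_at.items()):
        cycle += 1  # at least one source is not ready: stall cycle
    return cycle


# MUL wants to start at cycle #10 and reads Rm at start + 1.
# Hypothetical situation: an earlier instruction locked Rm until cycle #13.
start = first_start_cycle(10, {"Rm": 1}, {"Rm": 13})
print(start, "->", start - 10, "stall cycles")  # 12 -> 2 stall cycles
```

If no source register is locked, the function returns the requested cycle unchanged, which is the case without stall.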
http://pulsar.websha...x-A8-cycle.xlsx
Buy a BeagleBoard... http://www.watterott.../BeagleBoard-xM
This is not very expensive!!!
In a few days (weeks), I'll write a post to explain how the cycle counter works and how you can write your own cycle counter... That will be much simpler than trying to explain part by part how the program works!!!
But your solution is not a good solution... too much work!!!
For MRS and MSR: there are a lot of instructions for which I have not found real cycle timings, and I do not have time to test them.
Take the latest version (but keep the previous one, because I've changed a lot of things).
For example, I removed all the STM and LDM rules. There are too many cases. Now I build these rules automatically in the cycle counter.
Ben Avison has done very useful work on that: http://www.avison.me.../cortex-a8.html