The table at www.keil.com/.../c51_le_passingparmsinregs.htm lists which registers are used to pass each argument.
The table shows arg1 and arg2 sharing the same registers R4~R7 when they are of type long. That is incorrect: arg2 should be stored in fixed memory. In the same way, when the arguments are generic pointers, only arg1 is stored in R1~R3; arg2 and arg3 should be stored in fixed memory.
See http://www.keil.com/support/docs/2314.htm
"The table shows arg1 and arg2 sharing the same registers R4~R7 when they are of type long."
I think you have misinterpreted the table, there!
I think what the table means is that a 2nd parameter of type long or float will be passed in R4-R7 only if those registers are still available.
Clearly, if R4-R7 have already been used for a 1st parameter of type long or float, then a 2nd parameter of that type will, as you say, have to "spill out" to a fixed memory location.
Similarly for Generic Pointers in R1-R3.
I agree that the table is not entirely clear about this! :-(
In the PDF manual, the table is followed by some examples which illustrate the above - these examples are missing from the web version. :-(
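For what it's worth, here is a rough sketch in plain C of the reading of the table given above. It is a host-side model, not Keil's actual allocator, and it is not taken from the manual. The names `param_class`, `candidate` and `assign_params` are made up for illustration. It assumes that only the first three parameters are candidates for registers, that a parameter whose registers are already taken goes to fixed memory, and that later parameters are still tried afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Parameter classes from the C51 table (names are illustrative only). */
enum param_class { PC_CHAR, PC_INT, PC_LONG, PC_GPTR };

#define REG(n) (1u << (n))          /* bit n stands for register Rn */

/* Candidate registers per class and argument position; 0 means "none". */
static const unsigned candidate[4][3] = {
    /* char, 1-byte ptr */ { REG(7), REG(5), REG(3) },
    /* int, 2-byte ptr  */ { REG(6) | REG(7), REG(4) | REG(5), REG(2) | REG(3) },
    /* long, float      */ { REG(4) | REG(5) | REG(6) | REG(7),
                             REG(4) | REG(5) | REG(6) | REG(7), 0 },
    /* generic pointer  */ { REG(1) | REG(2) | REG(3),
                             REG(1) | REG(2) | REG(3),
                             REG(1) | REG(2) | REG(3) }
};

/* Sets in_regs[i] to 1 if parameter i lands in registers, 0 if it is
   spilled to fixed memory. Returns the number of spilled parameters. */
int assign_params(const enum param_class *params, size_t n, int *in_regs)
{
    unsigned used = 0;              /* registers already handed out */
    int spilled = 0;
    size_t i;

    assert(params != NULL && in_regs != NULL);

    for (i = 0; i < n; i++) {
        /* Assumption: arguments beyond the third are never in registers. */
        unsigned want = (i < 3) ? candidate[params[i]][i] : 0;

        if (want != 0 && (used & want) == 0) {
            used |= want;           /* registers were free: claim them */
            in_regs[i] = 1;
        } else {
            in_regs[i] = 0;         /* conflict or no candidate: spill */
            spilled++;
        }
    }
    return spilled;
}
```

Under these assumptions, two long parameters give "first in R4-R7, second spilled", and three generic pointers give "first in R1-R3, the other two spilled".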
You need to contact Keil direct to report this issue:
"Keil support personnel do not monitor this forum and are not guaranteed to reply to your queries." http://www.keil.com/forum/
This is one of the weaknesses of the code generator, IMO. There are three register pairs for 16-bit values. We should be able to pass a U32 and a U16 solely in registers: R4..R7 for the U32 and R2..R3 for the U16. Instead, by the rules, a U16 as parameter 2 gets spilled into memory because R4..R5 are already taken by the U32, while R2..R3 remain unused. Neither of
f(U32, U16); f(U16, U32);
should spill any parameters.
I've often found myself declaring generic pointers even when I know the values must be in xdata simply to work around this problem.
f(U8* p, U32 val)
fits all into registers, as a generic/far pointer goes into R1..R3. But,
f(U8 xdata*, U32 val)
spills into memory, as the 16-bit pointer and the U32 cannot both go into registers: their register assignments conflict. So, giving the compiler more information about the pointer makes it generate larger and slower code. In practice, it's less painful to load a useless tag byte into the registers and ignore it than to shuffle registers back and forth to overlaid data memory.
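A minimal sketch of the two declarations side by side, in case it helps. The function names `store_xdata` and `store_generic` and the big-endian store are made up for illustration, and the register comments restate the rules discussed in this thread rather than compiler output. The `#ifndef __C51__` guard only defines `xdata` away so the same text also builds with a host compiler.

```c
#include <assert.h>

/* Outside Keil C51 the memory-space keyword does not exist, so define
   it away; under C51 (__C51__ predefined) it is left alone. */
#ifndef __C51__
#define xdata
#endif

typedef unsigned char U8;
typedef unsigned long U32;   /* 32 bits on C51; may be wider on a host */

/* Pointer declared with its real memory space: a 2-byte pointer.
   By the rules discussed above, its register pair overlaps the
   R4..R7 wanted by the U32, so one parameter goes to memory. */
void store_xdata(U8 xdata *p, U32 val)
{
    assert(p != 0);          /* host-side sanity check */
    p[0] = (U8)(val >> 24);
    p[1] = (U8)(val >> 16);
    p[2] = (U8)(val >> 8);
    p[3] = (U8)val;
}

/* Same body, pointer declared generic: a 3-byte pointer in R1..R3,
   leaving R4..R7 free for val, so nothing should spill. */
void store_generic(U8 *p, U32 val)
{
    assert(p != 0);          /* host-side sanity check */
    p[0] = (U8)(val >> 24);
    p[1] = (U8)(val >> 16);
    p[2] = (U8)(val >> 8);
    p[3] = (U8)val;
}
```

Both functions do the same work; the only difference is the pointer declaration, which is what decides whether the call fits in registers.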