Using the latest Keil C compiler (V7.00), executing on an Infineon XC161CJ-16F, this code appears to produce the wrong result.
At the "if" statement I expect v1 to equal v3 and the main function to return 0 (I don't actually want to use the return value; it's just a mock-up piece of code). However, at the "if" statement, the value of v1 is 0x1234 and the value of v3 is 0xB1234.
Can anyone explain why?
typedef unsigned long ULONG;
typedef unsigned short USHORT;
typedef BYTE * POINTER;

POINTER v1,v2,v3,v4;
ULONG temp_ptr;

int main(void)
{
    v1 = (POINTER)0;
    v2 = (POINTER)0x0B1234;
    v1 += (ULONG)v2;    // This assigns 0x1234 to v1; I think it should assign 0xB1234

    v3 = (POINTER)0;
    v4 = (POINTER)0x0B1234;
    temp_ptr = (ULONG)v3;
    v3 = temp_ptr+v4;   // This statement assigns 0xB1234 to v3 correctly.

    if ( v1!=v3 )
        return -1;
    else
        return 0;
}
When you quoted from that book, did you also notice this part: "What does that do? The pointer has been initialized to the value of 6 (notice the cast to turn an integer 6 into a pointer). This is a highly machine-specific operation, and the bit pattern that ends up in the pointer is quite possibly nothing like the machine representation of 6."
On one hand, the book is correct. But on the other hand, it isn't. The language standard does not contain any rule that says that the integer 6 (when cast into a pointer) will point to the 6th byte of any memory region. The assignment may just as well result in an invalid pointer that either addresses thin air or triggers an exception from the processor. The standard does not require the compiler to perform any magic translation when you typecast between integer and pointer.
The language standard, §6.3.2.3 bullet 5 says: "An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.56"
And bullet 6 says: "Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type."
The language standard, §6.5.6 bullet 9, covers the situation where the subtraction of two pointers gives a result too large to fit in ptrdiff_t (a signed integer type): the result is undefined.
But §6.5.6 also covers the situation where the addition of an integer to a pointer to an array element (or one past the last array element) results in a pointer that isn't pointing to an element of the array or one past the last element.
"What exactly is the compiler doing in the situation where the value is not computed as I expect it to be. What interaction between the XC161's segmented addressing and the compiler's casting of POINTER to ULONG is impacting unexpectedly on the computation?"
So have you looked at the xhuge pointer type? Your code is just using the default pointer type for your memory model, and you haven't told us what that memory model is. The alternative is to perform the required integer arithmetic yourself to create the binary representation of valid pointers of the pointer type you want to use. That may include a workaround for 14-bit offsets stored in 16 bits with 2 bits zeroed.
However, I think the compiler is not allowing for that segmentation the way I expect it to.
And that's because those expectations are wrong, not the compiler.
C is a language that was/is designed to access the hardware directly, otherwise it is a pretty pointless language with which to develop embedded systems.
And the way C does those things is by leaving them undefined at the language level. That leaves implementors exactly the kind of freedom they may need to perform all those less-than-portable things efficiently, and without immediately having to resort to utterly unportable extensions of the language itself. In a way, C gives you just enough rope to hang yourself.
How can you do this without casting?
int *hardware_address = (int *)0x7FFF;
We can't. But that means those of us needing to do that have to find something else, besides the language itself, on which to base any expectations about what this actually does.
Yes, I am using the xhuge memory model. I have 1/2 MB of RAM and 8MB of flash on board so I have to.
What I am trying to resolve is why there is a difference between the following two statements specifically when using the Keil compiler on an XC161 target:
char *x,*y;
x += (unsigned long)y;
x = x+(unsigned long)y;
I would expect the C compiler to do exactly the same thing with both of those statements. I think both statements should give the same result; whether that result is correct, incorrect, undefined or something else doesn't really matter. I expect both results to be identical.
What I really want to know is why they do not give the same result, specifically when using the Keil compiler on an XC161 target? i.e. what are the implementation dependencies? I believe that the compiler is designed to handle the segmented addressing of the XC161 correctly (i.e. as a 32 bit integer), because it works as expected in the second example. If that second example also failed then yes, I would agree that I need to handle the segmentation myself.
Regards Paul
@Hans-Bernhard Broeker How can you do this without casting?
You've gone to great lengths to try to convince me that assigning pointers to integers and integers to pointers is undefined in C. It is not undefined; you are wrong about that. It is implementation-defined. I started this thread to try to discover the specifics of the Keil compiler's implementation of such things...
From the ANSI C specification (my emphasis).
3.3.4 Cast operators
<snip>
Conversions that involve pointers (other than as permitted by the constraints of §3.3.16.1) shall be specified by means of an explicit cast; they have implementation-defined aspects: A pointer may be converted to an integral type. The size of integer required and the result are implementation-defined. If the space provided is not long enough, the behavior is undefined. An arbitrary integer may be converted to a pointer. The result is implementation-defined./37/
I understand that the results of such pointer casting are implementation-defined. However, the ANSI C standard does not specify such operations to be undefined.
The only thing that can make those operations undefined is the implementation, i.e. the Keil compiler. All I am asking is why should the Keil compiler define the two examples I gave differently?
@Paul Blackmore,
I meant to say that the ARM compiler generates code that operates according to your original expectations (thus, it assigns 0xB1234).
The fact that an ARM's address space is naturally 32-bit, while that of a C166 is basically 16-bit, quite probably caused the OP's expectations to be broken in the first place.
Yes of course. I seem to assume lately that the default architecture is "ARM"...
But your code does not do what you claim in this post. You claim that you do:
char *x,*y;
x += (unsigned long)y;   // add an integer to a pointer
x = x+(unsigned long)y;  // add a pointer to an integer
But the code in your original post does:
v1 = (POINTER)0;
v2 = (POINTER)0x0B1234;
v1 += (ULONG)v2;    // This assigns 0x1234 to v1, I think it should assign 0xB1234

v3 = (POINTER)0;
v4 = (POINTER)0x0B1234;
temp_ptr = (ULONG)v3;
v3 = temp_ptr+v4;   // This statement assigns 0xB1234 to v3 correctly.
Your first add does add an integer (remember your type cast). Your second add does add a pointer.
As you can see, your recent post does not sum up your problem, since there you are describing a completely different concept - no pointer added, just two ways of adding an integer.
Since a pointer and an integer may have identical size but completely different representation, it really does matter to the compiler whether you perform pointer arithmetic or integer arithmetic. And not only that - unless you can find a specific document stating exactly the limitations of Keil's pointer support, you really are busy in the undefined domain.
By the way - why do you claim to use the xhuge memory model? You don't need xhuge as the general memory model unless you want every single pointer to be able to span large data ranges. Why would there even be huge or far pointers if xhuge were a good choice for general use?
Select the cheapest memory model you can. Then specifically declare the pointers that need it as the more capable pointer type.
Firstly, I'd like to say thanks Per Westermark, I really appreciate the time and patience you have shown trying to help me work through this problem.
@Per Westermark
Ahh crap! I was having a bad day yesterday. You are absolutely correct. I was just trying to simplify the problem and got the simplification wrong, please ignore it. The originally posted code is what I should be referring to.
unless you can find a specific document stating exactly the limitations of Keil's pointer support, you really are busy in the undefined domain.
This is from the Keil documentation (my emphasis): The C166 compiler and the L166 Linker/Locater handles all the CPU addressing modes for you. Therefore the different CPU address calculations are totally transparent to the C programmer. <snip> The huge and xhuge pointer values are representing directly physical memory addresses.
I incorrectly thought my app was using the XLarge memory model and that the pointers by default were therefore xhuge pointers. That is not the case; my app is using the HLarge memory model, and the pointers by default are therefore huge.
I have logged the fault with KEIL support, maybe they can shed some light on it. But if you have any more suggestions, I'd like to know, thanks.
-------------------------------------------------------------
If I change the code line:
typedef BYTE * POINTER; // that should have been char not BYTE by the way
to
typedef char xhuge * POINTER;
then the code works as expected.
That confirms that there is a difference in the way the compiler handles huge and xhuge pointers. However, it still does not explain why the code does not produce the same pointer value in the two different calculations.
For example, if huge pointers were only supposed to convert their low 16 bits when cast to an integer, then both pointer calculations should have resulted in 0x1234. If huge pointers were supposed to convert the full 32-bit address when cast to an integer, then both pointer calculations should have returned 0xB1234.
I'm not trying to argue that xhuge or huge should behave in a certain way (although I would like to know for sure how they are implemented to behave), I am merely trying to find out why doing the calculation one way yields a different result to doing it another way.
After studying the assembler I realize my original sample code is wrong. It should have been like this:
typedef unsigned long ULONG;
typedef unsigned short USHORT;
typedef char * POINTER;

POINTER v1,v2,v3,v4;
ULONG temp_ptr;

int main(void)
{
    v1 = (POINTER)0x0B0000;
    v2 = (POINTER)0x1234;
    v1 += (ULONG)v2;

    v3 = (POINTER)0xB0000;
    v4 = (POINTER)0x1234;
    temp_ptr = (ULONG)v4;
    v3 = v3+temp_ptr;

    if ( v1!=v3 )
        return -1;
    else
        return 0;
}
In the code above, at no stage am I adding more than 0xFFFF to any huge pointer and it all works as expected. For the longest time in this thread, I did not "get" that I was adding more than 0xFFFF to a huge pointer, firstly because I had thought (incorrectly) I was using XLarge memory model and for a few hours after that, because I was being a little bit thick.
So that just leaves the final question as to why this original code:
v3 = (POINTER)0;
v4 = (POINTER)0x0B1234;
temp_ptr = (ULONG)v3;
v3 = temp_ptr+v4;
produced the expected result. Putting on my "thinking clearly" hat which I had obviously not been wearing before...
The calculation adds a pointer (v4) to an integer (temp_ptr). The pointer is already pointing at 0xB1234, so adding an integer of value 0 results in a pointer of 0xB1234.