Using the latest Keil C compiler (V7.00), executing on an Infineon XC161CJ-16F, this code appears to produce the wrong result.
At the "if" statement I expect v1 to equal v3 and for the main function to return 0 (I don't actually want to use the return value, its just a mock up piece of code). However, at the "if" statement, the value of v1 is 0x1234 and the value of v3 is 0xB1234.
Can anyone explain why?
typedef unsigned long ULONG;
typedef unsigned short USHORT;
typedef unsigned char BYTE;   /* assumed: the original relies on a BYTE typedef */
typedef BYTE *POINTER;

POINTER v1, v2, v3, v4;
ULONG temp_ptr;

int main(void)
{
    v1 = (POINTER)0;
    v2 = (POINTER)0x0B1234;
    v1 += (ULONG)v2;       // This assigns 0x1234 to v1, I think it should assign 0xB1234

    v3 = (POINTER)0;
    v4 = (POINTER)0x0B1234;
    temp_ptr = (ULONG)v3;
    v3 = temp_ptr + v4;    // This statement assigns 0xB1234 to v3 correctly.

    if (v1 != v3)
        return -1;
    else
        return 0;
}
I don't use your C166 compiler, so I can't really give good answers about it. But does your compiler - at least with the memory model you are using - support huge pointer arithmetic?
Quite a number of compilers that have pointers larger than 64kB require a specific memory model, or a specific keyword on the pointer, to allow arithmetic to produce normalized pointers and/or to support 32-bit arithmetic.
By the way - exactly why do you play with this kind of arithmetic? It isn't really meaningful in a real program unless you do have an object that is larger than 64kB - and in that case, you don't need any type cast from pointer to integer, since the offset was an integer in the first place. The closest a "normal" program would get would be to have one pointer to the start of an object or memory region and another pointer to the first byte after the object - or to the next memory region - and then to compute the distance by subtracting the two pointers, as sketched below.
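As a sketch, that last well-defined case looks like this (the function and parameter names are mine):

#include <stddef.h>

/* Distance in bytes between two pointers into the SAME object or
   memory region - the only pointer subtraction the language defines. */
static size_t region_size(const unsigned char *start,
                          const unsigned char *end)
{
    return (size_t)(end - start);
}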
This code appears to produce the wrong result.
That's quite impossible, because that code produces nothing but undefined results. Those cannot possibly be wrong, because by definition whatever you get is correct.
At the "if" statement I expect v1 to equal v3
That expectation is unfounded.
@Hans-Bernhard,
I must admit that I failed to immediately find the problem - I even scanned the OP's code with PC-lint. Can you please explain why this code induces undefined behavior?
Using the ARM compiler, this code behaves according to the OP's comments.
explain why this code induces undefined behavior?
I didn't say undefined behaviour --- I said undefined result.
Any code treating integers and pointers as interchangeable eventually creates undefined results. For all the language cares, the program jumped off the tracks at the point the first non-zero integer was stored in a pointer variable.
Any expectations about the behaviour of such code are essentially built on vacuum.
You pretty much answered your own question there. The result of that code changes just because you switch compilers --- because its result is undefined.
The fact that an ARM's address space is naturally 32-bit, while that of a C166 is basically 16-bit, quite probably caused the OP's expectations to be broken in the first place.
@Per Westermark But does your compiler - at least with the memory model you are using - support huge pointer arithmetic?
Yes.
@Per Westermark By the way - exactly why do you play with this kind of arithmetic?
It is used in the implementation of relocatable, self-referencing data structures. The 64KB data structure is initially compiled on a PC based on a large number of configuration files.
The PC based compilation includes setting "pointers" in the data structure to offsets within the data structure. Those "pointers" contain only ULONG values since the absolute memory location of the data structure is not known yet.
I put "pointers" in quotes above because they are not really pointers, at this stage they are only ULONG offsets from the start of the data structure.
When the software executes on the XC161, the data structure is loaded from flash into one of multiple possible RAM locations. Those locations are, by necessity, on 64KB boundaries.
A small routine scans through the data structure, adding the absolute RAM location of the data structure to the ULONG offsets in the data structure (sketched below). That produces absolute pointers that are then used by the code without having to constantly re-compute the offset from the start of the structure.
The reason it was done this way was for efficiency in both code space and execution time.
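A minimal sketch of that fix-up pass, purely illustrative - the structure layout, names, and offset-table convention here are made up, not my real code:

#include <stddef.h>

typedef unsigned long ULONG;

/* Hypothetical layout: the blob starts with a count of "pointer" slots,
   followed by the slots themselves, stored as ULONG offsets from the
   start of the blob. */
typedef struct {
    ULONG slot_count;
    ULONG slots[1];        /* really slot_count entries */
} RelocBlob;

/* Turn each relative offset into an absolute address by adding the
   base address the blob was loaded at. Runs once, after loading. */
static void fixup_offsets(RelocBlob *blob, ULONG base)
{
    ULONG i;
    for (i = 0; i < blob->slot_count; i++)
        blob->slots[i] += base;    /* offset -> absolute address */
}

After this pass each slot can be cast straight to a pointer - which is exactly where the 32-bit integer-to-pointer conversion has to behave linearly for the scheme to work.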
Regards Paul
@Hans-Bernhard Broeker Any code treating integers and pointers as interchangeable eventually creates undefined results. For all the language cares, the program jumped off the tracks at the point the first non-zero integer was stored in a pointer variable.
While I appreciate your purist point of view, when coding for embedded systems there are times when you need to - in fact MUST - cast integers to pointers.
How can you do this without casting?
int *hardware_address = (int *)0x7FFF;
@Tamir Michael Using the ARM compiler, this code behaves according to the OP's comments.
I'm sorry I don't understand which way you mean that comment.
Are you seeing the same unexplained behaviour, i.e. that v1!=v3? Or are you seeing what I expected to see, that v1==v3?
Pointers often don't support arbitrary additions/subtractions of integers.

The next thing is that, with a pointer larger than 16 bits on an architecture that has a 16-bit int, the compiler may not support the use of offsets larger than 16 bits.

It is quite common for compilers to prioritize efficient code by assuming that normal programs want pointers that may span the available memory, but don't need to access any data structure larger than what the processor can efficiently add with a single instruction, or 64kB, whichever is larger. That limit may be smaller or larger depending on the instruction set of the target processor. See more below.
With the segmented pointers of the 8086 processors, a large pointer has a 16-bit segment address that (in real mode) spans 1MB, because the value is multiplied by 16, plus a 16-bit offset. All pointer arithmetic works only on the 16-bit offset, so memset() etc. are limited to at most 64kB of data. Most 8086 compilers also implement a huge pointer, where the offset is constantly normalized to a value 0..15, and where pointer arithmetic affects both offset and segment, allowing the pointer to be stepped forward more than 64kB without any offset overflow. A sketch of that normalization is below.
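A minimal sketch of what that normalization amounts to, in plain C doing the segment arithmetic by hand (the seg/off names are mine):

typedef unsigned long ULONG;
typedef unsigned short USHORT;

/* Real-mode 8086: linear address = segment * 16 + offset. */
static ULONG to_linear(USHORT seg, USHORT off)
{
    return ((ULONG)seg << 4) + off;
}

/* "Huge" normalization: push as much of the address as possible into
   the segment, leaving an offset in the range 0..15, so that later
   16-bit offset arithmetic cannot wrap inside a large object. */
static void normalize(USHORT *seg, USHORT *off)
{
    ULONG linear = to_linear(*seg, *off);
    *seg = (USHORT)(linear >> 4);
    *off = (USHORT)(linear & 0x0F);
}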
This link shows the far pointer for the C166, and describes that it is not a linear 32-bit number. There are even bits in the offset part that are not used, so the offset is limited to 14 bits: http://www.keil.com/support/man/docs/c166/c166_ap_farptr.htm
But in the C166, even the huge pointer is limited. The documentation says it can only address objects up to 64kB in size: http://www.keil.com/support/man/docs/c166/c166_ap_hugeptr.htm
So a program that needs to play with even larger structures needs the xhuge pointer type: http://www.keil.com/support/man/docs/c166/c166_ap_xhugeptr.htm
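Going by that last page, the fix for the original code might be as small as declaring the pointers xhuge, so that all the arithmetic is done as linear 32-bit math. An untested sketch - check it against the Keil documentation before relying on it:

typedef unsigned char BYTE;

/* xhuge pointers are 4 bytes and, per the page above, use full 32-bit
   arithmetic instead of segment-limited offset arithmetic. */
typedef BYTE xhuge *POINTER;

POINTER v1, v2;

void example(void)
{
    v1 = (POINTER)0;
    v2 = (POINTER)0x0B1234;
    v1 += (unsigned long)v2;    /* should now really yield 0xB1234 */
}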
The ARM has nice 32-bit linear pointers behaving identically to 32-bit integers, so type casting and adding/subtracting/multiplying will give nice results.
But see the links in my other post regarding the C166.
A developer who does play fancy games with pointers would do better to create a target-specific function that _creates_ a valid pointer based on some input data. Then you can conditionally compile this function depending on the target hardware.
The follow-up here is: the standard doesn't claim that it is valid to typecast between pointer and integer or the reverse. But 0 is a null pointer.
So whenever you feel like typecasting between integer and pointer you need to have target-specific knowledge to understand if it is valid or not and exactly what the result will be.
The result of the above cast is completely dependent on compiler and target architecture. For almost all 8-bit and 16-bit processors it will result in a good pointer - not because the language standard promises that, but because lots of processors just happen to have 16-bit pointers storing linear addresses identically to the storage of 16-bit integers.
Many compilers for more "interesting" target architectures have target-specific helper macros to create pointers from integers - maybe MK_FP(xxx) or similar. But whether the coder uses them or not doesn't change the fact: any program that needs to create non-null pointers from integers is non-portable and does require the developer to have target-specific knowledge. A sketch of such a helper is below.
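A minimal sketch of such a target-specific helper - the TARGET_C166 feature macro is hypothetical, something your build system would have to define:

typedef unsigned long ULONG;

#if defined(TARGET_C166)
/* Segmented target: an xhuge pointer keeps the arithmetic linear
   (see the Keil links above for the exact pointer layouts). */
typedef void xhuge *generic_ptr;
#else
/* Linear target: a plain pointer will do. */
typedef void *generic_ptr;
#endif

/* The one place in the code base that knows how an integer becomes a
   pointer on this target; everything else calls make_pointer(). */
static generic_ptr make_pointer(ULONG address)
{
    return (generic_ptr)address;
}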
@Hans-Bernhard Broeker
I think that Tamir Michael may have been saying the ARM compiler does exactly what the XC161 compiler does - at least that's how I read it. I've asked for clarification.
I too believe it is the segmented addressing of the XC161 that is causing the problem. However, I think the compiler is not allowing for that segmentation the way I expect it to - see the sketch below for what I suspect is happening.
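A worked guess at the arithmetic, assuming (per the far-pointer page linked earlier) that pointer arithmetic only ever updates the offset part of the pointer:

v1 = (POINTER)0;        /* segment 0, offset 0                       */
v1 += (ULONG)v2;        /* pointer arithmetic: only the offset part  */
                        /* is updated, so the 0xB0000 portion of the */
                        /* addend never reaches the segment part     */
                        /* => v1 reads back as 0x1234                */

temp_ptr = (ULONG)v3;   /* plain 32-bit integer: 0                   */
v3 = temp_ptr + v4;     /* still pointer arithmetic, but the integer */
                        /* being added is 0, so v4 passes through    */
                        /* unchanged => v3 reads back as 0xB1234     */

If that reading is right, the difference between the two paths is simply which operand is the pointer when the truncating offset-only addition happens.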
By the way, I disagree with you that it is undefined. I know what the target hardware is, and I know what the target CPU architecture is. I am not writing "portable" code to run on multiple disparate architectures. C is a language that was/is designed to access the hardware directly; otherwise it is a pretty pointless language with which to develop embedded systems.
For example, this is taken from here: publications.gbdirect.co.uk/.../pointers.html
The only values that can be assigned to pointers apart from 0 are the values of other pointers of the same type. However, one of the things that makes C a useful replacement for assembly language is that it allows you to do the sort of things that most other languages prevent. Try this:
int *ip;
ip = (int *)6;
*ip = 0xFF;
What does that do? The pointer has been initialized to the value of 6 (notice the cast to turn an integer 6 into a pointer). This is a highly machine-specific operation, and the bit pattern that ends up in the pointer is quite possibly nothing like the machine representation of 6. After the initialization, hexadecimal FF is written into wherever the pointer is pointing. The int at location 6 has had 0xFF written into it—subject to whatever ‘location 6’ means on this particular machine.
@Per Westermark
The standard doesn't claim that it is valid to typecast between pointer and integer or the reverse. But 0 is a null pointer.
I know that, but it is necessary sometimes in embedded systems.
I agree I need to know the target architecture. That's the whole point of my question.
What exactly is the compiler doing in the situation where the value is not computed as I expect it to be? What interaction between the XC161's segmented addressing and the compiler's casting of POINTER to ULONG is impacting unexpectedly on the computation?
If anyone can answer that, it will go a long way to helping me understand the target hardware just that little bit better.
I know I can solve it using various workarounds, if all I wanted to do was to solve the immediate problem and move on then I wouldn't have even bothered posting the question. I have a desire/need to learn what is going on under the hood.
I've just realized a bunch of my replies have been posted "inline" earlier in the thread. Not sure if you've read through them, but hopefully they help explain what I'm doing, why I'm doing it and the answer(s) that I'm trying to find.