
Failure in sizeof() function while using Member-Function-Pointers

When using member-function pointers I noticed that each of them occupies 8 bytes. However, when I ask the compiler via sizeof(), it reports only 4 bytes. If your program has a couple of objects which contain member-function pointers and are created dynamically with new, it will always crash.

Please refer to the following source-code and documentation:


#include  <stdio.h>
#include  <Reg164ci.h>

/*
        A simple external object
        */
class cCalculator {

        public:
                int calculate();

};

int cCalculator::calculate(){
        return 27;
}

/*
        The description of the member-function-pointer
*/
typedef int (cCalculator::*fptr)();



/*
        An object with an array of such member-function-pointers
        which is able to call these functions by an index.
*/
class cAnObject  {
public:
        cAnObject();
        ~cAnObject();

        fptr fptr_array[2];             // 2 x 8 bytes = 16 bytes (see watch window)

        int member1;                    // 2 bytes
        int member2;                    // 2 bytes

                                        // --------------------------------------
                                        // total: 20bytes = 0x14bytes

};
/*
        The constructor
*/

cAnObject::cAnObject()  {
  member1 = 1;
  member2 = 2;
}

/*
        The destructor
*/
cAnObject::~cAnObject()  {
  member1 = 0;
  member2 = 0;
}

int main (void)  {

        // now we have the external object
        cCalculator calc1;
        cCalculator calc2;

        // and a couple of objects containing some function pointers
        cAnObject test1;
        cAnObject test2;
        cAnObject test3;

        fptr memberfunction = &cCalculator::calculate;
        test1.fptr_array[0] =  memberfunction;


        // here we get the size of our objects (see watch window)
        volatile unsigned long object_addr1 = (unsigned long) &test1;
        volatile unsigned long object_addr2 = (unsigned long) &test2;
        volatile unsigned long object_sizeA = object_addr2 - object_addr1;
        volatile unsigned long object_sizeB = sizeof( test1 );
        //...

        // please notice that the distance between the static objects
        // differs from their size. Why?
        // count the bytes and you will see that sizeof() is wrong!


        // now we simply create the dynamic object (with the wrong size;
        // is the compiler to blame?), which leads to an -access violation-
        // here because the size of the function pointers is miscalculated
        cAnObject * p_test4;
        p_test4 = new cAnObject;
        // and set the function
        p_test4->fptr_array[0] = memberfunction;

        // calc the size
        volatile unsigned long objectSize4 = sizeof(    (*p_test4)      );


        // ----
        // now the function call itself
        cCalculator * p_calc;

        // here we select our target object
        p_calc = &calc1;

        // do the static object call by an index == 0
        fptr p_fct1 = test1.fptr_array[0];
        int x = ( (p_calc)->*( p_fct1 ))( );

        // now use the other object
        p_calc = &calc2;

        // do the dynamic object call   by an index == 0
        fptr p_fct4 = p_test4->fptr_array[0];
        int y = ( (p_calc)->*( p_fct4 ))( );


        // if you have time, please note that there is a second bug:
        //   remove the line "fptr p_fct4 = ..." and put the call
        //   into one line like "   int y = ( (p_calc)->*( p_test4->fptr_array[0] ))( );      "
        // and your whole program will crash!

        // just print the result
        printf("%d = %d \r", x, y );

  while (1);
}

Parents
  • Well then:

    In the debugger's watch window I can see 8 bytes for each member-function pointer in the array, which is proof enough for me. If the debugger says it is 8 bytes, then "I assume" it is correct.

    Now we can come to malloc. As far as I know, a simple new in EC++ is this:

    void *operator new(size_t size)  {
      void *ptr;
    
      if (size == 0) size = 1;
      ptr = (void *) malloc (size);
      if (ptr == NULL)  {
        __abort_execution (ec_outofmemory);
      }
      return (ptr);
    }
    


    --Basically a malloc with a given size. And malloc() is actually standard C, or is this only an assumption?

    As malloc doesn't know anything about the object to be created, it relies on the given size. This is why the size needs to be calculated correctly. In my example, sizeof() reports a size of 0xc for my object.

    Whenever I take a pointer p0 as proposed by "Per Westermark" and then take the next object (*p1 = p0+1), the compiler itself calculates the offset for "+1". As seen in the test results, the distance between p0 and p1 is 0x14. So although malloc would provide only 0xc bytes, the compiler would expect the next object at an offset of 0x14. Whatever lies beyond 0xc in memory is going to be overwritten.

    @Per Westermark: You are right, "Kernighan & Ritchie" only stated that there is management overhead for a value holding the current object size and a pointer to the next free memory space. But I noticed that in C166 it is 6 bytes, which can be determined by simply getting two pointers from malloc.

    As shown with the test cases above, we have a simple test case without an answer.
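One way to see which size the compiler really hands to the allocator, without consulting the debugger, is to give the class its own operator new that records its size argument. The following is only a minimal sketch in standard C++, not the EC++ runtime code quoted above; `Target`, `Probe`, `last_requested` and `new_requests_sizeof` are hypothetical names standing in for cCalculator and cAnObject.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for cCalculator.
class Target {
public:
    int calculate() { return 27; }
};
typedef int (Target::*fptr)();

// Hypothetical stand-in for cAnObject, with a class-specific allocator.
class Probe {
public:
    fptr fptr_array[2];
    int member1;
    int member2;

    // Remembers the size the compiler asked for, then defers to malloc.
    static void* operator new(std::size_t size) {
        last_requested = size;
        void* ptr = std::malloc(size == 0 ? 1 : size);
        if (ptr == 0) std::abort();   // assumption: abort when out of memory
        return ptr;
    }
    static void operator delete(void* ptr) { std::free(ptr); }

    static std::size_t last_requested;
};

std::size_t Probe::last_requested = 0;

// Returns true when the size passed to operator new equals sizeof(Probe).
bool new_requests_sizeof() {
    Probe* p = new Probe;
    assert(p != 0);
    p->fptr_array[0] = &Target::calculate;
    bool ok = (Probe::last_requested == sizeof(Probe));
    delete p;
    return ok;
}
```

If `new_requests_sizeof()` returns false on a given toolchain, the size used for allocation and the size reported by sizeof disagree, which would point at the compiler rather than at malloc.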

Children
  • Note that the debugger and the compiler are two different things, so if a bug is involved, the debugger may also be in error. Or the compiler may produce correct code but incorrect debug information, and thereby fool the debugger.

    I still do not like your output for p0+1 which does show a _negative_ size...

    Is there a difference between values you get with the debugger, and values you get by letting the compiler-generated code print sizeof(), and actual addresses? That should tell if the compiler does produce invalid code.

    About memory allocation: Some heap solutions will allow you to perform two allocations and compare the two pointers to deduce the amount of memory used for book-keeping and aligning (as long as the heap isn't fragmented). Some heap solutions - such as the buddy-system - may allocate fixed-sized blocks at "semi-random" locations, and may only need a (separately stored) single bit to keep track of allocated/free memory pages. The address and/or the subdivision level to reach the memory page will imply the size of the memory block.

    That is why I wrote that the only thing you do know about a returned pointer is that it has at least enough alignment for the data, and that it points to an area of at least the requested number of bytes. Whether there are any hidden data structures before the returned address is unknown. Whether there are any extra bytes at the end of the block (because of padding, paging, ...) is unknown. This may also change between two versions of the runtime library, so the programmer may never make any assumptions based on previous experience. Not even whether the second allocation will place its memory block at a higher or lower memory address.
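As a hedged illustration of the point about allocation guarantees: the sketch below uses only what malloc promises, namely that each block holds at least the requested number of bytes and that two live blocks do not share storage. It deliberately never compares or subtracts the two pointers. The function name `blocks_are_independent` is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Allocates two blocks of n bytes, fills each with its own pattern and
// checks that neither write disturbed the other block.
// Touches exactly the n requested bytes of each block and nothing else.
bool blocks_are_independent(std::size_t n) {
    assert(n > 0);
    unsigned char* a = static_cast<unsigned char*>(std::malloc(n));
    unsigned char* b = static_cast<unsigned char*>(std::malloc(n));
    if (a == 0 || b == 0) {
        std::free(a);   // free(0) is a harmless no-op
        std::free(b);
        return false;
    }

    std::memset(a, 0x11, n);
    std::memset(b, 0x22, n);

    bool ok = true;
    for (std::size_t i = 0; i < n; ++i) {
        ok = ok && (a[i] == 0x11) && (b[i] == 0x22);
    }

    std::free(a);
    std::free(b);
    return ok;
}
```

Any code that needs more than this, for example the gap between the two blocks or their order in memory, depends on the particular heap implementation.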

  • by Per Westermark:
    "Is there a difference between values you get with the debugger, and values you get by letting the compiler-generated code print sizeof(), and actual addresses? That should tell if the compiler does produce invalid code."

    Yes, there actually is. When I use sizeof() in the debugger it says 0x14 (the distance between the two pointers p0 and p1). Please remember that I tried to create two dynamic objects, where a write to the second object overwrote data in the first object. When I wrote to the first object, which is at the end of my memory pool, and ran the simulator, I received the -access violation- error. And indeed the "map" command of the debugger shows that a write beyond the allowed area took place.

    As you stated, I noticed a difference between the internal library sizeof() and the one which is executed via the debugger.

    Thanks for your help on this.

  • Oh sorry, that actually was a typo. I was probably too enthusiastic. You are right, it is of course not negative; it is the other way round (but the same values!):

    correction

            cAnObject c;
            cAnObject *p0 = &c;
            cAnObject *p1 = p0+1;
    
    >> p0 = 0x10fac
    >> p1 = 0x10fc0
    

  • There is no internal library sizeof() function. sizeof is an operator (not a function, and it doesn't need any parentheses when applied to an expression) that works with meta-data, and the compiler will calculate all size information directly when compiling, based on the type information.

    The debugger, on the other hand, will have to look at the debug information generated by the compiler to produce its information.
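A small sketch of what "sizeof is an operator" means in practice, written in standard C++ (C++11 for static_assert, which the EC++ toolchain discussed here may not offer). `Sample`, `counted` and the helper functions are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Sample {
    int a;
    int b;
};

// Evaluated entirely at compile time from the type information.
static_assert(sizeof(Sample) >= 2 * sizeof(int), "Sample holds two ints");

int calls = 0;
int counted() { ++calls; return 0; }

// Applied to a type name, sizeof needs parentheses.
std::size_t size_via_type() { return sizeof(Sample); }

// Applied to an expression, the parentheses are optional.
std::size_t size_via_expression() {
    Sample s = {1, 2};
    return sizeof s;
}

// The operand is not evaluated: counted() is never actually called here.
std::size_t size_of_unevaluated_call() { return sizeof counted(); }

// Both spellings must agree, since they describe the same type.
void check_sizes() { assert(size_via_type() == size_via_expression()); }
```

Because no code runs to produce the value, a wrong sizeof result can only come from the compiler's own type layout, never from a runtime library.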

  • If the distance between two objects in an array is 12 bytes (0x0c) but the distance when stepping a pointer is 20 bytes (0x14) then the compiler is most definitely broken!

    The sizeof operator must always produce the same value as the distance between two elements in an array, which has to be the same as the amount a pointer is incremented to step one object forward/backward.

    This is why a pointer of a base-class type may not be used together with arrays of objects - the compiler would not know how much to increment the pointer to reach the next element. It would assume the size of the base class.
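The invariant described above can be checked by compiled code alone. The sketch below measures the byte distance between two adjacent array elements and the byte distance covered by stepping a typed pointer by one, and compares both with sizeof. `Target`, `Obj`, `objects` and the function names are hypothetical stand-ins for the classes in the original example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cCalculator.
class Target {
public:
    int calculate() { return 27; }
};
typedef int (Target::*fptr)();

// Hypothetical stand-in for cAnObject.
struct Obj {
    fptr fptr_array[2];
    int member1;
    int member2;
};

static Obj objects[2];

// Byte distance between two adjacent array elements.
std::size_t array_stride() {
    const char* first = reinterpret_cast<const char*>(&objects[0]);
    const char* second = reinterpret_cast<const char*>(&objects[1]);
    return static_cast<std::size_t>(second - first);
}

// Byte distance covered by stepping a typed pointer one object forward.
std::size_t pointer_step() {
    Obj* p0 = &objects[0];
    Obj* p1 = p0 + 1;
    return static_cast<std::size_t>(reinterpret_cast<const char*>(p1) -
                                    reinterpret_cast<const char*>(p0));
}

// True when sizeof, the array stride and the pointer step all agree.
bool layout_is_consistent() {
    return array_stride() == sizeof(Obj) &&
           pointer_step() == sizeof(Obj) &&
           sizeof(objects) == 2 * sizeof(Obj);
}
```

On a conforming compiler `layout_is_consistent()` must return true; a false result would confirm the kind of compiler defect discussed in this thread.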

  • Possibly it comes from the difference between linker and compiler. The interesting point is that the array from above (cAnObject array[2];) has elements which have a distance of 0x14 although sizeof says it has only a size of 0xc.
    So this basically doesn't matter if you have static objects. But creating them dynamically causes errors due to the wrong internal sizeof.

  • "The interesting point is that the array from above (cAnObject array[2];) has elements which have a distance of 0x14 although sizeof says it has only a size of 0xc."

    I don't know how to read this sentence.

    You said earlier that the size of a two-element array was 0x18 (24), which was twice the size of a single element (12). This should imply that the distance between the elements was 12 bytes, the same as sizeof gives.

    Where did you deduce that the distance between two elements in the array was 20 bytes? Note yet again that the debugger may be in error. If so, it may say that the distance is 20 bytes and show broken data in the second element.

  • Sorry, I will try again.

    You proposed to create an array (cAnObject array[2];), for which sizeof reported a size of 0x18, so a single object has a size of 0xc (see testresults1).

    Now the strange thing:
    When I create two pointers to cAnObject which point to the first and the second element of the array, their difference is 0x14. This is the same thing you figured out with testresults2.

    But 0xc != 0x14, which underlines your point:
    "The sizeof operator must always produce the same value as the distance between two elements in an array, which has to be the same as the amount a pointer is incremented to step one object forward/backward."

  • You continue to claim: "So this basically doesn't matter if you have static objects. But creating them dynamically causes errors due to the wrong internal sizeof."

    If the compiler does not use the same offset when doing array[1] or p0[1] or *(p0+1) it really doesn't matter if you have static or dynamic objects. The compiler would not be able to produce correct code for static arrays either.

    The question here is still: Have you produced a complete set of figures from the compiler, i.e. without gleaning info from the debugger windows?
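A set of figures produced by the compiled program itself, rather than by the watch window, could be gathered along these lines. This is only a sketch: `Target`, `Obj` and `report_offsets` are hypothetical names, and it assumes printf output is available on the target (for example via a serial port or the simulator).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for cCalculator.
class Target {
public:
    int calculate() { return 27; }
};
typedef int (Target::*fptr)();

// Hypothetical stand-in for cAnObject.
struct Obj {
    fptr fptr_array[2];
    int member1;
    int member2;
};

// Prints sizes and addresses as seen by the generated code and returns
// true when array[1], p0[1] and *(p0+1) all name the same object.
bool report_offsets() {
    static Obj array[2];
    Obj* p0 = array;

    std::printf("sizeof(Obj)   = %lu\n",
                static_cast<unsigned long>(sizeof(Obj)));
    std::printf("sizeof(array) = %lu\n",
                static_cast<unsigned long>(sizeof(array)));
    std::printf("&array[0]     = %p\n", static_cast<void*>(&array[0]));
    std::printf("&array[1]     = %p\n", static_cast<void*>(&array[1]));
    std::printf("&p0[1]        = %p\n", static_cast<void*>(&p0[1]));
    std::printf("p0 + 1        = %p\n", static_cast<void*>(p0 + 1));

    return &array[1] == &p0[1] && &p0[1] == p0 + 1;
}
```

If the printed addresses differ from what the debugger displays, the debug information is the more likely culprit; if they differ from each other, the generated code itself is wrong.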

  • Well, I received the variable data from the watch window, which includes (size_obj1 >> 0xC, size_array >> 0x18, >> p0 = 0x10fac, >> p1 = 0x10fc0 ).

    Is this ok?

  • @Hans-Bernhard Broeker:

    Having read this post, I can say you are right: I claimed a couple of things which aren't backed up properly. After hours of searching for the failure in our program I thought I had a rough understanding of what was going on, so I based my assumptions about what might be wrong on experience. But I also noticed a misbehaviour of the compiler/linker/debugger or whatever else. So I posted the code example from above, which obviously didn't bring the point across.

    Finally, I support the post "If the distance between two objects in an array is 12 bytes (0x0c) but the distance when stepping a pointer is 20 bytes (0x14) then the compiler is most definitely broken!",
    as this is actually true, which can be seen in testresults1/2.

    Perhaps you have time to copy and paste the code from above and follow the idea for yourself. I think it's worth trying.

    Thanks.