OMAP3: encountered a blocking structure assignment, occurring only on OMAP3 processors

  • Note: This was originally posted on 27th July 2012 at http://forums.arm.com

    That sounds like a compiler problem to me. Can you use objdump and see what the assembly is doing? There should be no functional difference between individual assignment vs structure assignment, so diffing the disassembly would be interesting.

    If you can narrow it down to the one instruction that blocks, along with the address it is accessing, that would be useful.
  • Note: This was originally posted on 29th July 2012 at http://forums.arm.com

    I've not looked in much detail, but as a thought, your code may not be compliant with the C++ standard; are you sure that "(m_buffer->bb + offs)" meets the alignment requirements of a "c_t"/"a_t" structure? It could be that the compiler is legally assuming a certain alignment for the structure copy and is using instructions which turn out to be incompatible with the alignment produced by your cast from a char array pointer.

    hth
    s.
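
The alignment question raised in this post can be checked at run time. The following is a minimal sketch, not code from the thread: "a_example" is a hypothetical stand-in for "a_t" (whose real layout is not shown here), and the two helper names are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the thread's a_t; the real layout is not shown.
struct a_example {
    std::uint64_t stamp;
    std::uint32_t id;
};

// True if p is suitably aligned to be accessed as a T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Number of bytes by which p misses the alignment of T (0 means aligned).
template <typename T>
std::size_t misalignment_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T);
}
```

Calling "is_aligned_for<a_example>(m_buffer->bb + offs)" just before the cast would show whether a given offset is safe to treat as a structure.
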
  • Note: This was originally posted on 30th July 2012 at http://forums.arm.com

    I'm now almost certain that you are provoking the generation of LDM/STM to perform what turns out to be an unaligned copy. Instrumenting the "c->ca = a;" assignment yields something like:

    &(c->ca) = 0x9bb2018, &(a) = 0xffa37cd4, sizeof a = 24
    &(c->ca) = 0x9bb2053, &(a) = 0xffa37cd4, sizeof a = 24
    &(c->ca) = 0x9bb208e, &(a) = 0xffa37cd4, sizeof a = 24
    &(c->ca) = 0x9bb20c9, &(a) = 0xffa37cd4, sizeof a = 24
    &(c->ca) = 0x9bb2104, &(a) = 0xffa37cd4, sizeof a = 24

    where it can be seen that "&(c->ca)" can become completely unaligned.

    I suspect you need one of the following: add "__attribute__((__packed__))" to the structure definitions; find a switch that disables the use of LDM/STM (both of these are gcc specific ways of addressing this); or make the code [more] standards compliant by changing your structure extension array to be of the correct type rather than char (i.e. avoid the cast of a char pointer to a structure pointer).

    hth
    s.
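
The logged addresses in this post can be checked by hand: consecutive values of "&(c->ca)" differ by 0x3b (59) bytes, so the offset from a word boundary changes on every iteration. The sketch below reproduces that arithmetic and also shows what the packed attribute does to a structure's alignment. The helper "word_offset" and the type "packed_a" are hypothetical names, "packed_a" is not the thread's real "a_t", and the attribute syntax assumes GCC or a compatible compiler.

```cpp
#include <cstdint>

// Bytes by which addr misses a 4-byte (word) boundary.
// LDM/STM expect word-aligned addresses.
constexpr unsigned word_offset(std::uintptr_t addr) {
    return static_cast<unsigned>(addr % 4u);
}

// The five destination addresses logged in the post.
static_assert(word_offset(0x9bb2018) == 0, "word aligned");
static_assert(word_offset(0x9bb2053) == 3, "off by 3");
static_assert(word_offset(0x9bb208e) == 2, "off by 2");
static_assert(word_offset(0x9bb20c9) == 1, "off by 1");
static_assert(word_offset(0x9bb2104) == 0, "word aligned again");

// Hypothetical packed structure (GCC/Clang attribute syntax).
// Packing removes padding and lowers the required alignment to 1,
// so the compiler may no longer assume word-aligned access.
struct __attribute__((__packed__)) packed_a {
    std::uint64_t stamp;
    std::uint32_t id;
};

static_assert(alignof(packed_a) == 1, "packed types need no alignment");
static_assert(sizeof(packed_a) == 12, "no tail padding");
```

Only two of the five logged destinations are word aligned, which is consistent with the assignment failing intermittently rather than every time.
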
  • Note: This was originally posted on 30th July 2012 at http://forums.arm.com

    GCC will automatically identify the source of these issues if you use "-Wcast-align":

    ...In member function 'void blocking_assignment::work()':
    ...warning: cast from 'char*' to 'blocking_assignment::b_t*' increases required alignment of target type
    ...warning: cast from 'char*' to 'blocking_assignment::c_t*' increases required alignment of target type

    hth
    s.
  • Note: This was originally posted on 1st August 2012 at http://forums.arm.com

    Thank you all for your fast and helpful answers. It seems we have stumbled into serious alignment problems.

    As we have to keep our source working on different platforms and need to exchange our buffers between all of them, we need a portable solution which uses the same binary representation on all platforms involved.

    I can get the example working on all platforms if I use the following:

    .....
    if (m_enable_workaround) {

       union a_u {
          char *p;
          a_t *s;
       } t;
       t.s = &a;
       memcpy (&(c->ca),t.p,sizeof(a_t));

    } else {
       c->ca = a;
    }
    .....

    I haven't yet measured the performance impact of this construct.

    Regards
    Wolfgang
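
Since memcpy takes void pointers, the same copy can be written without the union. The following is a minimal sketch under that assumption. The helpers "store_unaligned" and "load_unaligned" and the type "record_example" are hypothetical names, not part of the original code. It copies the native object representation only, so it does not by itself address byte order or padding differences between platforms.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical record type standing in for the thread's a_t.
struct record_example {
    std::uint64_t stamp;
    std::uint32_t id;
};

// Copy a value into a byte buffer at any address, aligned or not.
template <typename T>
void store_unaligned(void* dst, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::memcpy(dst, &value, sizeof(T));
}

// Copy a value out of a byte buffer into a properly aligned local object.
template <typename T>
T load_unaligned(const void* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    T value;
    std::memcpy(&value, src, sizeof(T));
    return value;
}
```

Reading the record back through "load_unaligned" rather than through a cast pointer means the fields are only ever accessed in a correctly aligned object.
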
  • Note: This was originally posted on 1st August 2012 at http://forums.arm.com

    This will "probably work" because most memcpy implementations include alignment checks, and fall back to a slower copy if src and/or dst are unaligned. However you still end up with a structure with incorrect alignment in memory. If you try and use that later as a real structure then you may well hit other problems. The compiler could use LDRD/STRD to read or write the 64-bit fields, for example, and as these require 64-bit alignment, you would hit the same problem.

    A quick and dirty approximation of binary compatibility across memory systems: if you align the base pointer of every structure on an 8-byte boundary, you are unlikely to hit a problem on any commonly used architecture.
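
The 8-byte rule can be applied by rounding each record offset up before it is used. This is a minimal sketch, assuming the buffer itself starts on an 8-byte boundary and that the alignment is a power of two; "align_up" is a hypothetical helper name.

```cpp
#include <cstddef>

// Round offset up to the next multiple of alignment.
// alignment must be a power of two; 8 follows the suggestion above.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment = 8) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

static_assert(align_up(0) == 0, "already aligned");
static_assert(align_up(24) == 24, "already aligned");
// The addresses logged earlier in the thread are 59 (0x3b) bytes apart;
// rounding that stride up gives every record an 8-byte aligned start.
static_assert(align_up(59) == 64, "59 rounds up to 64");
```

With a 64-byte stride in place of the 59-byte one, every record in an 8-byte aligned buffer starts on an 8-byte boundary, at the cost of 5 bytes of padding per record.
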