This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

LDRD instruction confused me, please help

I used keil mdk v5.1 for arm to develop my project.
I encounterd a serious problem about LDRD instruction.
Look at c codes below:

#include <stdint.h>

char b[2];
char c[4];
char d[2];

typedef struct {
        uint32_t                m_front;                        /* ¶ÓÍ·         */
        uint32_t                m_rear;                         /* ¶Óβ        */
        uint16_t                m_maxData;                      /* ¶ÓÁÐÖÐÔÊÐí´æ´¢µÄÊý¾Ý¸öÊý */

        uint8_t         (* ReadEmpty)();                /* ¶Á¿Õ´¦Àíº¯Êý             */
        uint8_t         (* WriteFull)();                /* дÂú´¦Àíº¯Êý             */
} DataQueue;

void SystemInit()
{
}

void main()
{
    DataQueue* p = (DataQueue*)c;

    char* q = c;
    char* xx = b;
    char* yy = d;

    if (p->m_front == p->m_rear)
    {
        int i = 0;
    }
}

disassembly of part of c codes is below

    28:     if (p->m_front == p->m_rear)
    29:     {
0x08000194 E9D04500  LDRD     r4,r5,[r0,#0]
0x08000198 42AC      CMP      r4,r5
0x0800019A D101      BNE      0x080001A0
    30:         int i = 0;
0x0800019C 2400      MOVS     r4,#0x00
    31:     }

I started a debug session.
When I stepped over sentance "if (p->m_front == p->m_rear)",
I found that program crashed(Hard fault error).

i thought that LDRD need value of r0 to a multiple of 4,
but r0 value was 0x20000002 at this time.

so i have to add __align(4) keyword to make sure that address of "c" is multiple of 4 and everything is fine.
you can see the codes below

#include <stdint.h>

char b[2];
__align(4) char c[4];
char d[2];

typedef struct {
        uint32_t                m_front;                        /* ¶ÓÍ·         */
        uint32_t                m_rear;                         /* ¶Óβ        */
        uint16_t                m_maxData;                      /* ¶ÓÁÐÖÐÔÊÐí´æ´¢µÄÊý¾Ý¸öÊý */

        uint8_t         (* ReadEmpty)();                /* ¶Á¿Õ´¦Àíº¯Êý             */
        uint8_t         (* WriteFull)();                /* дÂú´¦Àíº¯Êý             */
} DataQueue;

void SystemInit()
{
}

void main()
{
        DataQueue* p = (DataQueue*)c;

    char* q = c;
    char* xx = b;
    char* yy = d;

    if (p->m_front == p->m_rear)
    {
        int i = 0;
    }
}

i was confused if this was an bug of compiler?
Please help me

  • >>i was confused if this was an bug of compiler?

    No, this would be YOU casting an inappropriate structure to one that expects certain alignment. Casting usually means you know what you are doing, and you are telling the compile to do it anyway.

    This would be like casting a uint8_t* to a uint32_t* on an ARM7/9 and doing an unaligned read which the core does not support. And why portable code would use memcpy() to unpack a 32-bit word from an unaligned/packed memory/file structure.

    Sloppy things the x86 permitted often do not work on RISC load/store machines, where the additional logic and cycles are eschewed.

    Your char array isn't even big enough.

    DataQueue c[4]; would be viable

  • Dear Pier

    Thanks for your reply.
    I saw that char array is not big enough.
    But even it was enlarged, such issue still existed.

    My MCU was STM32F103C8 which core was cortex-m3.
    Assume that char array's first element's address is 0x20000000.
    I tested severval unaligned read below.

    #include <stdint.h>
    
    char array[200];                              //first element's address is 0x20000000
    
    typedef struct {
            uint32_t                m_front;                        /* ¶ÓÍ·         */
            uint32_t                m_rear;                         /* ¶Óβ        */
            uint16_t                m_maxData;                      /* ¶ÓÁÐÖÐÔÊÐí´æ´¢µÄÊý¾Ý¸öÊý */
    
            uint8_t         (* ReadEmpty)();                /* ¶Á¿Õ´¦Àíº¯Êý             */
            uint8_t         (* WriteFull)();                /* дÂú´¦Àíº¯Êý             */
    } DataQueue;
    
    void SystemInit()
    {
    }
    
    void main()
    {
    
        uint32_t arrayAddr = (uint32_t)array;
    
        uint32_t* pINT = (uint32_t*)(arrayAddr + 1);  //address is 0x20000001
        uint32_t valINT = *pINT;                      //It's OK. It used LRD instruction.
    
        float* pFLOAT = (float*)(arrayAddr + 1);      //address is 0x20000001
        float valFLOAT = *pFLOAT;                     //It's OK
    
        DataQueue* pQueue = (DataQueue*)(arrayAddr + 1);//address is 0x20000001
        uint32_t front = pQueue->m_front;                  //It's OK. It used LRD instruction.
        uint32_t rear = pQueue->m_rear;                    //It's OK. It used LRD instruction.
    
        if (pQueue->m_front == pQueue->m_rear)
        {
            front = rear;                             //crashed here
        }
    }
    
    

    I found that only LDRD instruction would crash program, but LRD instruction is OK.
    If i used IAR compiler to compile this program, it's OK.
    Because there is no LDRD instruction.

    There were many casting structure(packed or unpacked) such as DataQueue in my project.
    what should i do to prevent those crashes?

    Looking forward to your reply.

  • "I encounterd a serious problem about LDRD instruction."

    Not at all. You encountered a serious problem about your C skills. Why did you even involve any char array for that data queue? And why isn't there any code to initialize that DataQueue, since the default zero-initialization of global variables will not give you any values for m_maxData, ReadEmpty or WriteFull?

    A basic assumption when you are encountering serious problems should be that you have done something wrong - not that there are anything wrong with the tools. The C language standard and the compiler manual will contain the "contract" between you and the programming language/compiler. If you go outside of that contract, then you are on your own. Whenever you are about to make a type cast, it's time to really consider what is covered by that contract - it does not allow "any type" to "any type" conversions to be performed and still give a working program. Some situations can be directly caught and forbidden by the compiler, but other type cast situations will instead result in unexpected runtime behavior.

  • Your char array isn't even big enough.

    DataQueue c[4]; would be viable

    Either of the following

    unsigned char c[sizeof DataQueue];
    DataQueue c;
    

    might be closer to the OP's intent, though. Which of course leaves the glaring question what made him believe that he should use unstructured byte arrays to hold data structures, instead of just defining variables of the correct types. Odds are he's in the process of running off in entirely the wrong direction. Best we stop him early.

    Ultimately this triggers the same old rule I keep pointing out: whenever you feel the need to cast pointers in C code, odds are high you're doing something quite seriously wrong.

    There two ways of getting this wrong: either you're doing a cast that's not necessary at all, so you're just cluttering the source code. Or the cast can be plain wrong, because it overrides an important rule the compiler would have enforced otherwise, causing undefined behaviour of the resulting program. There are exceptions where a cast is both necessary and correct, but there's a good deal less of those than most newbies believe.

  • There were many casting structure(packed or unpacked) such as DataQueue in my project.
    what should i do to prevent those crashes?

    Get rid of all the "packed" attributes. They cause a whole lot of grief (including the unaligned accesses you're so worried about), for hardly any benefit.

  • If i used IAR compiler to compile this program, it's OK.

    No, it's not OK in any sensible meaning of the word.

    That program, as written, is terminally sick. The fact it doesn't die quite as horribly and immediately if built by a different compiler, or with different flags, doesn't make the program itself any healthier. It just means its manifest illness will express itself in a different way. Or do you really believe getting garbage results instead of an immediate hard fault is in any sense of the word "OK"?

    Every single line of that program which you commented as "It's OK" actually causes undefined behaviour. In case you don't recognize that term: that's the worst kind of error a C program can have while still being compilable at all.

    Your fixation on the instruction being used by the compiler is misleading you. It's not the compiler, nor the compiled code that's the problem here. It's the source code that's grossly incorrect.

  • Thanks a lot to Broeker and Westermark. I've learned so much from your replies.
    I was intent on writing a queue and source code is below.

    
    typedef struct
    {
            INT8U m_addr;
            INT8U m_length;
            INT8U m_buf[8];
    }QueueDataType;
    
    typedef struct {
            INT32U          m_front;
            INT32U          m_rear;
            INT16U          m_maxData;
    
            INT8U           (* ReadEmpty)();
            INT8U           (* WriteFull)();
            QUEUE_DATA_TYPE m_buf[1];
    } DataQueue;
    
    uint8_t QueueCreate(void *Buf,
                      uint32_t SizeOfBuf,
                      uint8_t (* ReadEmpty)(),
                      uint8_t (* WriteFull)()
                      )
    {
            DataQueue *Queue;
    
            if (Buf != NULL && SizeOfBuf >= (sizeof(DataQueue) + sizeof(QUEUE_DATA_TYPE)))
            {
                    Queue = (DataQueue *)Buf;
                    Queue->m_maxData = (SizeOfBuf - (uint32_t)sizeof(DataQueue)) / sizeof(QUEUE_DATA_TYPE);
                    Queue->m_front = 0;
                    Queue->m_rear = 0;
                    Queue->ReadEmpty = ReadEmpty;
                    Queue->WriteFull = WriteFull;
    
                    return QUEUE_OK;
            }
            else
            {
                    return NOT_OK;
            }
    }
    
    uint8_t QueueRead(QUEUE_DATA_TYPE *Ret, void *Buf)
    {
            uint8_t err;
            DataQueue *Queue;
    
            err = NOT_OK;
            if (Buf != NULL && Ret != NULL)
            {
                    Queue = (DataQueue *)Buf;
    
                    if (Queue->m_rear != Queue->m_front)
                    {
                            uint32_t front = (Queue->m_front + 1) % Queue->m_maxData;
    
                            *Ret = Queue->m_buf[front];
                            Queue->m_front = front;
    
                            err = QUEUE_OK;
                    }
                    else
                    {
                            err = QUEUE_EMPTY;
                            if (Queue->ReadEmpty != NULL)
                            {
                                    err = Queue->ReadEmpty(Ret, Queue);
                            }
                    }
            }
    
            return err;
    }
    
    uint8_t QueueWrite(void *Buf, QUEUE_DATA_TYPE* Data)
    {
            uint8_t err;
            DataQueue *Queue;
    
            err = NOT_OK;
            if (Buf != NULL && Data != NULL)
            {
                    uint32_t rear = 0;
    
                    Queue = (DataQueue *)Buf;
                    rear = (Queue->m_rear + 1) % Queue->m_maxData;
    
                    if (rear != Queue->m_front)
                    {
                            Queue->m_buf[rear] = *Data;
                            Queue->m_rear = rear;
    
                            err = QUEUE_OK;
                    }
                    else
                    {
                            err = QUEUE_FULL;
                            if (Queue->WriteFull != NULL)
                            {
                                    err = Queue->WriteFull(Queue, Data, Q_WRITE_MODE);
                            }
                    }
            }
    
            return err;
    }
    

    I thought that user can define an global buffer and use fucntion "QueueCreate" to initialize it.
    For example

    uint8_t gQueueBuffer[sizeof(DataQueue) + 20 * sizeof(QueueDataType)]
    
    void main()
    {
        if ( QueueCreate(gQueueBuffer, sizeof(gQueueBuffer), NULL, NULL) == QUEUE_OK )
        {
            // Create queue succ
        }
        else
        {
            // Create queue failed
        }
    }
    

    Both of you think it's a very bad way to cast a buffer pointer to a struct pointer.
    But in some cases, for example communications protocol,

    Offsets  0    1     2    3    4
             STX  ADDR  DAT  END  CRC
    

    I want to access data like this:

    typedef __packed struct
    {
        uint8_t STX;
        uint8_t ADDR;
        uint8_t DAT;
        uint8_t END;
        uint8_t CRC;
    }Cmd, *pCmd;
    
    void ProcessCmd(uint8_t* cmdBuf, uint16_t len)
    {
        pCmd p = (pCmd)cmdBuf;
    
        assert_param(len == sizeof(Cmd));
        if (p->STX == ..)
        {
            ....
        }
        ....
    }
    

    If I should not cast such buffer pointer to struct pointer, what should I do to access communication data?
    Or like

    if (cmdBuf[0] == ..)
    {
        ....
    }
    


    ?

  • I thought that user can define an global buffer and use fucntion "QueueCreate" to initialize it.

    And why did you think that was in any way preferrable over having the user actually defining a buffer of type DataQueue?

    Anyway, I hope you've now understood why that can't work. At the very least you'll have to ensure that buffer has sufficient alignment --- but unfortunately, there's not actually any good way you could make sure of that while letting the user allocate the storage and pass in a byte pointer. Either do your own allocating, or pick a pointer type with maximal alignment ... and good luck figuring out a portable way of determining what that might be.

    Ever wondered what that language about "sufficiently aligned for any data type" in the standard specification of malloc() is about? Well, you've just run into it.

    I want to access data like this:

    Just because you want to do something doesn't make it a good idea; it doesn't even make it possible. You have some reading-up to do on the topic of "serialization".

    This is the #1 train of thought that gets people to believe they should use "packed" all over their in their programs, like you do:
    typedef __packed struct
    And in the vast majority of cases, it's just plain dumb wrong.

  • Since you already have a fixed data type for the elements, you could do:

    enum {
        NUM_ENTRIES = 20,
        NUM_OTHER_ENTRIES = 40,
    };
    Queue my_queue;
    QueueEntry my_entries[NUM_ENTRIES];
    
    Queue my_other_queue;
    QueueEntry my_other_entries[NUM_OTHER_ENTRIES];
    
    void init_queue(Queue *q,QueueEntry entries[],unsigned num_entries) {
        QueueEntries *p;
        unsigned i;
    
        memset(q,0,sizeof(*q));
        q->first = NULL;
        q->last = NULL;
        q->num_entries = num_entries;
        // Chain together all entries in a list of unused.
        q->free = p = entries;
        for (i = 1; i < num_entries; i++,p++) {
            p->next = p+1;
        }
        p->next = NULL;
        ...
    }
    
    void main() {
        init_queue(&my_queue,my_entries,NUM_ENTRIES);
        init_queue(&my_other_queue,my_other_entries,NUM_OTHER_ENTRIES);
        ...
    }
    


    Two statically allocated queues of different length but with correct align.

  • Thanks to Hans-Bernhard Broeker and Per Westermark.
    I got it.