App crashes without unused array declaration

I have an application that is running on an ARM9 and compiled with V3.60. It uses RL. I am having a strange problem. We are really close to the maximum amount of memory on our processor, within 2K. During development, I declared an unsigned char array of size 1056 with global scope in one of the libraries that handles data packets that are received.

#include <91x_conf.h>
#include <91x_lib.H>
#include <RTL.h>
#include <string.h>

U8 my_buffer[1056];

I changed my implementation later and no longer needed that array.

Now, here is the strange part.

Task1 cannot handle large packets without that array declaration being present. If I comment the array declaration out or cut the size in half, the first large packet (~1076 bytes) that comes in crashes the entire application, though it works as long as the packets received over Task1's interface are small. Another weird thing is that Task2 is unable to properly receive data if I leave the array declaration in my code, yet it works if I leave the declaration out.

Task1 and Task2 both claim between 2K and 3K of memory blocks on the heap when they start, and those blocks are not reallocated or freed until the board is rebooted.

Any ideas what is going on here?

  • I believe you all are right. Thank you for your quick responses. I am leaning toward the stack overflow idea.

    One branch of the Task1 code that handles those large packets also saves them to a storage medium. I commented out just the part that saves them, and the data transfer itself then causes no crash. The call to save the data resembles this:

    Mem_Write( &rx_pkt[ 9 ], storage_address, data_size );

    Here rx_pkt is the received packet and data_size is the data length extracted from the packet; the algorithm saves the data sequentially to the storage medium. The data_size is always 11 less than the size of the packet, as I control the format on both ends. It SHOULD be consistent, as a packet that fails its checksum on receipt is simply ignored.

    So, something is overflowing there in the function call, I think.

    It goes BOOM on the very first packet received if I comment out the array, but only after it sends a response back to the data sender. The overflow may actually be there because the response is placed on the stack too.

    That gives me a few places to start tracing. Thanks guys.
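
For anyone tracing a similar problem, one cheap thing to rule out first is the length field itself. Below is a minimal sketch of a sanity check that could run before the Mem_Write call. It only encodes what the post states (payload starts at offset 9, data_size is 11 less than the packet size). The function name payload_len_ok and the RX_BUF_SIZE value are hypothetical and not taken from the original code.

```c
#include <assert.h>
#include <stddef.h>

#define PKT_HEADER_LEN 9u     /* payload starts at rx_pkt[9], per the call above */
#define PKT_OVERHEAD   11u    /* packet size minus data_size, per the post */
#define RX_BUF_SIZE    1100u  /* hypothetical receive-buffer capacity */

/* Returns 1 if data_size is consistent with the number of bytes received
   and the packet fits the receive buffer, 0 otherwise. */
int payload_len_ok(size_t pkt_len, size_t data_size)
{
    if (pkt_len < PKT_OVERHEAD || pkt_len > RX_BUF_SIZE)
        return 0;                       /* too short or larger than the buffer */
    if (data_size != pkt_len - PKT_OVERHEAD)
        return 0;                       /* length field disagrees with the packet */

    /* With the two checks above, the payload cannot run past the packet. */
    assert(PKT_HEADER_LEN + data_size <= pkt_len);
    return 1;
}
```

If a check like this never fails on the target, the length field is probably not the culprit, and the stack is the more likely suspect.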

  • Well, I think I've figured it out. I'll explain so that maybe this helps someone.

    The next line after the Mem_Write call was:
    os_dly_wait(30);

    I commented that out too. If I uncommented only the os_dly_wait line, the crashes occurred again, whereas uncommenting Mem_Write and leaving os_dly_wait commented out did not crash. What I think was happening is that the os_dly_wait call, combined with a debug message that I was sending on the same interface and the response back to the sender, was piling up too much at once, causing a stack overflow. The debug message and the response to the sender both use static arrays, but each has to acquire an OS_MUT with an infinite timeout before it can be sent. Removing the delay let the debug message and the response go out immediately, which prevented the overflow.

    So it was a combination of stack overflow and task switching that caused a crash, followed by a watchdog reset.

    Thanks all.
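
A common way to confirm a suspected stack overflow, rather than infer it from timing changes, is to "paint" the task stack with a known pattern and later count how much of the pattern survives. The sketch below shows the idea on a plain array. It assumes the task stack is a buffer you supply yourself and that the stack is descending (highest addresses used first), as is usual on ARM. The names stack_paint, stack_unused_words and STACK_FILL are hypothetical and not part of RL-ARM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u  /* arbitrary pattern, unlikely to appear by chance */

/* Fill the stack buffer with the pattern before the task is started. */
void stack_paint(uint32_t *stack, size_t words)
{
    size_t i;

    assert(stack != NULL);
    for (i = 0; i < words; i++)
        stack[i] = STACK_FILL;
}

/* Count the words that still hold the pattern, scanning up from the lowest
   address. With a descending stack these are the words never used. */
size_t stack_unused_words(const uint32_t *stack, size_t words)
{
    size_t n = 0;

    while (n < words && stack[n] == STACK_FILL)
        n++;
    return n;
}
```

Calling stack_unused_words periodically, or from a debugger after the large packet has been handled, gives a high-water mark. A result at or near zero means the task has used its whole stack and may well have written past it.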

  • It's hard to tell without seeing the code, but it sounds as if the root cause is still there! Are you comfortable with a buffer overrun solved by a different timing...?

  • Not exactly, no. I plan to discuss ways to correct the root cause with our Manager of Software Development to make sure it is fully resolved.