Hi, I currently encounter problems with malloc. I use microlib and configured the heap to be 32 MBytes (0x02000000 bytes). Under some special circumstances (depending on the image data we get from a CMOS sensor), malloc fails somewhere in my image analysis when trying to allocate e.g. 0x1000 bytes. I tracked the total amount of allocated memory and found out that only 0x19000 bytes had been allocated up to the failing call of malloc. After each call to the image analysis functions, all allocated memory is freed, so the failure should not be caused by fragmentation.
How is it possible that malloc fails? Could it be that - because of a programming error - overwriting some of the data malloc uses to keep track of the free memory causes such problems?
Where can I get information on how microlib implements malloc?
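To give an idea of what I mean by "tracking": a simplified sketch of such a wrapper could look like the following (the names my_malloc, my_free and g_allocated are made up for this example; my real code differs, and alignment details are omitted):

#include <stdlib.h>

static size_t g_allocated;          /* running total of live allocations */

void *my_malloc(size_t size)
{
    /* allocate a little extra so the block size can be remembered */
    size_t *p = malloc(size + sizeof(size_t));
    if (p == NULL)
        return NULL;                /* this is where I see the failure */
    *p = size;
    g_allocated += size;
    return p + 1;
}

void my_free(void *ptr)
{
    if (ptr != NULL) {
        size_t *p = (size_t *)ptr - 1;
        g_allocated -= *p;
        free(p);
    }
}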
Hello Stefan,
It is hard to tell what is wrong (as usual; I just found a new way to kill RTX...!). But why do you use the heap in the first place? Why not use a static buffer mapped to the same region instead?
As mentioned, in most cases where dynamic memory is considered there are alternatives that don't make use of a heap.
Do you have more than one thread that performs allocations? Could there be a problem with a resource lock either blocking malloc, or failing to protect malloc or free, resulting in corrupted data structures?
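For example, if the image analysis mainly needs scratch memory, a minimal sketch of the static-buffer idea could look like this (the buffer name and size are placeholders only):

#include <stdint.h>

#define WORK_AREA_SIZE  (0x02000000u)   /* the same 32 MB you gave to the heap */

/* place this in the same external RAM region, e.g. via the scatter file */
static uint8_t work_area[WORK_AREA_SIZE];

void analyse_image(const uint8_t *image, uint32_t length)
{
    /* carve the scratch memory out of work_area instead of calling
       malloc/free for every intermediate result */
    uint8_t *scratch = work_area;
    (void)image; (void)length; (void)scratch;
}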
Tamir,
I am very curious to hear the new way that you are killing RTX. I too am having a problem:
I just upgraded everything to 4.0 (RL-ARM & RVMDK), and now my software interrupt functions are not leaving supervisor mode upon returning (i.e. I call an SWI function and expect to be back in user mode when it returns, but it is actually still in supervisor mode). This did not happen in 3.80. I am still looking into it, and if I can't figure out something stupid that I have done, I will be creating a formal post.
-Eric
P.S. Sorry to take this thread off topic...
Hello Eric,
Thanks for the information. The latest release of our software dies when polled via the USB CDC connection. The differences between the last two releases are minor: I need to test that tomorrow (i.e. I will update this thread, of course), but I suspect that the cause is the replacement of polling-based code with RTX event-driven code, which causes my program counter to end up executing "instructions" from the RTX buffer 'mp_stk' in internal RAM (and that happens at arbitrary moments)! I have not upgraded to 4.0 yet - but I guess the RTX timing problem is solved there. Do you always experience the failure you described, or does it occur at arbitrary moments? I think I will first solidify my suspicions before upgrading - ah, I wish I had a Cortex-M3 or better, plus a boss willing to pay 16000 dollars for an ARM profiler...!
Just for everyone's information: I have seen the ARM Profiler in action during a seminar and I have to tell you, it is an absolutely stunning piece of hardware/software. Buy it, if you have the money...
Ah, good luck with your USB woes!
I posted a thread with my SWI issues here: http://www.keil.com/forum/docs/thread15695.asp#msg79346
I'd love to hear your thoughts on this.
Eric,
I can confirm that replacing the RTX-driven event handling with the previous polling-based code solved the above problem of the PC ending up in internal RAM. Later today I will report this issue together with the differences between the implicated files. I really did not do anything out of the ordinary - and this is not a wrong RTX API issue, as far as I can tell.
I meant: ..."and this is not a wrong RTX API usage issue..."
Thanks for the update, Tamir. This is concerning to hear -- I hope they can reproduce your problem and fix it!
I have found the problem. I had a static variable which was not reinitialized on successive calls, so it was incremented with every call, and since it was used for indexing an array, I wrote to memory where I should not write...
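In essence, the bug pattern was something like the following (a simplified sketch with made-up names, not my actual code):

#define NUM_RESULTS 16

static int results[NUM_RESULTS];

void store_result(int value)
{
    static int index;           /* BUG: never reset between analysis runs */

    results[index++] = value;   /* eventually writes past the end of results[],
                                   corrupting whatever happens to follow in
                                   memory - e.g. the heap bookkeeping data */
}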
But now I encounter another problem: when calling a function with 7 parameters, the last 2 of them are different once the function has been entered.
unsigned char nameOfFunction ( const Type_t1 *p_para1, const int para2, const float para3, const float para4, const int para5, int *para6, int *para7 )
In the calling function, para5 is 20 and para6 and para7 point to memory in RAM (somewhere above 0x20000000). Inside the function, para6 and para7 get the value 20 (0x14), just like para5, and the function therefore fails.
How can this happen? I use RTX, and the code is compiled with optimization level 0.
It isn't clear what you mean.
Are the parameters (pointers) "para6" and "para7" getting incorrect values, i.e. pointing at the wrong locations?
Or are "para6" and "para7" pointing to the correct locations, but the memory locations have been changed?
We would really need to see both the code that makes use of "para6" and "para7" and the code that calls the function (after having made sure that the last two parameters do point to valid memory and that the memory does contain the expected values) to be able to help out.
The memory they are pointing to is changed. And there is no other code executed than the code to call the function.
I just made a test: I put all the parameters into a struct, and the behavior is quite curious.
typedef struct {
    Type_t1 *p_para1;
    int      para2;
    float    para3;
    float    para4;
    int      para5;
    int     *para6;
    int     *para7;
} function_params_t;

/* inside the calling function */
function_params_t params;
params.p_para1 = p_para1;
params.para2   = para2;
params.para3   = para3;
params.para4   = para4;
params.para5   = para5;
params.para6   = para6;
params.para7   = para7;
When this code is executed, the address where params is located (&params) changes! What is happening?
If the variable "params" is a global variable, it should always have a fixed address.
If the variable "params" is an auto variable, the address will depend on the current stack depth when reaching the function that contains the variable.
But the variable should not move if you continue deeper in your call tree.
How do you deduce that the variable moves? Looking at a named variable in the debugger? Looking at the contents of an absolute address in the debugger? Some other method? Note that auto variables are often problematic for a debugger when you continue to call new functions, since the current stack frame gets replaced with the new stack frame for the called function. But looking at a memory dump for an absolute address should still show the contents of the "params" struct, unless "params" happens to be located in a position where it may either be overwritten by a DMA transfer or be damaged by a stack overflow.
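To illustrate the point about auto variables and stack depth, here is a tiny stand-alone example (not related to your actual code): each invocation gets its own stack frame, so the "same" auto variable has a different address at each depth, but a given instance does not move while calls continue deeper.

#include <stdio.h>

static void show_address(int depth)
{
    int local;                          /* auto variable on the stack */
    printf("depth %d: &local = %p\n", depth, (void *)&local);
    if (depth > 0)
        show_address(depth - 1);        /* deeper call -> new frame, so the
                                           next 'local' lives at a different
                                           address than this one */
}

int main(void)
{
    show_address(3);
    return 0;
}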
I think the stack gets corrupted. When I make the variable global, this problem disappears - and others come up...
Something else to consider is the implications of having an assembler-written ISR that fails to save one or more registers. Depending on which registers get trashed in the ISR, large parts of the main application can continue to work.
An x86 processor has quite few registers, so a corrupted register quickly kills a program. The ARM has a lot of registers, so in some situations large blocks of code never depend on the one register that got trashed, and the damage only shows up much later.
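As an illustration (not necessarily how your project is set up): a handler written in C with the armcc __irq keyword lets the compiler save and restore every register the handler clobbers, while a hand-written assembler handler has to do that explicitly - forgetting a single register there gives exactly this kind of delayed, hard-to-reproduce corruption. A minimal sketch, with made-up names:

/* hypothetical timer interrupt handler, written in C */
volatile unsigned int tick_count;

void timer_isr (void) __irq
{
    /* the compiler saves/restores the registers this handler uses and
       generates the correct interrupt return sequence */
    tick_count++;
    /* acknowledge the interrupt in the peripheral here (device specific) */
}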