Problem in copying functions to RAM on ARM Cortex-M

I'm (again) facing a very strange problem in my project for an ARM Cortex-M4 (STM32F301K8). The project requires some of the functions to be executed from RAM (it's actually a bootloader with encryption and a self-update option, but that doesn't matter here). In my startup code I have a loop that "initializes" blocks of data by copying them from flash to a given address in RAM. The most common use of this code is to copy the .data section, and it works flawlessly because it is brain-dead simple.

In my linker script I have something like this:

          /* sub-section: data_array */

          . = ALIGN(4);
          __data_array_start = .;
          PROVIDE(__data_array_start = __data_array_start);

          LONG(LOADADDR(.data)); LONG(ADDR(.data)); LONG(ADDR(.data) + SIZEOF(.data));
          LONG(LOADADDR(.ram_text)); LONG(ADDR(.ram_text)); LONG(ADDR(.ram_text) + SIZEOF(.ram_text));

          . = ALIGN(4);
          __data_array_end = .;
          PROVIDE(__data_array_end = __data_array_end);

          /* end of sub-section: data_array */

Then my startup code contains this loop:

     // Initialize sections from data_array (including .data)
     ldr     r4, =__data_array_start
     ldr     r5, =__data_array_end

1:   cmp     r4, r5                  // outer loop - addresses from data_array
     ittte   lo
     ldrlo   r1, [r4], #4            // start of source address
     ldrlo   r2, [r4], #4            // start of destination address
     ldrlo   r3, [r4], #4            // end of destination address
     bhs     3f                      // table exhausted - leave the loop

2:   cmp     r2, r3                  // inner loop - section initialization
     ittt    lo
     ldrlo   r0, [r1], #4            // load one word from flash
     strlo   r0, [r2], #4            // store it to RAM
     blo     2b

     b       1b                      // go back to start

3:                                   // target of "bhs 3f" (rest of the startup code not shown)
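
For readers who find the assembly hard to follow, the same table-driven copy can be sketched in C. This is only an illustration of the logic, not the poster's actual code: the names `struct copy_entry` and `copy_sections` are made up here, and the struct matches the three LONG() values per section only on a target where pointers are 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One table entry, mirroring the three LONG() values emitted per section.
 * (Hypothetical type name; layout matches only with 32-bit pointers.) */
struct copy_entry {
    const uint32_t *src;     /* load address in flash (LOADADDR)      */
    uint32_t       *dst;     /* run address in RAM (ADDR)             */
    uint32_t       *dst_end; /* one past the last destination word    */
};

/* Copy every section described by the entries in [first, last),
 * one 32-bit word at a time, like the outer and inner assembly loops. */
static void copy_sections(const struct copy_entry *first,
                          const struct copy_entry *last)
{
    assert(first != NULL && last != NULL);
    for (const struct copy_entry *e = first; e < last; ++e) {
        const uint32_t *src = e->src;
        for (uint32_t *dst = e->dst; dst < e->dst_end; ++dst)
            *dst = *src++;
    }
}
```

On the target this would presumably be called with the linker symbols, e.g. after declaring `extern const struct copy_entry __data_array_start[], __data_array_end[];`, provided the function itself does not rely on initialized globals.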

Now the problem I'm facing is that _ONE_ single word in RAM is not stored correctly. It is very strange: when RAM holds 0x00000000 and the register (r0 in my case) holds 0x12345678, after the write RAM holds 0x00005678. Somehow only "half" of the data is written and the other half in RAM is not modified. This happens in the middle of the block, so it's not a problem of a wrong range; all the data before and after the problematic spot is copied correctly. The failure occurs at the same address each time (currently 0x20000148), but from time to time that address changes. If I move the block to a different address, the problem just moves to a different spot within the block. If I take another chip, the problem persists but at a different address.
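
One way to pin the symptom down without single-stepping is to compare the flash source against the RAM copy after the loop has run and classify which half of the first bad word differs. The following is a minimal sketch under that assumption; `classify_word` and `first_mismatch` are hypothetical helper names, and on real hardware the pointers would probably need to be `volatile`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Which half of a 32-bit word differs between expected and actual value. */
enum word_damage { WORD_OK, LOW_HALF_BAD, HIGH_HALF_BAD, BOTH_HALVES_BAD };

static enum word_damage classify_word(uint32_t expected, uint32_t actual)
{
    uint32_t diff = expected ^ actual;          /* set bits mark differences */
    int low_bad  = (diff & 0x0000FFFFu) != 0;
    int high_bad = (diff & 0xFFFF0000u) != 0;

    if (low_bad && high_bad) return BOTH_HALVES_BAD;
    if (high_bad)            return HIGH_HALF_BAD;
    if (low_bad)             return LOW_HALF_BAD;
    return WORD_OK;
}

/* Return the index of the first word that differs, or count if all match. */
static size_t first_mismatch(const uint32_t *src, const uint32_t *dst,
                             size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i)
        if (src[i] != dst[i])
            return i;
    return count;
}
```

With the values from the post, `classify_word(0x12345678u, 0x00005678u)` yields `HIGH_HALF_BAD`: the low half-word arrived and the high half-word did not.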

As I wrote above, this is the second time I'm having this issue. I saw it previously on an STM32F103, and nothing helped on the first day: copying by words, bytes, half-words, double-words, or with memcpy(). I went to sleep without solving the issue, and from the next morning on everything worked flawlessly with absolutely no fix. Identical code that didn't work one day worked perfectly fine the next.

Someone suggested that this may have something to do with the Flash Patch and Breakpoint unit in the core. When I check it with the debugger, I see that the unit is indeed enabled (0x261 in the FP_CTRL register), but all the comparators are disabled (0 in FP_COMPx).

Has anyone faced this issue and found a reliable solution? Thanks in advance for any hints!
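
The FP_CTRL value quoted above can be decoded with a few shifts and masks. The sketch below assumes the ARMv7-M FP_CTRL layout (ENABLE in bit 0, NUM_CODE in bits [7:4] plus [14:12], NUM_LIT in bits [11:8]) and that bit 0 of each FP_COMPn register is its enable bit; the struct and function names are invented for illustration.

```c
#include <stdint.h>

/* Decoded fields of the Flash Patch Control Register (FP_CTRL). */
struct fp_ctrl_fields {
    unsigned enable;   /* bit 0: unit enabled                      */
    unsigned num_code; /* number of instruction-address comparators */
    unsigned num_lit;  /* number of literal-address comparators     */
};

static struct fp_ctrl_fields decode_fp_ctrl(uint32_t value)
{
    struct fp_ctrl_fields f;
    f.enable   = value & 1u;
    /* NUM_CODE is split: low 4 bits in [7:4], high 3 bits in [14:12]. */
    f.num_code = ((value >> 4) & 0xFu) | (((value >> 12) & 0x7u) << 4);
    f.num_lit  = (value >> 8) & 0xFu;
    return f;
}

/* A comparator only takes part in matching when its own bit 0 is set. */
static int fp_comp_enabled(uint32_t comp_value)
{
    return (int)(comp_value & 1u);
}
```

Under these assumptions 0x261 decodes to an enabled unit with 6 code comparators and 2 literal comparators, and a comparator register reading 0 is inactive, which matches the observation that no patching should be taking place.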

  • I had a similar problem with my LPC1342, so I think it might not be unheard of. Similar but not identical; I think it was a problem with flashing the chip. A few words of data, at random locations, went haywire.

    I remember that if I ran the microcontroller at a low speed, the problem went away, but as soon as I ran it at full speed, it became erratic (even if I had flash-programmed it at a low speed).

    Looking at the code and comparing it with the symptoms suggests that it's the *read* that goes wrong, not the write.

    For example, it sounds like the source pointer might somehow 'jump' back or forward by 2.

    This could be caused by running the microcontroller at some high (overclocked) speed by accident.

    So first thing: try using OpenOCD and issue a few mdw commands to dump the RCC registers (something like 'mdw 0x40021000 40' will probably do fine; on STM32F1/F3 parts the RCC block starts at 0x40021000 and the flash interface registers at 0x40022000), and then use the Reference Manual to find out what speed the MCU is actually running at. This is a much better approach than reading code, because you can look at the code over and over and never see the error.

    I think the first thing you might need to do is to check that the chip gets the power it needs.

    If it's a Discovery-board, then it probably does already, but if it's your own design, it's important to remember that something could have gone wrong (also from the PCB manufacturer's side).

    What I'm going to suggest is of course trivial (and probably a little annoying).

    Check that each of your 100nF VDD capacitors is soldered correctly.

    Check that there's a stable voltage on those pins.

    Now a bit worse: Make sure your external clock crystal's capacitors are correct.

    This may require some advanced equipment; if you have the equipment, then it's cool.

    If you don't, then the best bet will be to verify that there are no open connections between the XTAL pins and the crystal's terminals, and that the capacitors are soldered correctly.

    Also the value of the capacitors would most likely be in the range 6pF to 10pF.

    If they're for instance 22pF, I'm pretty sure you'll need to re-calculate the values.

    But instead of checking the crystal and capacitors, it might be a lot quicker to switch to the internal oscillator, run at a low frequency and see if the problem persists.

    Please let me know about your findings.
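
The advice above to read the clock configuration back from the chip can be turned into a small decoder for a dumped RCC_CFGR value. This is a rough sketch that assumes the STM32F1/F3-style layout (SWS in bits [3:2], PLLMUL in bits [21:18] with 0 meaning x2 and both 14 and 15 meaning x16); the field positions should be checked against the Reference Manual for the exact part, and all names here are hypothetical.

```c
#include <stdint.h>

/* System clock source as reported by the SWS status field. */
enum sysclk_source {
    SYSCLK_HSI = 0, SYSCLK_HSE = 1, SYSCLK_PLL = 2, SYSCLK_UNKNOWN = 3
};

static enum sysclk_source cfgr_sysclk_source(uint32_t cfgr)
{
    return (enum sysclk_source)((cfgr >> 2) & 0x3u);   /* SWS[1:0] */
}

/* PLLMUL encoding assumed: 0 -> x2, 1 -> x3, ..., 14 and 15 -> x16. */
static unsigned cfgr_pll_multiplier(uint32_t cfgr)
{
    unsigned field = (cfgr >> 18) & 0xFu;
    return field >= 14u ? 16u : field + 2u;
}

/* Estimate SYSCLK in Hz. pll_in_hz is the frequency at the PLL input,
 * i.e. after any HSI/2 or HSE prescaler, which the caller must work out. */
static uint32_t sysclk_hz(uint32_t cfgr, uint32_t hsi_hz, uint32_t hse_hz,
                          uint32_t pll_in_hz)
{
    switch (cfgr_sysclk_source(cfgr)) {
    case SYSCLK_HSI: return hsi_hz;
    case SYSCLK_HSE: return hse_hz;
    case SYSCLK_PLL: return pll_in_hz * cfgr_pll_multiplier(cfgr);
    default:         return 0u;   /* reserved encoding */
    }
}
```

If the computed frequency exceeds the rated maximum for the part, or is higher than the configured flash wait states allow, corrupted reads from flash become a plausible explanation.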
