How can I get an efficient loop?

I have written the following code to perform a copy procedure between RAM and FLASH Memory:

unsigned char i;
i = 0x00;
do {
    i--;
    *(&FLASHADDRESS + j + i) = DataBuffer[i];
} while (i != 0x00);

Is there a better way to generate efficient code (like DJNZ in assembler)? :)
Thanks.

  • Yes. Reorganize your code to use:

    do {
         /* ... */
    } while (--i != 0x00);

    If you are concerned about optimization at that level, a much better one would be to notice that 'j' is invariant throughout the loop's runtime, so the address calculation that uses it could be hoisted out of the loop.
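
    Combining both suggestions might look like the sketch below. It is a host-side illustration only: `sim_flash` is a hypothetical RAM array standing in for the memory-mapped flash, `copy_sector` and `sector_offset` (the role played by 'j') are made-up names, and a 256-byte sector is assumed. The `unsigned char` counter starts at 0 and wraps to 255 on the first decrement, so the body runs exactly 256 times.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 256u

/* Hypothetical stand-in for the memory-mapped flash: plain RAM, so the
   sketch can run on a host. Real flash also needs a program command. */
unsigned char sim_flash[4 * SECTOR_SIZE];

/* Copy one 256-byte sector. sector_offset plays the role of 'j'. */
void copy_sector(size_t sector_offset, const unsigned char *data)
{
    unsigned char *dst;
    unsigned char i = 0;

    assert(sector_offset + SECTOR_SIZE <= sizeof sim_flash);

    /* Loop-invariant part of the address, computed once. */
    dst = sim_flash + sector_offset;

    /* Visits i = 0, 255, 254, ..., 1: 256 iterations in total. */
    do {
        dst[i] = data[i];
    } while (--i != 0);
}
```

    Whether a given compiler turns the `--i != 0` test into a single DJNZ still has to be checked in the generated listing.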

  • Is there a better way to generate efficient code (like DJNZ in assembler)?

    Generally, you shouldn't be trying to write your own memcpy() --- there's a nicely working one in the library, so use it.

    Second, if you're really all that worried about the performance, don't waste your time staring at C code --- inspect the generated assembly, or write the piece in assembly right away.

  • Flash doesn't generally let you simply write to an address. Typically, you have to go through a little routine making different patterns appear on the address and data bus to signal a "program" command, then follow that with the actual address and value to write, for the memory actually to change. memcpy() won't do the correct little dance, so it (or the loop shown) will have problems on most flash parts.

    Some parts support a bulk program command where you can supply many address/value pairs after doing the program command, but I'd still expect to see some more prep code before the actual write.

    As always, only your data sheet knows for sure.
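
    As a rough illustration of the "little routine", the sketch below issues the 55/AA/A0 command bytes discussed later in this thread and then loads one 256-byte sector. The command addresses 0x5555 and 0x2AAA are an assumption (they are the ones commonly used with this style of command, but must be checked against the data sheet), and `flash_write_fn`, `program_sector`, and `log_write` are hypothetical names. The bus write is passed in as a function so the sequence can be checked on a host.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 256u

/* Hypothetical bus-write hook; on real hardware this would be a write
   through a volatile pointer into the flash address range. */
typedef void (*flash_write_fn)(unsigned long addr, unsigned char value);

/* Issue the program command, then load one sector of data.
   ASSUMPTION: command addresses 0x5555/0x2AAA; verify in the data sheet. */
void program_sector(flash_write_fn write_byte, unsigned long sector_addr,
                    const unsigned char *data)
{
    unsigned int i;

    /* The low 8 address bits select the byte within the sector. */
    assert((sector_addr & 0xFFul) == 0);

    write_byte(0x5555ul, 0xAA);
    write_byte(0x2AAAul, 0x55);
    write_byte(0x5555ul, 0xA0);

    for (i = 0; i < SECTOR_SIZE; i++)
        write_byte(sector_addr + i, data[i]);

    /* The caller must still wait for the internal programming cycle. */
}

/* Host-side fake bus that records every write for inspection. */
struct bus_write { unsigned long addr; unsigned char value; };
struct bus_write write_log[SECTOR_SIZE + 3];
size_t write_log_len = 0;

void log_write(unsigned long addr, unsigned char value)
{
    if (write_log_len < sizeof write_log / sizeof write_log[0]) {
        write_log[write_log_len].addr = addr;
        write_log[write_log_len].value = value;
        write_log_len++;
    }
}
```

    Calling `program_sector(log_write, 0x300ul, buf)` on a host records three command writes followed by 256 data writes. On the target, the hook would be replaced by the real bus access.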

  • Special thanks to Dan Henry, Hans-Bernhard Broeker and Drew Davis!

    Hi, Drew Davis. I have successfully written to (programmed) my flash memory (Atmel's AT29C040A) before. The code above is one part of my program. "FLASHADDRESS" is the AT29C040A's base address, and "j" is the sector (page) address. I'm optimizing my code now.

    Hi, Dan Henry. Your code works well :)

    if you're really all that worried about the performance, don't waste your time staring at C code --- inspect the generated assembly, or write the piece in assembly right away.

    Thanks to Hans-Bernhard Broeker for this suggestion:)

  • I suppose my other thought is that there's little point in optimizing a flash write routine for speed. If you have to wait 150 usec between bytes anyway (plus 10 ms after the last byte), the speed of the loop itself won't matter much, unless you use a really slow CPU clock.

    Optimizing for space is another matter, of course.

  • Thanks to Drew Davis.

    If you have to wait 150 usec between bytes anyway (plus 10 ms after the last byte)

    ATMEL's data sheet says:
    The AT29C040A is reprogrammed on a sector basis. ... After the first data byte has been loaded into the device, successive bytes are entered in the same manner. Each new byte to be programmed must have its high to low transition on WE (or CE) within 150 us of the low to high transition of WE (or CE) of the preceding byte. If a high to low transition is not detected within 150 us of the last low to high transition, the load period will end and the internal programming period will start. A8 to A18 specify the sector address. The sector address must be valid during each high to low transition of WE (or CE). A0 to A7 specify the byte address within the sector. The bytes may be loaded in any order; sequential loading is not required.

    During the load period of that device, the data hold time is shorter than 1 usec, so the 150 usec figure is a maximum gap between bytes, not a required wait. In addition, I use the DATA POLLING feature, so I do not need to wait a fixed 10 msec at all.
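
    A data-polling wait could be sketched as follows. It assumes the usual behaviour of this feature, namely that reading the last byte loaded returns bit 7 inverted while the internal programming cycle is running and the true data once it has finished. That should be confirmed in the data sheet. `flash_read_fn`, `wait_data_polling`, and the `sim_*` names are hypothetical, and the simulated device exists only so the logic can be run on a host.

```c
#include <assert.h>

/* Hypothetical bus-read hook; on real hardware this would be a read
   through a volatile pointer into the flash address range. */
typedef unsigned char (*flash_read_fn)(unsigned long addr);

/* Poll the last byte loaded until bit 7 matches what was written.
   Returns 1 when programming has finished, 0 if max_polls ran out. */
int wait_data_polling(flash_read_fn read_byte, unsigned long last_addr,
                      unsigned char last_data, unsigned long max_polls)
{
    assert(read_byte != 0);

    while (max_polls-- != 0) {
        if (((read_byte(last_addr) ^ last_data) & 0x80u) == 0)
            return 1;
    }
    return 0;
}

/* Host-side simulated device: reports "busy" (bit 7 inverted) for the
   next sim_busy_reads reads, then returns the stored value. */
unsigned int sim_busy_reads = 3;
unsigned char sim_value = 0x42;

unsigned char sim_read(unsigned long addr)
{
    (void)addr;
    if (sim_busy_reads != 0) {
        sim_busy_reads--;
        return (unsigned char)(sim_value ^ 0x80u);
    }
    return sim_value;
}
```

    The `max_polls` limit is a design choice rather than part of the feature: it keeps the loop from hanging if the part never reports completion.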

  • The DJNZ instruction is used wherever possible in do...while, while, and for loops, so no code reorganization is required.

    line level    source
    
       1          void main (void)
       2          {
       3   1      volatile unsigned char i;
       4   1
       5   1      while (--i != 0x00)
       6   1        {
       7   2        }
       8   1
       9   1      for (i=100; i; --i)
      10   1        {
      11   2        }
      12   1
      13   1      do
      14   1        {
      15   2        }
      16   1      while (--i);
      17   1      }
    
                 ; FUNCTION main (BEGIN)
                                               ; SOURCE LINE # 1
                                               ; SOURCE LINE # 2
    0000         ?C0001:
                                               ; SOURCE LINE # 5
    0000 D500FD      R     DJNZ    i,?C0001
                                               ; SOURCE LINE # 6
                                               ; SOURCE LINE # 7
    0003         ?C0002:
                                               ; SOURCE LINE # 9
    0003 750064      R     MOV     i,#064H
    0006         ?C0003:
                                               ; SOURCE LINE # 10
                                               ; SOURCE LINE # 11
    0006 D500FD      R     DJNZ    i,?C0003
                                               ; SOURCE LINE # 14
                                               ; SOURCE LINE # 15
    0009         ?C0006:
                                               ; SOURCE LINE # 16
    0009 D500FD      R     DJNZ    i,?C0006
                                               ; SOURCE LINE # 17
    000C 22                RET
                 ; FUNCTION main (END)
    

    Jon

  • Interesting. The interface for this part is a little unusual. Atmel generally sells flash with what I think of as "AMD-style" commands. It's not a standard, unfortunately, but a lot of manufacturers use something similar.

    The 55/AA/A0 sequence is usually the command to program a single word. For this part, they require you to program an entire 256-byte sector.

    It's a little more common to see a different sequence for a burst programming mode. See the 55/AA/80 55/AA/A0 command sequence for the Atmel AT49SV322, for example.

    I've found the single-word program command to be consistent on different parts from different manufacturers. The more advanced features, including burst-mode programming, tend to vary a bit more.

    Most flash devices have sector sizes larger than 256 bytes, as well. Those that do support small sector sizes seem more likely to have an internal RAM buffer to support this sort of burst mode programming.

    Using the advanced features often means getting trapped using a particular device, or a lot more software work to make a smart driver that can cope with many slightly different interfaces.

    At any rate, the burst mode methods do usually have a timeout between successive writes, so that they know when the processor is done with the sequence. This is the 150 us timeout after each write. That series of writes would be followed by the 10 ms delay (Twc) to allow the write actually to complete.

    The toggle bit or data polling algorithm lets you know when the flash is actually done updating. (You could just wait the worst-case 10 ms time instead.) The write isn't really complete until that algorithm is finished.
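
    A toggle-bit wait might be sketched like this. It assumes the common convention that bit 6 toggles on each successive read while the internal write is in progress and stops toggling when it is done. The bit position and exact behaviour should be taken from the data sheet. `flash_read_fn`, `wait_toggle_bit`, and the `toggle_*` names are hypothetical, and the simulated device is there only so the loop can be exercised on a host.

```c
#include <assert.h>

/* Hypothetical bus-read hook; on real hardware this would be a read
   through a volatile pointer into the flash address range. */
typedef unsigned char (*flash_read_fn)(unsigned long addr);

/* Read repeatedly until two successive reads agree in bit 6.
   Returns 1 when the bit stopped toggling, 0 if max_polls ran out. */
int wait_toggle_bit(flash_read_fn read_byte, unsigned long addr,
                    unsigned long max_polls)
{
    unsigned char prev, cur;

    assert(read_byte != 0);

    prev = read_byte(addr);
    while (max_polls-- != 0) {
        cur = read_byte(addr);
        if (((prev ^ cur) & 0x40u) == 0)
            return 1;
        prev = cur;
    }
    return 0;
}

/* Host-side simulated device: bit 6 flips on each of the next
   toggle_busy_reads reads, then the stored value is returned. */
unsigned int toggle_busy_reads = 4;
unsigned char toggle_state = 0;
unsigned char toggle_value = 0x42;

unsigned char toggle_read(unsigned long addr)
{
    (void)addr;
    if (toggle_busy_reads != 0) {
        toggle_busy_reads--;
        toggle_state = (unsigned char)(toggle_state ^ 0x40u);
        return toggle_state;
    }
    return toggle_value;
}
```

    If the poll limit runs out, a caller could fall back to waiting the worst-case write time before treating the write as failed.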