Assembler Branch instruction?

b .+2
B is the branch instruction, but I do not understand what .+2 means or how many cycles this instruction takes.
I was looking for a book, but could not find one that explains this well. Can you recommend something?

  • Dot '.' means the current address, similar to $ in x86 (jmp $+2)

    "B ." would be an infinite loop.

    "B .+2" would be a near branch on a 16-bit Thumb ISA, to the next instruction, probably for alignment purposes.

    How many cycles? On what architecture?

    The Cortex-M3/4 processors have a DWT_CYCCNT register in many implementations for the exact purpose of tuning and benchmarking code as written.

    Consider also RAM vs FLASH, wait states, caching, and prefetch paths.

    Start with the ARM TRM for the core/architecture in question
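
    To make the ".+2" arithmetic concrete, here is a rough sketch that computes the opcode for a 16-bit Thumb "B" from an offset relative to '.'. It assumes the narrow unconditional encoding (11100 followed by an 11-bit halfword displacement) and that the PC reads as '.' + 4 in Thumb state. The helper name thumb_b_encode and the "return 0 on error" convention are made up for illustration.

    ```c
    #include <stdint.h>

    /* Encode a 16-bit Thumb unconditional branch: 11100 imm11.
     * dot_offset is the target's byte offset from '.', the address of the
     * branch itself. In Thumb state the PC reads as '.' + 4, so the stored
     * displacement is (dot_offset - 4), counted in halfwords.
     * Returns 0 for an odd or out-of-range offset (illustrative convention,
     * 0 is never a valid encoding of this branch). */
    uint16_t thumb_b_encode(int32_t dot_offset)
    {
        int32_t disp = dot_offset - 4;   /* displacement relative to PC */

        /* imm11:'0' is a signed 12-bit value: -2048 .. +2046, even only */
        if ((disp & 1) != 0 || disp < -2048 || disp > 2046)
            return 0;

        /* keep the low 11 bits of the halfword displacement */
        return (uint16_t)(0xE000u | (((uint32_t)disp >> 1) & 0x7FFu));
    }
    ```

    For example, thumb_b_encode(2) gives 0xE7FF for "B .+2", and thumb_b_encode(0) gives 0xE7FE for the "B ." infinite loop.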

  • It does not work; we are still dealing with the same problem :-)
    What is my problem?
    For example, and for clarification:
    We have a simple C function that contains only inline asm, marked with the volatile keyword to disable optimization of the asm code.
    It contains only three nop instructions and one b .+2.
    The function only produces a delay.

    
    #include "stm32f1xx_hal.h"
    
    void test(void)
    {
    __asm volatile
      (
        "  nop \n\t"
        "  nop \n\t"
        "  nop \n\t"
        "  b .+2       \n\t"
      );
    }
    
    

    Where is the problem? The compiler returns Internal fault: [0x03ab7c:5060422]

    *** Using Compiler 'V5.06 update 4 (build 422)', folder: 'C:\Keil\ARM\ARMCC\Bin'
    compiling test.c...
    Internal fault: [0x03ab7c:5060422]
    Please contact your supplier.
    "test.c" - 0 Error(s), 0 Warning(s).
    

  • I came up with this simple snippet:

    void __asm test(void)
    {
        nop
        nop
        nop
        b .+2
    }
    

    And also this:

    void test(void)
    {
      __asm
      {
        nop
        nop
        nop
        b Label
      Label:
      };
    }
    

    And both produced code as expected.

    Yes, it's absolutely amazing what can be achieved when you read the documentation.

  • Honestly, the branch instruction is used only to consume time and get the timing right, and it could be replaced with a NOP instruction.
    On the other hand, b .+2 is a correct instruction and gcc assembles it, while ARMCC fails with Internal fault: [0x03ab7c:5060422].

    Last but not least, the volatile keyword should disable optimization and force a literal translation. With gcc this works correctly, but with armcc the __asm volatile block is optimized by the compiler, and

    __ASM volatile
            (
    "LSL i, #1  \n\t"
    "nop           \n\t"
    "nop        \n\t"
    "b label    \n\t"
    "label:     \n\t"
    );
    

    is translated as

    0x0800198A 4620      MOV      r0,r4
    0x0800198C EA4F0040  LSL      r0,r0,#1
    0x08001990 4604      MOV      r4,r0
    0x08001992 BF00      NOP
    0x08001994 BF00      NOP
    0x08001996 BF00      NOP
    


    I think it's a compiler bug; Apple would call it a feature :-)

  • I think it's a compiler bug; Apple would call it a feature :-)

    A compiler error is maybe not the best error message available, but if you choose not to give the compiler source code it understands, you really should expect problems.

    I've given you not one, but two examples of how to enter your code. I see the branch appear perfectly well, but if you have problems with the branch you could simply replace it with multiple nop instructions.

        test
            0x000000d4:    bf00        ..      NOP
            0x000000d6:    bf00        ..      NOP
            0x000000d8:    bf00        ..      NOP
            0x000000da:    e7ff        ..      B
    

    Before progressing, you should maybe think about what you're really trying to achieve. I would think working code and avoiding compiler errors would be quite a high priority, no?

  • Still sounds like some half-assed tuned loop for the output of GPIO "at some specific rate" where the model is to drive the processor into saturation. Something better addressed with a periodic timer and DMA, and a buffer large enough to decimate interrupt loading, so you can fill the buffer with new data while keeping it from being unduly massive.

    Is this something that can be solved with a CPLD rather than grinding a 72 MHz Cortex-M3?

    The compiler is not the place to write fine tuned assembler.

  • Agreed. Delays like this could (occasionally) be justified in the past on something like an 8051 with highly predictable instruction timing, but a Cortex, nah.