CTRLSTAT = 0xffffffff


I am working with a Cortex-M4 over the Serial Wire Debug (SWD) protocol. I am able to read IDCODE (0x2ba01477), which is correct according to the ARM Cortex-M4 Technical Reference Manual. I then write 0x50000000 to the CTRLSTAT register to set CSYSPWRUPREQ and CDBGPWRUPREQ, which also succeeds without any ACK error. But when I try to read the CTRLSTAT register back, I get 0xffffffff. I don't understand what the problem might be. Could you please suggest what is going wrong?

Thanks & Regards

  • sir,

    Thank you for responding. By the way, I am using an STM32F407VG as the host and another STM32F407VG as the target. I am trying to implement debug functionality so that I can access and control the target from the host. I got started with the blog post "Programming internal SRAM over SWD" on Mark's Space.

    I have configured two GPIO pins as follows:

    /* preparing GPIO pin to SWDIO */

    GPIO_PIN_DATA_t st_gSWDIOPin =









    /* Preparing GPIO pin to clock  */










    As a first step, I send signals from the host to the target and am able to read IDCODE. I am also able to write the value 0x50000000 to the CTRLSTAT register. But when I try to read CTRLSTAT, I get a successful ACK, yet the data I read is 0xffffffff.

    I hope this information is helpful for understanding my problem.

    Thanks & Regards.

  • Unfortunately I do not have a specific answer, but I believe that I may be able to hint you towards solving the problem.

    First rule: Always assume that your debug-code is not 100% correct. It's really annoying to struggle with a problem for days, and finally discover that something was wrong with the debug-code - I've done that countless times.

    For STMicroelectronics's own documentation on the subject, please see RM0090, section 38.3.1.

    First of all, we'll need to make sure your connection is OK.

    When I connect a JTAG-Lock-pick Tiny 2 to my targets, I use the standard JTAG pull-up and pull-down resistors:

    • TCLK/SWDCLK: 10K pull-down
    • TMS/SWDIO: 10K pull-up
    • TDO/SWO: 10K pull-up

    ... Next, do you switch from output to input, when you're reading the CTRL/STAT register ?

    ... Also... The address; are you encoding it correctly, so (only) bits 3 and 4 in the first byte you send are modified ?

    ... If you have a logic-analyzer, it might be a good idea to connect it, in order to monitor the values sent and received on the pins.

    ... If you have a logic-analyzer, try monitoring the data that OpenOCD transfers to the target when it starts up, then compare to your own data.

    I know this sounds ridiculous, but it is a very good idea to write some low-level debug code, which shows you exactly what is going on.

    It will probably be a good idea to make a temporary modification to your transmit routine, the one routine, that sends a single bit:

    It should output either a 0 or a 1, depending on the data it sends.

    Then, after each stream of bits, you output a space.

    Finally, at the end of each transfer, you output a newline character.

    Transmit all the data from above to your debug-terminal via a UART interface, if you have a terminal connected to the board.

    Example of making the code fast (do not use printf in your bit-transmitter code):

    1: Create a static buffer and a static pointer variable outside your transmission function:

    static char sDebugBuffer[128];

    static char *pDebugBuffer = sDebugBuffer;

    2: Before starting each transmission, do the following:

    pDebugBuffer = sDebugBuffer;

    3: Each time you transmit a bit, do the following:

    *pDebugBuffer++ = '0' + bitValue;

    4: Each time you're done transmitting a sequence of bits (for instance 8 bits for a byte, or 3 bits for the response code, or 33 bits for the WDATA), do the following:

    *pDebugBuffer++ = 0x20;

    5: Each time a transfer ends, do the following:

    *pDebugBuffer++ = 0x0a;

    *pDebugBuffer = 0x00;

    pDebugBuffer = sDebugBuffer;

    /* and write the contents of the debug-buffer to your UART; maybe you're using UART_send, maybe write, maybe printf, let's assume printf in this case: */

    printf("%s", sDebugBuffer);

    This enables you to verify that your transmission has the correct length, and it also lets you see exactly which bit values are sent, so you can verify they're what you expect them to be.

    For the response, you can of course also implement a similar mechanism.
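The five steps above can be collected into a few small helpers. This is only a sketch: the function names (trace_begin, trace_bit, trace_group, trace_end) and the bounds check are my own additions and not part of the original suggestion, and the caller is assumed to send the returned string to the UART with whatever routine is available.

```c
#include <stddef.h>

#define TRACE_SIZE 128  /* same size as the buffer in step 1 */

static char sDebugBuffer[TRACE_SIZE];
static char *pDebugBuffer = sDebugBuffer;

/* Append one character, always leaving room for the terminating 0. */
static void trace_put(char c)
{
    if (pDebugBuffer < sDebugBuffer + TRACE_SIZE - 1) {
        *pDebugBuffer++ = c;
    }
}

/* Step 2: call before each transfer starts. */
static void trace_begin(void)
{
    pDebugBuffer = sDebugBuffer;
}

/* Step 3: call for every bit that is clocked out (or in). */
static void trace_bit(unsigned bitValue)
{
    trace_put((char)('0' + (bitValue & 1u)));
}

/* Step 4: call after each group of bits (request, ACK, data + parity). */
static void trace_group(void)
{
    trace_put(' ');
}

/* Step 5: terminate the line and rewind; the caller prints the result. */
static const char *trace_end(void)
{
    trace_put('\n');
    *pDebugBuffer = '\0';
    pDebugBuffer = sDebugBuffer;
    return sDebugBuffer;
}
```

One full transfer (8 request bits, 3 ACK bits, 33 data bits plus separators) fits comfortably in the 128-byte buffer; anything longer is silently truncated by trace_put.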

  • sir,

    Thanks for the quick reply. In your reply I noticed SWO; may I ask whether the SWO pin has any importance in the Serial Wire Debug protocol?

    Thanks & Regards.

  • SWO is not necessary when you're just programming the device.

    As far as I understand, Serial Wire Output is used for debug-information, however, I'm not really sure how this works (I still have to read up on that).

  • Sir,

    Thanks for Reply,

    I am following this sequence when programming:

    1->sending >50 clocks

    2->sending jtag to swd sequence 0xE79E(LSB first)

    3->sending >50 clocks

    4->sending idcode code packet 1|0|1|0|0|1|0|1   =  0xA5

    5->sending 1 clock pulse changing the direction waiting for ack

    6->collecting 3 bits by sending 3 clock pulses

    7->after success ack sending 33 clock pulses 32(data)+1(parity)

    target : 0x2ba01477

    8->sending 1 clock pulse turnaround period

    9->setting CTRLSTAT register CSYSPWRUPREQ and CDBGPWRUPREQ bits by sending packet 1|0|0|1|0|1|0|1  =  0x95

    and data = 0x50000000

    target : (ACK success)

    10->reading CTRLSTAT register by sending packet 1|0|1|1|0|0|0|1   =  0xB1

    11->after success ack sending 33 clock pulses 32(data)+1(parity)

    target : 0xffffffff

    Thanks & Regards.

  • Very good information, indeed.

    9->setting CTRLSTAT register CSYSPWRUPREQ and CDBGPWRUPREQ bits by sending packet 1|0|0|1|0|1|0|1  =  0x95

    To me it looks like you're writing to the SELECT register instead of the CTRL/STAT register here.

    Try sending the following bit-sequence instead... 1|0|0|0|1|1|0|1 = 0x8B...

    10->reading CTRLSTAT register by sending packet 1|0|1|1|0|0|0|1  =  0xB1

    To me it looks like you're requesting the SELECT register instead of the CTRL/STAT register here.

    What happens if you send the following bit-sequence instead... 1|0|1|0|1|0|0|1 = 0xA9 ?

  • Sir,

    Thank you for the reply,

    Sorry, I typed line 4 wrong in my post, but I am sending 0xA5.

    I sent your sequence 0x8B to write to the CTRLSTAT register and got a FAULT response from the target.

    I sent 0xA9 to read from the CTRLSTAT register. I got an ACK FAULT response for both.

    In my previous post, the packet frame I followed was:

    Start bit | APnDP | R/W | A[2:3] | Parity | Stop | Park |

    To write CTRLSTAT:

    Start bit = 1

    APnDP = 0

    R/W     = 0

    In "Table 2.2.1 DP Register information" of Implementation of Serial Wire JTAG flash programming in ARM Cortex M3 Processors | Theesan's log, the address bits are given in 3:2 order; I have taken them in 2:3 order, so

    A[2:3] = 1 | 0

    Parity = 1

    Stop = 0

    Park = 1

    In hex: 0x95

    To read CTRLSTAT:

    Start bit = 1

    APnDP = 0

    R/W     = 1

    In the same "Table 2.2.1 DP Register information" the address bits are given in 3:2 order; I have taken them in 2:3 order, so

    A[2:3] = 1 | 0

    Parity = 0

    Stop = 0

    Park = 1

    In hex: 0xB1
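To make the two hex notations in this thread easier to compare, here is a small sketch that builds the request byte from its fields. The names swd_request and mirror8 are hypothetical. It assumes the usual SWD framing: start bit first, then APnDP, RnW, A[2], A[3], an even-parity bit over those four bits, stop = 0 and park = 1, with the first bit on the wire stored as bit 0 of the byte.

```c
#include <stdint.h>

/* Build an SWD request byte as a value that is shifted out LSB first.
 * apndp: 0 = DP, 1 = AP;  rnw: 0 = write, 1 = read;
 * addr:  register address (0x0, 0x4, 0x8 or 0xC). */
static uint8_t swd_request(unsigned apndp, unsigned rnw, unsigned addr)
{
    unsigned a2 = (addr >> 2) & 1u;
    unsigned a3 = (addr >> 3) & 1u;
    unsigned parity = (apndp ^ rnw ^ a2 ^ a3) & 1u;  /* 1 if an odd number of 1s */

    return (uint8_t)(0x01u                   /* start bit */
                   | ((apndp & 1u) << 1)
                   | ((rnw & 1u) << 2)
                   | (a2 << 3)
                   | (a3 << 4)
                   | (parity << 5)
                   /* bit 6: stop = 0 */
                   | 0x80u);                 /* park bit */
}

/* Reverse the bit order of a byte: gives the "first bit written leftmost"
 * notation used in the post above. */
static uint8_t mirror8(uint8_t v)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (v & 1u));
        v >>= 1;
    }
    return r;
}
```

With this encoding, a read of DP address 0x0 (IDCODE) is 0xA5 in either notation, a write to DP address 0x4 (CTRL/STAT) is 0xA9 LSB first, which mirrors to 0x95, and a read of DP address 0x4 is 0x8D LSB first, which mirrors to 0xB1. Which of the two values to pass to the transmit routine depends on whether that routine shifts out bit 0 or bit 7 first.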

  • It seems you're right. This documentation is absolutely confusing.

    In section 38.6.1 of the RM0090, the first bit is bit 31 and the last bit is bit 0 (no, they're not related to SWD).

    But in section 38.8.2, table 295, it seems that the first bit is bit 0 and the last bit is bit 7.

    That means: if you send a word via SWD, my understanding is that you have to send the least significant bit first.

    Hmm, 0x95 and 0xA9 are the bits 'mirrored', so I guess it seems you're sending the correct data.

    Hmm, it looks like OpenOCD does not agree entirely with Mark's JTAG-to-SWD sequence.

    Taken from OpenOCD's swd.h:

    /**
     * JTAG-to-SWD sequence.
     *
     * The JTAG-to-SWD sequence is at least 50 TCK/SWCLK cycles with TMS/SWDIO
     * high, putting either interface logic into reset state, followed by a
     * specific 16-bit sequence and finally a line reset in case the SWJ-DP was
     * already in SWD mode.
     */
    static const uint8_t swd_seq_jtag_to_swd[] = {
      0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7b, 0x9e,
      0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0f,
    };

    If your device is not in SWD state, then it could perhaps be why things go wrong.
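One way to compare this array with the ">50 clocks, 0xE79E (LSB first), >50 clocks" sequence is to unpack it into individual wire bits. The sketch below assumes each byte is shifted out least significant bit first; the helper names (seq_bit, seq_ones, seq_field) are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

static const uint8_t swd_seq_jtag_to_swd[] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7b, 0x9e,
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0f,
};

/* Wire bit number i, assuming every byte is sent LSB first. */
static unsigned seq_bit(const uint8_t *seq, size_t i)
{
    return (seq[i / 8] >> (i % 8)) & 1u;
}

/* Number of consecutive 1 bits starting at wire bit `start`. */
static size_t seq_ones(const uint8_t *seq, size_t nbits, size_t start)
{
    size_t n = 0;
    while (start + n < nbits && seq_bit(seq, start + n)) {
        n++;
    }
    return n;
}

/* Collect `count` wire bits from `start`; the first bit becomes bit 0. */
static uint32_t seq_field(const uint8_t *seq, size_t start, unsigned count)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < count; i++) {
        v |= (uint32_t)seq_bit(seq, start + i) << i;
    }
    return v;
}
```

Under that assumption the 120 bits unpack to 50 ones, then 16 bits that read as 0xE79E when the first bit is taken as bit 0, then another 50 ones and 4 zeros. So the array appears to describe the same switch sequence, just packed at a 2-bit offset inside the bytes.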

  • Sir,

    Thank you for reply,

    I took the hardware settings from the URL below:

    HardwareDebugConnections - ** Code Red Support Site **

    Following that URL, I connected a host GPIO to the SWCLK pin of the target with a 10K pull-down, and a host GPIO to the SWDIO pin of the target with a 10K pull-up. Is there anything wrong with this hardware connection?

  • I think your connection is probably correct.

    Do you have a logic analyzer ?

    Thank you for the reply,

    No, I don't have a logic analyzer, and I cannot afford to buy one. Is there any alternative software tool to simulate the hardware?

    Thanks & Regards.

  • sir,

    One thing I forgot to mention: when I read the CTRLSTAT register, the ACK response is correct, but the result is 0xffffffff with a "Parity error".
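For reference, the parity check on the data phase can be sketched as below. The function names are hypothetical. SWD uses even parity, so the parity bit is 1 only when the 32 data bits contain an odd number of ones.

```c
#include <stdint.h>

/* Expected parity bit for 32 data bits: 1 if the number of 1 bits is odd. */
static unsigned swd_data_parity(uint32_t data)
{
    data ^= data >> 16;
    data ^= data >> 8;
    data ^= data >> 4;
    data ^= data >> 2;
    data ^= data >> 1;
    return data & 1u;
}

/* Returns 1 when the received parity bit matches the received data. */
static int swd_parity_ok(uint32_t data, unsigned parity_bit)
{
    return swd_data_parity(data) == (parity_bit & 1u);
}
```

For 0xffffffff the expected parity bit is 0. Reading 33 ones in a row therefore always fails the check, and 33 ones is also what a line that is only held high by its pull-up would look like. This is a guess rather than a diagnosis, but it may mean that SWDIO is not being driven by the target (or not being sampled as an input) during the data phase.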

    Thanks & Regards.

  • If I'm lucky, I can set up OpenOCD with my board and snoop on an external JTAG programmer, but it will require some time, before I can get this fully set up, as I don't have all the resources I need at hand.

    I've also had a look in OpenOCD's code, and it looks to me that it's doing the things in the same way (or very close to, at least) that you are doing them.

  • Sir,

    How much delay is required when generating a clock (high, delay, low)?

  • The timing information can be found on the ARM Information Center under SWD timing requirements.

    Each edge should be between 10 ns and 500 us.

    That means ... If you're running at 168MHz: at least 2 clock cycles (one for writing, one for delay), maximum 84000 clock cycles.

    I think you can use something like this, in order to get the fastest possible transfer rate at 168MHz:

    #define SW_NOP      __NOP()
    #define SWCLK_HI    /* replace this comment with a way to set the SWCLK pin high */
    #define SWCLK_LO    /* replace this comment with a way to set the SWCLK pin low */
    #define SWDIO_HI    /* replace this comment with a way to set the SWDIO pin high */
    #define SWDIO_LO    /* replace this comment with a way to set the SWDIO pin low */
    #define SWDIO_IN    /* replace this comment with a way to read the SWDIO pin; the value should be either 1 or 0 */
    #define SWDIO_(b)   if(b){ SWDIO_HI; } else { SWDIO_LO; }
    #define WR(b)       SWDIO_(b) ; SWCLK_LO ; SW_NOP         ; SWCLK_HI
    #define WR1         SWDIO_HI  ; SWCLK_LO ; SW_NOP         ; SWCLK_HI
    #define WR0         SWDIO_LO  ; SWCLK_LO ; SW_NOP         ; SWCLK_HI
    #define RD          SW_NOP    ; SWCLK_LO ; bit = SWDIO_IN ; SWCLK_HI
    #define CLK         SW_NOP    ; SWCLK_LO ; SW_NOP         ; SWCLK_HI

    The CLK macro can be used to repeat a bit; eg. if you know you're outputting 8 zero bits, you can issue WR0; CLK; CLK; CLK; CLK; CLK; CLK; CLK.

    Note: The order of reading and writing must match exactly, and the high and low states on the clock must also be exact.

    (That does not mean that I've written the above code correctly, though).

    The WR1 would take one clock cycle, if it's translated into a STR instruction without loading register values first.

    The WR0 would also take one clock cycle (same case).

    The RD might take two or three clock cycles, because it's often necessary to perform a bitwise AND operation before using a bitwise OR operation to insert the new bit into a register.

    The assembly code would usually look like this:

                        ldr                 r2,[r3]             /* [2] read the pin state */

                        and                 r2,r1,r2,lsr#4      /* [1] move bit 4 to bit 0 and isolate it, so it's now either 0 or 1 */

                        add                 r0,r2,r0,ror#1      /* [1] shift the result right by 1, then insert the new bit */

    Using the above method means that after receiving all the bits, you'd need to rotate the entire result left by the number of bits received - 1.

    Note: on Cortex-M4, the UBFX instruction can also be used for extracting the bit.

    But in some cases, you might be able to read from a port, where the bit value will always be 1 or 0.
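The rotate-and-insert idea from the assembly above can be tried out on a PC before moving it to the target. This is a plain C sketch with made-up names (ror32, rol32, swd_collect); it assumes the sampled pin values are already reduced to 0 or 1 and arrive least significant bit first, and it handles 1 to 32 bits.

```c
#include <stdint.h>

/* Rotate right by n bits (n is taken modulo 32). */
static uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v >> n) | (v << (32u - n)) : v;
}

/* Rotate left by n bits (n is taken modulo 32). */
static uint32_t rol32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* bits[] holds the sampled values in arrival order; count must be 1..32. */
static uint32_t swd_collect(const uint8_t *bits, unsigned count)
{
    uint32_t acc = 0;

    for (unsigned i = 0; i < count; i++) {
        /* same operation as: add r0,r2,r0,ror#1 */
        acc = (bits[i] & 1u) + ror32(acc, 1);
    }
    /* final fix-up: rotate left by (number of bits received - 1) */
    return rol32(acc, count - 1u);
}
```

For example, feeding it the three ACK bits 1, 0, 0 gives the value 1, and feeding it the 32 bits of 0x2ba01477 in LSB-first order gives 0x2ba01477 back.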