Sir,
I am working with a Cortex-M4 over the Serial Wire Debug (SWD) protocol. I am able to read IDCODE (0x2BA01477), which is correct according to the ARM Cortex-M4 Technical Reference Manual. I then write 0x50000000 to the CTRLSTAT register to set CSYSPWRUPREQ and CDBGPWRUPREQ, which also succeeds without any ACK error. However, when I try to read the CTRLSTAT register back, I get 0xFFFFFFFF. I don't understand what the problem might be. Please suggest what could be wrong.
Thanks & Regards
Thank you for the reply,
Sorry, I made a typo on line 4 of my earlier post; I am actually sending 0xA5.
I sent your sequence 0x8B to write to the CTRLSTAT register and got a FAULT response from the target.
I sent 0xA9 to read from the CTRLSTAT register; I got an ACK FAULT response for both.
In my previous post,
the packet frame I followed was: Start | APnDP | R/W | A[2:3] | Parity | Stop | Park
To write CTRLSTAT:
Start = 1
APnDP = 0
R/W = 0
In "Implementation of Serial Wire JTAG flash programming in ARM Cortex M3 Processors | Theesan's log", "Table 2.2.1 DP Register information" gives the address bits in 3:2 order; I have taken them in 2:3 order, so
A[2:3] = 1 | 0
Parity = 1
Stop = 0
Park = 1
In hex: 0x95
To read CTRLSTAT (the other fields are the same as for the write):
R/W = 1
A[2:3] = 1 | 0 (again taken in 2:3 order from the same Table 2.2.1)
Parity = 0
In hex: 0xB1
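For reference, the frame layout above can be sketched in C. This is only a minimal sketch: swd_request() is a hypothetical helper name, and it assumes the byte is shifted out bit 0 first, with the parity bit being the even parity of APnDP, R/W, A2 and A3.

```c
#include <stdint.h>

/* Build an SWD request byte, laid out so that bit 0 goes on the wire first.
 * apndp: 0 = DP, 1 = AP;  rnw: 0 = write, 1 = read;
 * addr:  register address 0x0, 0x4, 0x8 or 0xC (only A[3:2] is used).
 * swd_request() is a hypothetical helper name, not part of any library. */
static uint8_t swd_request(unsigned apndp, unsigned rnw, unsigned addr)
{
    unsigned a2 = (addr >> 2) & 1u;
    unsigned a3 = (addr >> 3) & 1u;
    unsigned parity;

    apndp &= 1u;
    rnw &= 1u;
    /* Parity is 1 when an odd number of the four fields are 1. */
    parity = apndp ^ rnw ^ a2 ^ a3;

    return (uint8_t)((1u << 0)          /* Start = 1 */
                   | (apndp << 1)
                   | (rnw << 2)
                   | (a2 << 3)
                   | (a3 << 4)
                   | (parity << 5)
                   | (0u << 6)          /* Stop = 0 */
                   | (1u << 7));        /* Park = 1 */
}
```

With this layout, swd_request(0, 1, 0x0) gives 0xA5 (DP read at address 0x0, i.e. IDCODE), swd_request(0, 0, 0x4) gives 0xA9 (DP write at address 0x4) and swd_request(0, 1, 0x4) gives 0x8D (DP read at address 0x4). The last two are the bit-mirrored forms of 0x95 and 0xB1.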
It seems you're right. This documentation is absolutely confusing.
In section 38.6.1 of the RM0090, the first bit is bit 31 and the last bit is bit 0 (no, they're not related to SWD).
But in section 38.8.2, table 295, it seems that the first bit is bit 0 and the last bit is bit 7.
That means that if you send a word via SWD, as I understand it, you have to send the least significant bit first.
Hmm, 0x95 and 0xA9 are bit-mirrored versions of each other, so I guess you're sending the correct data.
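A quick way to check the mirroring is a small helper like the one below. It is just a sketch; reverse8() is a made-up name.

```c
#include <stdint.h>

/* Reverse the bit order of one byte (bit 0 swaps with bit 7, and so on). */
static uint8_t reverse8(uint8_t b)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (b & 1u));  /* append the lowest bit of b */
        b >>= 1;
    }
    return r;
}
```

So a request worked out on paper in wire order (first bit written leftmost, e.g. 0x95) turns into the value to load into a shift-LSB-first routine (0xA9).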
Hmm, it looks like OpenOCD does not agree entirely with Mark's JTAG-to-SWD sequence.
Taken from OpenOCD's swd.h:
/**
 * JTAG-to-SWD sequence.
 *
 * The JTAG-to-SWD sequence is at least 50 TCK/SWCLK cycles with TMS/SWDIO
 * high, putting either interface logic into reset state, followed by a
 * specific 16-bit sequence and finally a line reset in case the SWJ-DP was
 * already in SWD mode.
 */
static const uint8_t swd_seq_jtag_to_swd[] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7b, 0x9e,
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0f,
};
If your device is not in the SWD state, that could perhaps be why things go wrong.
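To see what that byte table looks like on the wire, here is a small sketch that walks the bytes bit by bit. It assumes each byte is sent with bit 0 first; the helper names seq_bit(), leading_ones() and seq_field() are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Same bytes as the OpenOCD table quoted above. */
static const uint8_t jtag_to_swd[] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7b, 0x9e,
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0f,
};

/* Return bit number n of the stream, taking bit 0 of each byte first. */
static unsigned seq_bit(const uint8_t *seq, size_t n)
{
    return (seq[n / 8] >> (n % 8)) & 1u;
}

/* Count how many 1 bits the stream starts with. */
static size_t leading_ones(const uint8_t *seq, size_t nbits)
{
    size_t n = 0;
    while (n < nbits && seq_bit(seq, n))
        n++;
    return n;
}

/* Collect `count` bits starting at stream position `first`;
 * the earliest bit on the wire ends up in bit 0 of the result. */
static uint32_t seq_field(const uint8_t *seq, size_t first, unsigned count)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < count; i++)
        v |= (uint32_t)seq_bit(seq, first + i) << i;
    return v;
}
```

Read this way, the table starts with 50 ones, and the 16 bits that follow come out as 0xE79E, which is the switch value usually quoted for JTAG-to-SWD when sent LSB first.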
Thank you for the reply,
I took the hardware settings from the URL below:
HardwareDebugConnections - ** Code Red Support Site **
Following that page, I connected a host GPIO to the SWCLK pin of the target with a 10k pull-down, and a host GPIO to the SWDIO pin of the target with a 10k pull-up. Is there anything wrong with this hardware connection?
I think your connection is probably correct.
Do you have a logic analyzer ?
Thank you for the reply,
No, I don't have a logic analyzer, and I can't afford to buy one. Is there any alternative software tool to simulate the hardware?
Thanks & Regards.
Sir,
One thing I forgot to say: when I read the CTRLSTAT register, its ACK response is correct, and the result is 0xFFFFFFFF with "Parity error".
If I'm lucky, I can set up OpenOCD with my board and snoop on an external JTAG programmer, but it will require some time, before I can get this fully set up, as I don't have all the resources I need at hand.
I've also had a look in OpenOCD's code, and it looks to me that it's doing the things in the same way (or very close to, at least) that you are doing them.
How much delay is required between the clock edges when generating the clock (high, delay, low)?
The timing information can be found on the ARM Information Center under SWD timing requirements.
Each edge should be between 10 ns and 500 us.
That means ... If you're running at 168MHz: at least 2 clock cycles (one for writing, one for delay), maximum 84000 clock cycles.
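The arithmetic behind those two numbers, as a small sketch (the function names are made up; the 10 ns and 500 us limits are simply the ones quoted above):

```c
#include <stdint.h>

/* CPU cycles needed to cover at least `ns` nanoseconds (rounded up). */
static uint32_t cycles_at_least(uint32_t cpu_hz, uint32_t ns)
{
    return (uint32_t)(((uint64_t)cpu_hz * ns + 999999999u) / 1000000000u);
}

/* CPU cycles that fit inside `ns` nanoseconds (rounded down). */
static uint32_t cycles_at_most(uint32_t cpu_hz, uint32_t ns)
{
    return (uint32_t)(((uint64_t)cpu_hz * ns) / 1000000000u);
}
```

At 168 MHz, 10 ns is 1.68 cycles, which rounds up to 2, and 500 us is 84000 cycles.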
I think you can use something like this, in order to get the fastest possible transfer rate at 168MHz:
#define SW_NOP __NOP()
#define SWCLK_HI /* replace this comment with a way to set the SWCLK pin high */
#define SWCLK_LO /* replace this comment with a way to set the SWCLK pin low */
#define SWDIO_HI /* replace this comment with a way to set the SWDIO pin high */
#define SWDIO_LO /* replace this comment with a way to set the SWDIO pin low */
#define SWDIO_IN /* replace this comment with a way to read the SWDIO pin; the value should be either 1 or 0 */
#define SWDIO_(b) do { if (b) { SWDIO_HI; } else { SWDIO_LO; } } while (0)
#define WR(b) SWDIO_(b) ; SWCLK_LO ; SW_NOP ; SWCLK_HI
#define WR1 SWDIO_HI ; SWCLK_LO ; SW_NOP ; SWCLK_HI
#define WR0 SWDIO_LO ; SWCLK_LO ; SW_NOP ; SWCLK_HI
/* RD stores the sampled pin value in a variable named 'bit', which the caller must declare */
#define RD SW_NOP ; SWCLK_LO ; bit = SWDIO_IN ; SWCLK_HI
#define CLK SW_NOP ; SWCLK_LO ; SW_NOP ; SWCLK_HI
The CLK macro can be used to repeat a bit; eg. if you know you're outputting 8 zero bits, you can issue WR0; CLK; CLK; CLK; CLK; CLK; CLK; CLK.
Note: The order of reading and writing must match exactly, and the high and low states on the clock must also be exact.
(That does not mean that I've written the above code correctly, though).
The WR1 would take one clock cycle, if it's translated into a STR instruction without loading register values first.
The WR0 would also take one clock cycle (same case).
The RD might take two or three clock cycles, because it's often necessary to perform a bitwise AND operation before using a bitwise OR operation to insert the new bit into a register.
The assembly code would usually look like this:
ldr r2,[r3]          /* [2] read the pin state */
and r2,r1,r2,lsr#4   /* [1] move bit 4 to bit 0 and isolate it (r1 is assumed to hold 1), so it's now either 0 or 1 */
add r0,r2,r0,ror#1   /* [1] rotate the result right by 1, then insert the new bit */
Using the above method means that after receiving all the bits, you'd need to rotate the entire result left by (the number of bits received - 1).
Note: on Cortex-M4, the UBFX instruction can also be used for extracting the bit.
-But in some cases, you might be able to read from a port, where the bit value will always be 1 or 0.
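The rotate trick is easier to convince yourself of in C. The following is only a sketch of the same idea, with made-up function names; swd_receive_sim() just feeds the bits of a known word in, LSB first, in place of a real pin read.

```c
#include <stdint.h>

static uint32_t ror32(uint32_t x, unsigned k)
{
    k &= 31u;
    return (x >> k) | (x << ((32u - k) & 31u));
}

static uint32_t rol32(uint32_t x, unsigned k)
{
    k &= 31u;
    return (x << k) | (x >> ((32u - k) & 31u));
}

/* One step of the receive loop: same effect as "add r0,r2,r0,ror#1",
 * where bit is the freshly sampled pin value (0 or 1). */
static uint32_t swd_push_bit(uint32_t acc, unsigned bit)
{
    return ror32(acc, 1) + (bit & 1u);
}

/* After nbits steps (1..32), rotate left by nbits - 1 so that the first
 * bit received ends up in bit 0. */
static uint32_t swd_finish(uint32_t acc, unsigned nbits)
{
    return rol32(acc, nbits - 1u);
}

/* Simulated reception: the "wire" delivers the low nbits of wire_word, LSB first. */
static uint32_t swd_receive_sim(uint32_t wire_word, unsigned nbits)
{
    uint32_t acc = 0;
    for (unsigned i = 0; i < nbits; i++)
        acc = swd_push_bit(acc, (wire_word >> i) & 1u);
    return swd_finish(acc, nbits);
}
```

Feeding in 0x2BA01477 as 32 bits gives 0x2BA01477 back, so the final rotate by (number of bits - 1) does put everything in the right place.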
Thank you for the code snippet, sir,
I followed your sequence, but the story remains the same.
IDCODE read: success,
ABORT write: success,
CTRLSTAT = 0xFFFFFFFF with parity error.
I get the same response even after changing the pull-up from 10k to 100k.
... Uhm wait ... You use STM32F4 Discovery, right ?
If so, did you remember to disconnect the on-board SWD programmer ?
-If not, it might interfere with your results.
I tried grabbing the data using my analyzer, and I can't seem to map them properly to the documentation and to the experience of, for instance, Mark.
These are the first 400 bits of the transfer:
11111111111111111111111111111111111111111111111111111011110011110011111111111111
11111111111111111111111111111111111111111100101001011001110111000101000000001011
10101000011011000110000000000000000000000000000000100111100101011000100000000000
00000000000000000101001011000110000000000000000000000000000001111001100011011000
10000111100000000000000000000000001111100110000000000000000000000000000000000001
Note: When using HLA_SWD on STM32F4, I get 53 leading ones and 53 trailing ones, but when I use JTAG-lock-pick Tiny 2 with LPC1751, I get 50 leading ones and 50 trailing ones.
What looks odd to me, is that it looks like there is no turnaround!!
-But also according to the documentation found here, there should be turnaround: http://www.arm.com/files/pdf/Low_Pin-Count_Debug_Interfaces_for_Multi-device_Systems.pdf
I get IDCODE 0x2BA01477 for both STM32F4 (which is a Cortex-M4 device) and LPC1751 (which is a Cortex-M3 device).
Here's a similar session with JTAG-lock-pick Tiny 2 and LPC1751:
11111111111111111111111111111111111111111111111111011110011110011111111111111111
11111111111111111111111111111111111100101001011001110111000101000000001011101010
00111000000110011011110000000000000000000000000000000000001011000110010000010000
00000000000000000111101110010101100110000010000000000000000000000000011011000110
00000001000000000000000000000000011110010101100110000000000000000000000000000101
In both cases, it seems there's no turnaround clock. So I believe that right after sending the last bit, before changing the SWCLK to LOW, you could try and change the SWDIO direction to input, then issue the SWCLK_LO and have 0 turn-around clock cycles.
According to all the documents I found, it seems that data are sampled on the rising edge of SWCLK, thus changing the direction should be done right after you've sampled the data.
So far, I have been unable to do a test with the JTAG-lock-pick Tiny 2 connected to the Discovery board, but if I get it set up, I'll post the output here as well.
Thank you for the reply, sir,
I now read the CTRLSTAT register correctly. The problem was with the parity bit calculation, which I have corrected. I am now able to halt and reset the core from the host successfully, and I am trying to write data into the flash memory region. Could you suggest an algorithm or flowchart for writing a hex file to flash memory?
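For anyone hitting the same parity problem: a minimal sketch of the data-phase parity, assuming the usual SWD rule that the parity bit is 1 when the 32-bit word contains an odd number of 1 bits. The function names are made up.

```c
#include <stdint.h>

/* Parity bit for a 32-bit SWD data word:
 * 1 if the word has an odd number of 1 bits, otherwise 0. */
static unsigned swd_data_parity(uint32_t word)
{
    word ^= word >> 16;   /* fold the word onto itself ... */
    word ^= word >> 8;
    word ^= word >> 4;
    word ^= word >> 2;
    word ^= word >> 1;    /* ... until bit 0 holds the XOR of all 32 bits */
    return word & 1u;
}

/* Check a received word against the parity bit that followed it. */
static int swd_parity_ok(uint32_t word, unsigned parity_bit)
{
    return swd_data_parity(word) == (parity_bit & 1u);
}
```

Note that 0x50000000 has two bits set, so its parity bit is 0. A line that simply reads back as all ones (0xFFFFFFFF followed by a parity bit of 1) fails this check, which would match the "Parity error" symptom earlier in the thread.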
Thank you sir,
Sir, I am able to write data to core registers such as r0, r1, r2, ..., but when I try to read it back, I get 0. My writing and reading procedures are as follows:
Writing:
select bank 0
write the data to DCRDR
write the register address to DCRSR
wait until the S_REGRDY bit in the DHCSR register is set
Reading:
read the data from DCRDR
My approach:
halt and reset the core
write to the register
read from the register (the one I wrote to)
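One thing that may be worth checking in the reading step: on ARMv7-M a core register read is itself a transfer that has to be requested through DCRSR (with the REGWnR bit clear) before DCRDR holds the value. Below is a hedged sketch of both directions. mem_write32() and mem_read32() are hypothetical stand-ins for your own SWD memory access routines; here they talk to a tiny simulated core so the example is self-contained.

```c
#include <stdint.h>

#define DHCSR     0xE000EDF0u   /* Debug Halting Control and Status Register */
#define DCRSR     0xE000EDF4u   /* Debug Core Register Selector Register */
#define DCRDR     0xE000EDF8u   /* Debug Core Register Data Register */
#define S_REGRDY  (1u << 16)    /* DHCSR: register transfer complete */
#define REGWNR    (1u << 16)    /* DCRSR: 1 = write core register, 0 = read */

/* Simulated target state (an assumption for this sketch only). */
static uint32_t sim_core[16], sim_dcrdr;

static void mem_write32(uint32_t addr, uint32_t value)
{
    if (addr == DCRDR) {
        sim_dcrdr = value;
    } else if (addr == DCRSR) {
        uint32_t sel = value & 0x0Fu;          /* r0..r15 only in this simulation */
        if (value & REGWNR)
            sim_core[sel] = sim_dcrdr;         /* DCRDR -> core register */
        else
            sim_dcrdr = sim_core[sel];         /* core register -> DCRDR */
    }
}

static uint32_t mem_read32(uint32_t addr)
{
    if (addr == DHCSR) return S_REGRDY;        /* simulated transfer is instant */
    if (addr == DCRDR) return sim_dcrdr;
    return 0;
}

/* Poll DHCSR until S_REGRDY is set; returns 0 on success, -1 on timeout. */
static int wait_regrdy(void)
{
    for (int tries = 0; tries < 1000; tries++)
        if (mem_read32(DHCSR) & S_REGRDY)
            return 0;
    return -1;
}

static int core_reg_write(unsigned regsel, uint32_t value)
{
    mem_write32(DCRDR, value);                      /* data first */
    mem_write32(DCRSR, REGWNR | (regsel & 0x7Fu));  /* then select, as a write */
    return wait_regrdy();
}

static int core_reg_read(unsigned regsel, uint32_t *value)
{
    mem_write32(DCRSR, regsel & 0x7Fu);             /* select, REGWnR = 0 */
    if (wait_regrdy() != 0)
        return -1;
    *value = mem_read32(DCRDR);                     /* now DCRDR is valid */
    return 0;
}
```

The core has to be halted for these transfers to work.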
It looks like you've come a long way.
I think, that at this time you may know much more about all this than I do.
-So I think we'll need to call for a SWD-expert in this case.
After writing to flash, when I try to read from the flash address I get 0xFFFFFFFF. I don't know why that is happening.
Thanks for the reply, sir,
In the post above you wrote "0x40023c04=0x45670123. 0x4002c304=0xcdef89ab"; here the two addresses are different. Section 3.6.1 of the STM32F4 reference manual says that both keys, 0x45670123 and 0xCDEF89AB, should be sent to the FLASH_KEYR (0x40023C04) register. Please clarify this. Another thing is that my flash unlock was unsuccessful.
My procedure was:
halt
reset
read the FLASH_CR register; if the LOCK bit is set, then
send 0x45670123 and 0xCDEF89AB to the FLASH_KEYR register
delay: for loop -> 1000
then read the FLASH_CR register again.
Thanks and Regards.
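For comparison, the unlock procedure described above could be sketched like this. It assumes the STM32F4 layout in which FLASH_KEYR is at 0x40023C04 and FLASH_CR is at 0x40023C10 with LOCK in bit 31. mem_write32() and mem_read32() are hypothetical stand-ins for SWD memory accesses, backed here by a small simulated flash controller so the example is self-contained.

```c
#include <stdint.h>

#define FLASH_KEYR     0x40023C04u
#define FLASH_CR       0x40023C10u
#define FLASH_CR_LOCK  (1u << 31)
#define FLASH_KEY1     0x45670123u
#define FLASH_KEY2     0xCDEF89ABu

/* Simulated flash controller state (an assumption for this sketch only). */
static uint32_t sim_cr = FLASH_CR_LOCK;
static int sim_key_step;

static void mem_write32(uint32_t addr, uint32_t value)
{
    if (addr != FLASH_KEYR)
        return;                            /* writes elsewhere do not unlock */
    if (sim_key_step == 0 && value == FLASH_KEY1) {
        sim_key_step = 1;
    } else if (sim_key_step == 1 && value == FLASH_KEY2) {
        sim_cr &= ~FLASH_CR_LOCK;          /* both keys seen, in order */
        sim_key_step = 0;
    } else {
        sim_key_step = 0;
    }
}

static uint32_t mem_read32(uint32_t addr)
{
    return (addr == FLASH_CR) ? sim_cr : 0;
}

/* Returns 0 if FLASH_CR is unlocked afterwards, -1 if LOCK is still set. */
static int flash_unlock(void)
{
    if (!(mem_read32(FLASH_CR) & FLASH_CR_LOCK))
        return 0;                          /* already unlocked */
    mem_write32(FLASH_KEYR, FLASH_KEY1);   /* both keys go to the same register */
    mem_write32(FLASH_KEYR, FLASH_KEY2);
    return (mem_read32(FLASH_CR) & FLASH_CR_LOCK) ? -1 : 0;
}
```

Both keys are written to the same address, one after the other, and the LOCK bit is checked again afterwards. On real hardware the two writes must each be a full 32-bit access.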