Hello
I'm working with the AT91RM9200 from Atmel and I have some problems with data abort. The stack size is big enough. I tested it by making the user stack size zero and see which errors now occured. Everything was overwritten by other things. At the moment my stack size is 0x00000200. The error which occured is that the ptr to the usart base address is overwritten either if the usart.o is stored in the internal ram or the external ram. I use the watch window to see if the ptr would change his value - but nothing happens. I only know that the value is changed because I set a dummy within the retarget function for printf() - (thanks for the hint Per).
The whole programm only includes the ethernet init part, the usart and retarget function in separate c-files. So I have one main file with only the main routine where I init all other functions by calling the right function. In the while loop (main.c) I only poll if a flag is set by receiving a new frame by the ethernet. This works with the data abort problems.
These problems don't appear always at the same part of code. Is my acceptance right, that if I would have a stack size problem, the error would occurred always at the same part of code?
When I test the whole programm several times, I have one or two passes where no error occured. Very often the error happened bevor I call the init function for the ethernet part. So I don't think there would be a error during the execution of the code. I think the error could / must be very global. If I don't include the ethernet part, then I got no data abort error. But I'm not able to find the / or these problems.
the headerfile of the ethernet part:
#define AT91C_EMAC_TDLIST_BASE 0x21000000 /* */ #define ETH_PACKET_SIZE 1536 /* greater than 1518 or 1522 (VLAN) bytes */ #define NB_ETH_RX_PACKETS 10 /* number of RX buffer */ #define EMAC_RXBUF_ADD_WRAP 0x02 /* WRAP Bit at the end of the list descriptor */ #define MII_STS_REG 0x01 #define MII_STS2_REG 0x11 #define RxPacket (AT91C_EMAC_TDLIST_BASE + (8 * NB_ETH_RX_PACKETS)) typedef struct { unsigned int RxBufAddr; unsigned int RxBufStatus; } *EMAC_pRX_descriptor; void Phy_Init(void);
and the c-file - maybe someone of you will find the problem.
char localMACAddr[6] = {0x00, 0x05, 0xD2, 0xFA, 0xB0, 0x01}; EMAC_pRX_descriptor p_rxBD = (EMAC_pRX_descriptor)AT91C_EMAC_TDLIST_BASE; void ethernet_interrupt(void) __irq { unsigned int status; AT91PS_EMAC pEmac = AT91C_BASE_EMAC; // clear receive status register status = pEmac->EMAC_ISR; if (status & AT91C_EMAC_RCOM) // Receive complete Emac_Receive=1; if(status & AT91C_EMAC_ROVR) overrun=1; AT91C_BASE_AIC->AIC_EOICR = 0; // End of ISR } unsigned short AT91F_MII_ReadPhy(AT91PS_EMAC pEmac, unsigned char addr) { unsigned int value = 0x60020000 | (addr << 18); pEmac->EMAC_MAN = value; // Wait until IDLE Bit in Network Status Register is cleared while (!(pEmac->EMAC_SR & AT91C_EMAC_IDLE)); return (pEmac->EMAC_MAN & 0x0000ffff); } int MII_GetLinkSpeed(AT91PS_EMAC pEmac) { unsigned short stat1, stat2; stat1 = AT91F_MII_ReadPhy(pEmac, MII_STS_REG); stat1 = AT91F_MII_ReadPhy(pEmac, MII_STS_REG); stat2 = AT91F_MII_ReadPhy(pEmac, MII_STS2_REG); //set MII for 100BaseTX and Full Duplex if ((stat1 & 0x4000) && (stat2 & 0x8000) ) pEmac->EMAC_CFG |= (AT91C_EMAC_SPD| AT91C_EMAC_FD); //set MII for 10BaseT and Full Duplex else if ((stat1 & 0x1000) && (stat2 & 0x2000)) pEmac->EMAC_CFG = (pEmac->EMAC_CFG & AT91C_EMAC_SPD) | AT91C_EMAC_FD; //set MII for 100BaseTX and Half Duplex else if ((stat1 & 0x2000) && (stat2 & 0x4000)) pEmac->EMAC_CFG = (pEmac->EMAC_CFG | AT91C_EMAC_SPD ) & ~AT91C_EMAC_FD; //set MII for 10BaseT and Half Duplex else if ((stat1 & 0x0800) && (stat2 & 0x1000)) pEmac->EMAC_CFG = pEmac->EMAC_CFG & ~AT91C_EMAC_SPD & ~AT91C_EMAC_FD; return 0; } int MDIO_StartupPhy(AT91PS_EMAC pEmac) { int ret; //enable Management MDIO Port to read/write into the registers of the PHY pEmac->EMAC_CTL |= AT91C_EMAC_MPE; ret = MII_GetLinkSpeed(pEmac); ret = ret; //disable mdo pEmac->EMAC_CTL &= ~AT91C_EMAC_MPE; return 0; } void Phy_Init(void) { unsigned int i=0, status=0; char *pRxPacket =(char *)RxPacket; AT91PS_EMAC pEmac = AT91C_BASE_EMAC; for(i=0; i< NB_ETH_RX_PACKETS; i++) { p_rxBD[i].RxBufAddr = (unsigned int)pRxPacket; p_rxBD[i].RxBufStatus = 0; pRxPacket +=ETH_PACKET_SIZE; } //set wrap bit at the end of the list descriptor p_rxBD[NB_ETH_RX_PACKETS - 1].RxBufAddr |= EMAC_RXBUF_ADD_WRAP; //enable pmc AT91F_EMAC_CfgPMC(); //enable MII ethernet pins peripheral AT91F_EMAC_CfgPIO(); status = MDIO_StartupPhy(pEmac); status = status; // program the hardware MAC Address // the sequence write EMAC_SA1L and write EMAC_SA1H must be respected pEmac->EMAC_SA1L = ((int)localMACAddr[2] << 24) | ((int)localMACAddr[3] << 16) | ((int)localMACAddr[4] << 8) | localMACAddr[5]; pEmac->EMAC_SA1H = ((int)localMACAddr[0] << 8) | localMACAddr[1]; //receive buffer queue pointer pEmac->EMAC_RBQP = (unsigned int) p_rxBD; // clear status register pEmac->EMAC_CTL = 0; // RMII enable pEmac->EMAC_CFG = (pEmac->EMAC_CFG & ~(AT91C_EMAC_CLK)) | AT91C_EMAC_CLK_HCLK_32 | AT91C_EMAC_CAF | AT91C_EMAC_RMII |AT91C_EMAC_BIG; //clear receive status register (overrun, received frames e.g) pEmac->EMAC_RSR &= ~(AT91C_EMAC_OVR | AT91C_EMAC_REC | AT91C_EMAC_BNA); // enable MAC interrupts (kein AT91C_EMAC_DONE) pEmac->EMAC_IER |= (AT91C_EMAC_RCOM | AT91C_EMAC_RBNA | AT91C_EMAC_OVR | AT91C_EMAC_TUND | AT91C_EMAC_RTRY | AT91C_EMAC_TCOM | AT91C_EMAC_ROVR | AT91C_EMAC_HRESP | AT91C_EMAC_LINK); //AIC interrupts AT91F_AIC_ConfigureIt(AT91C_BASE_AIC, AT91C_ID_EMAC, AT91C_AIC_PRIOR_HIGHEST - 3, AT91C_AIC_SRCTYPE_EXT_HIGH_LEVEL, (void (*) (void))ethernet_interrupt); AT91F_AIC_EnableIt( AT91C_BASE_AIC, AT91C_ID_EMAC); //enable transmit and receive pEmac->EMAC_CTL |= (AT91C_EMAC_TE | AT91C_EMAC_RE); }
I install the startup in the external flash memory
*.o (RESET, +First) *(InRoot$$Sections) .ANY (+RO)
the RW and ZI data from the ethernet part is stored in the external RAM and all other data are stored in the internal RAM
.ANY(+RW +ZI)
It would be very nice, if you could help me with my big problems..
Johannes
I made a big step - I have erased the whole code in the phy_init.c file until one global variable and one function where I changed the value of this global variable.
From the main.c routine I called this function and get the same error. So the code must be right, the stack size is big enough, the only thing which is different is that this c-file will be stored in the external sdram.
If I take a look into the map file everything looks good to me.
Execution Region SDRAM (Base: 0x20000000, Size: 0x00000004, Max: 0x03ffffff, ABSOLUTE) Base Addr Size Type Attr Idx E Section Name Object 0x20000000 0x00000004 Data RW 88 .data phy_init.o
What could be the problem - hardware problem with the sdram? or the wrong frequency or 'Number of Wait states?
Which values do I have to check to solve this problem?
First, you could check the abort status. See 16.3.4. in the chip documentation (that's on page 124 in the copy I have). It will tell you what address the MCU was trying to access when the abort occurred.
How can I do that, because if a data abort occured, then I'm not able to do anything...
And after I restart the processor both registers are 0x00000000.
ok my ASR register is: 0x00010202 and my AASR register is 0x10000b69
when I look into the map file then I found:
local symbols:
_printf_core 0x100008bd Thumb Code 1132 printf8.o(i._printf_core) i._printf_post_padding 0x10000d28 Section 38 printf8.o(i._printf_post_padding)
or global symobls
__scatterload_zeroinit 0x100008a4 ARM Code 24 handlers.o(i.__scatterload_zeroinit) Region$$Table$$Base 0x10000d80 Number 0 anon$$obj.o(Region$$Table)
the ASR register: - misaligned address abort status - code fetch - the last aborted access was due to the ARM920T
that are all information. So the data abort comes from the printf() function - but why?
most errors occured within the stack - below the stack pointer and sometimes I have errors above the stack pointer (but most of the time in the internal RAM - only one error was in the external sdram (during 20 tries)... Moreover I only get the data aborts if packets were received by the ethernet, but these packets don't use any kind of internal sram, because they are stored in the sdram by the DMA.
best regards Johannes
Have you looked at the overwritten memory area - is there a pattern to the data there? Have you tried to send ethernet packets with specific test data?
You say that the ethernet packets are stored in the sdram. But have you verified that you actually get data there? For example by filling the sdram area with a pattern so you can see that the pattern is overwritten by received data. Just believing that the ethernet data is (should be) stored in the sdram area isn't enough.
Debugging means that you have theories. Then you have to prove or disprove these theories one by one until you find the problem. The big problem is to separate what you believe and what you know, and prove that what you think really is true.
thanks for your answer...
the received data from the ethernet are stored in the sdram. I wrote a small programm to see if I get the data here and yes the data overwrite my pattern. There's one ethernet register where I wrote the address for the received data from the ethernet (transmit to the sdram by dma)
At the moment I have always data abort when data are received by the ethernet - and the address of the data abort is very often within the heap size.
But I don't know why the heap size should be too small. I only read, that the stdio-lib would use a lot of heap size.
my stack ptr is:
__initial_sp 0x00201018 Data 544 rm9200.o(STACK)
and the stack top
STACK 0x00200018 Section 4640 rm9200.o(STACK) Stack_Top 0x00201238 Number 0 rm9200.o(STACK)
the addr of the data abort is e.g. - 00203332 - 00206572 - 0020ACF5 - 00206563
always in the internal RAM: INT_RAM 0x200000 0x000FFFFF (.ANY(+RW +ZI))
And moreover I know that the pointer to the USART will be overwritten very often - sometimes a data abort happens sometimes not. But this occured bevor data will be received.
If I try to write a pattern between the end of the internal sram and the stack top.
ptr = (unsigned int *)0x00206500;
If I received data packets by the ethernet, the pattern is nearly there. Sometimes I get a data abort, too, but not regularly when data are received from the ethernet; but I get nearly never a data abort, if no data will be received by the ethernet
If the address is smaller (e.g. addr = 0x00204000) than I'm not able to run the programm. The programm always stays by PLLA_loop (wait until PLL A is stabilized)...
Is it correct, that the heap base addr is the same as the stack base addr?
Execution Region INTERN_RAM (Base: 0x00200000, Size: 0x00001298, Max: 0x00040000, ABSOLUTE) Base Addr Size Type Attr Idx E Section Name Object 0x00200000 0x0000000c Data RW 13 .data init.o 0x0020000c 0x00000004 Data RW 45 .data usart.o 0x00200010 0x00000004 Data RW 74 .data retarget.o 0x00200014 0x00000060 Zero RW 169 .bss libspace.o(c_t.l) 0x00200074 0x00000004 PAD 0x00200078 0x00000000 Zero RW 2 HEAP rm9200.o 0x00200078 0x00001220 Zero RW 1 STACK rm9200.o
another thing which I don't understand is, that I changed the size of the internal RAM by the linker script.
INTERN_RAM 0x200000 0x00040000
And I still get a data abort with the address 0x002FFFB.
Changing the size of RAM in your configuration files will not change the physical memory in the processor, or how the memory controller in the processor handles exceptions.
If you have a pointer (or return address) that points to a specific address, the processor will try to jump there without the slightest knowledge of the existence of memory.
that means, that the beginning of the heap will not change if I reduce the size of the internal RAM.
How can I find out from which code / function the data abort will caused? At the moment I only know the address of the data abort.
What could be the problem if I write a pattern to the addr 0x00205241
for(i=0; i<100; i++) { *ptr2=0xFF7; *ptr2++; }
(the stack top is 0x00200798) and I'm not able to transmit any data by the usart? I always get a data abort (independent if there are ethernet packets).
If I only write:
ptr2 = (unsigned int *)0x00205241; *ptr2=0xFF7;
then I get a data abort to this address (caused by this write access)
But why could that be a problem, the end of the internal RAM is 0x002FFFFF
This happens because you are violating the alignment restrictions when reading or writing to memory. An exception will be raised if:
read/write word32: (address & 3) != 0 read/write word16: (address & 1) != 0