Please do not say 'paging'; I cannot handle the overhead for all the other stuff that fits nicely within 64K. I have a 2 MByte flash that is accessed occasionally, and the time here is not critical (the system pauses), as opposed to other processes that use only RAM. Is there an elegant way to access structures in that 24-bit address space, or does it have to be bent, folded and mutilated to access? Currently all addresses are specified with an 8-bit 'page' and a 16-bit 'address' and processed as such. I could gain some readability by having the whole in a long. Also, this method requires data to be stored so that no structure crosses a page boundary, and that limitation is a nuisance. Erik
You may optimize XBANKING.A51 yourself to come up with a better solution. If you have found something better (which is thread-safe and works in all circumstances) please send your suggestion to: support.intl@keil.com.
OK, clarification: 99.9% of the time 'flash does not exist' and time is EXTREMELY critical. Then, rarely, a routine is called (sketched here):
U8 ReadFlash (U32 address)
{
    U8 byte;

    P4 = (address >> 16) & 0xff;                  /* select the 64K page */
    SWITCH_IO_TO_FLASH;
    byte = *(U8 xdata *)(U16)(address & 0xffff);  /* read at the 16-bit offset */
    SWITCH_IO_TO_RAM;
    return (byte);
}
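The page/offset split in the sketch above can be pulled out into small helpers, so that callers carry a single 32-bit linear address instead of a page and an address. This is a minimal sketch in portable C, not Keil-specific code: `flash_page`, `flash_offset` and `flash_linear` are hypothetical names, and the typedefs assume that U8/U16/U32 map to fixed-width unsigned types.

```c
#include <stdint.h>

/* Assumed mapping of the thread's type names to fixed-width types. */
typedef uint8_t  U8;
typedef uint16_t U16;
typedef uint32_t U32;

/* Hypothetical helper: bits 16..23 of a linear address select the 64K page. */
static U8 flash_page(U32 address)
{
    return (U8)((address >> 16) & 0xFFu);
}

/* Hypothetical helper: the low 16 bits are the offset inside that page. */
static U16 flash_offset(U32 address)
{
    return (U16)(address & 0xFFFFu);
}

/* Hypothetical helper: rebuild the linear address from a page/offset pair. */
static U32 flash_linear(U8 page, U16 offset)
{
    return ((U32)page << 16) | offset;
}
```

With helpers like these, only the lowest-level read routine needs to know that the hardware wants the address in two pieces.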
I'm not quite sure I understand the problem. ReadFlash (&structx.offsety) should only pass in one value, which already combines the structure address + offset. And since the routine only reads one byte, that one byte can't cross a boundary. If you were reading a longer word, then you'll have to change the bank register mid-stream. If you're passing in the address and offset separately, then you'll just have to add them before taking the address apart.
If you use far pointers only for your flash addresses, and explicitly declare others as xdata*, then the compiler should only call the XBANKING routines for the far pointers, which is to say the flash accesses. Other, 16-bit, accesses would be performed as normal. The library routines seem to compare favorably to the example code with the shifts and masks, even with the bank overhead.
LOAD_BANK MACRO SaveAcc
LOCAL lab
        MOV   DPL,R1
        MOV   DPH,R2
        MOV   ?C?XPAGE1SFR,R3
        DEC   ?C?XPAGE1SFR
        ANL   ?C?XPAGE1SFR,#07FH
        CJNE  R3,#80H,lab        ; test high bit of R3 to set carry
lab:
ENDM

RESTORE_BANK MACRO SaveAcc
        MOV   ?C?XPAGE1SFR,#?C?XPAGE1RST   ; Reset Page Register
ENDM

;-----------------------------------------------------------------------------
; CLDXPTR: Load BYTE in A via Address given in R1/R2/R3
; Registers which can be used without saving: DPTR, CY, A
;
?C?CLDXPTR:  LOAD_BANK
        JNC   CLDCODE
        MOVX  A,@DPTR
        JMP   CLDDONE
CLDCODE:     CLR   A
        MOVC  A,@A+DPTR
CLDDONE:     RESTORE_BANK 1
        RET
void ManualRead4 (U8 far* addr)
{
    SetBankReg (addr);              // set the bank once for all four reads
    byte1 = *(U8 xdata*)addr++;     // plain 16-bit xdata accesses from here on
    byte2 = *(U8 xdata*)addr++;
    byte3 = *(U8 xdata*)addr++;
    byte4 = *(U8 xdata*)addr++;
    SetBankReg (0);                 // sets to default value
} // ManualRead4
/// provides access to words/bytes of a U32
typedef union
{
    U32 u32;
    U8 array[4];
    struct
    {
        MultiByte16 lsw;
        MultiByte16 msw;
    } words;
} MultiByte32;
The code in XBANKING.A51 does exactly what your ReadFlash routine does. But it is built into the compiler, so when you have pointers, you need not decide whether the address is a flash address or a RAM address. The overhead is this decision (which is 5-6 CPU cycles). Of course you may implement your own way of doing it.
"And since the routine only reads one byte, that one byte can't cross a boundary."
No, but the address can. If the structure is located at fff0 and is 20 bytes long, the last 4 bytes will be accessed at 0000, 0001, 0002 and 0003 with 16-bit calculation. Erik
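The wrap-around described here is easy to reproduce on any host compiler. A minimal sketch, with `addr16` and `addr32` as hypothetical names for the two ways of computing the address of byte `index` of an object; the 32-bit base used in the note below (page 01, offset fff0) is an assumed example value.

```c
#include <stdint.h>

/* Address of byte `index` of an object at `base`, computed in 16 bits:
   the sum wraps past 0xFFFF and the page byte is never touched. */
static uint16_t addr16(uint16_t base, uint16_t index)
{
    return (uint16_t)(base + index);
}

/* Same computation in 32 bits: the carry propagates into the page byte. */
static uint32_t addr32(uint32_t base, uint32_t index)
{
    return base + index;
}
```

For a 20-byte structure at fff0, `addr16(0xFFF0, 16)` through `addr16(0xFFF0, 19)` give 0000 through 0003 in the same page, whereas `addr32(0x01FFF0, 16)` gives 0x020000, the first byte of the next page.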
"The code in XBANKING.A51 does exactly what your ReadFlash routine does. But it is built into the compiler, so when you have pointers, you need not decide whether the address is a flash address or a RAM address. The overhead is this decision (which is 5-6 CPU cycles)."
I DO NOT want the execution routines, since they assume all is 'banked', and when operating in 'RAM mode' I cannot afford ANY overhead. ALL I WANT is a means of calculating the effective address in 32-bit mode. IF the address of an entry in a structure or array is targeted at a 32-bit entity, the calculation should be 32-bit. Erik
"I DO NOT want the execution routines, since they assume all is 'banked', and when operating in 'RAM mode' I cannot afford ANY overhead."
The XBANKING routines do not add any overhead when accessing variables in the CODE, DATA, XDATA, IDATA, PDATA, or BIT memory areas. They are only invoked when you use far or const far pointers.
"IF the address of an entry in a structure or array is targeted at a 32-bit entity, the calculation should be 32-bit."
Far memory types are limited to 64K in size and may not cross a 64K boundary. As such, the address calculations for far memory objects are performed using 16-bit arithmetic, which reduces code size and increases execution speed. A limitation is that compiler-managed objects may not cross a 64K boundary.
"ALL I WANT is a means of calculating the effective address in 32-bit mode."
You can do this using a far pointer with a long typed index, but you'll have to do it manually and you'll have to read each byte individually. However, this is only required for those objects that straddle the 64K boundary. And there are very few of those (only 1 if you're using 128K). Jon
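The manual, byte-by-byte approach with a long index might look roughly like the following. This is a host-side sketch in portable C rather than Keil far-pointer code: the flash is simulated by an array, and `flash_sim`, `read_flash_byte` and `read_flash_block` are hypothetical names. On the target, `read_flash_byte` would be replaced by a routine like the ReadFlash sketch earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

#define FLASH_PAGE_BYTES 0x10000UL
#define FLASH_SIM_PAGES  3u   /* stand-in size for this sketch, not the real 2 MByte part */

/* Simulated flash: one row per 64K page. */
static uint8_t flash_sim[FLASH_SIM_PAGES][FLASH_PAGE_BYTES];

/* Hypothetical single-byte primitive: select the page, read at the 16-bit
   offset. The caller must keep `address` below FLASH_SIM_PAGES * 64K. */
static uint8_t read_flash_byte(uint32_t address)
{
    uint8_t  page   = (uint8_t)(address >> 16);
    uint16_t offset = (uint16_t)(address & 0xFFFFu);
    return flash_sim[page][offset];
}

/* Copy `len` bytes starting at a linear address. The address is advanced in
   32 bits, so the page byte changes by itself at a 64K boundary and every
   object is read the same way, wherever it is located. */
static void read_flash_block(uint32_t address, void *dst, size_t len)
{
    uint8_t *out = (uint8_t *)dst;
    size_t i;

    for (i = 0; i < len; i++) {
        out[i] = read_flash_byte(address + (uint32_t)i);
    }
}
```

Copying a structure into RAM this way costs one page setup per byte, which only matters if flash reads are time-critical; the rest of the code can then treat the copy as an ordinary RAM object.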
"You can do this using a far pointer with a long typed index, but you'll have to do it manually and you'll have to read each byte individually. However, this is only required for those objects that straddle the 64K boundary. And there are very few of those (only 1 if you're using 128K)."
The problem here is that the data is variable and I do not know which units straddle. Anyhow, I think it has now reached the point where I have to go back to the people that make the software that generates the file that I store in flash and say "make a hole in the file so no units straddle 64K". I know it will cost me a hefty fee, but oh well, if nothing else works, pay. Erik
"The problem here is that the data is variable and I do not know which units straddle."
Well, that complicates things a bit, but still, couldn't you look at the address and size of the object to determine if it straddles? Jon
"Well, that complicates things a bit, but still, couldn't you look at the address and size of the object to determine if it straddles?"
There is more to it. There are 32 copies of struct a, 64 copies of array b, etc. To process the same struct differently depending on its location would create a piece of code that would be a nightmare to debug. Anyhow, I'll see what the cost of requesting a gap would be and, if exorbitant, I'll try the suggestions here. Thanks all, Erik
Sounds like the offsetof() macro could come in handy.
if ((U16)((U16)addr + offsetof(structType, fieldName)) < (U16)addr)
{
    // field straddles 64k boundary
}
else
{
    // field lies within one 64k segment
}
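The comparison above detects that the 16-bit sum has wrapped, that is, that the field starts beyond the boundary. A check that also accounts for the size of the object could be sketched as follows; `crosses_64k` is a hypothetical name, and the sketch assumes objects are at least 1 byte and smaller than 64K.

```c
#include <stdint.h>

/* Returns 1 if an object of `size` bytes (size >= 1) that starts at the
   16-bit offset `start` runs past the end of its 64K page, 0 otherwise. */
static int crosses_64k(uint16_t start, uint16_t size)
{
    /* Offset of the last byte, computed in 32 bits so it cannot wrap. */
    uint32_t last = (uint32_t)start + size - 1u;
    return last > 0xFFFFu;
}
```

For the 20-byte structure at fff0 mentioned earlier, `crosses_64k(0xFFF0, 20)` reports a crossing, while a 16-byte object at the same offset ends exactly at ffff and does not cross.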
"Is this operation really so time-critical that saving a couple of instructions is worthwhile? Flash access is often slower than RAM access."
When reading flash, timing is of no concern; when NOT reading flash, extremely so. Basically the unit runs in two modes:
Haul @$$ (99.9% of the time)
work with flash
Erik
As Jon mentioned, the extra instructions to set up the high 8 bits of the address apply only to far and const far data. Regular xdata access does not go through these routines, and will not be slowed down by the code in XBANKING.A51. These routines are essentially the "far access library". So long as you don't declare your normal xdata items far or access them via a far pointer, you should be safe.
The remaining question seems to be whether or not the actual access pattern is such that detecting the segment boundary is worthwhile. If there's a whole lot of 1-byte reads in the same segment, then you could (in theory) optimize out most of the high-order byte setup, as in the ManualRead4 routine I posted above. It's just a matter of whether the time and code it takes to figure out whether you need to set the high byte is less than the time it takes you just to do it every time.
Also, it's perhaps worth considering whether you need consistent execution time for every access, or whether it's okay for some of them to be much longer than others as long as the amortized total is less overall.
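One way to picture the trade-off is a bank select that remembers the last page written and skips the register write when the page has not changed. This is only a sketch under assumptions: `bank_reg` stands in for the page register (P4 in the ReadFlash sketch), `bank_writes` exists purely to make the saving visible, and 0xFF is assumed never to be a valid page (a 2 MByte flash uses pages 00 to 1F).

```c
#include <stdint.h>

static uint8_t  bank_reg;            /* stand-in for the page register */
static unsigned bank_writes;         /* instrumentation for this sketch only */
static uint8_t  cached_bank = 0xFF;  /* assumed never a valid page number */

/* Hypothetical helper: write the page register only when the page changes. */
static void select_bank_cached(uint32_t address)
{
    uint8_t bank = (uint8_t)(address >> 16);

    if (bank != cached_bank) {
        bank_reg    = bank;
        cached_bank = bank;
        bank_writes++;
    }
}
```

The compare costs a few instructions on every access, so this only pays off when setting the page is more expensive than testing it; if the code that restores the default page bypasses the cache, `cached_bank` has to be reset as well.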
"Also, it's perhaps worth considering whether you need consistent execution time for every access, or whether it's okay for some of them to be much longer than others as long as the amortized total is less overall."
Varying execution time is irrelevant, but code that reads some things by method a and other things by method b I consider 'messy', and it is outlawed. Again, once the flash is in the loop, timing is totally non-critical. I will play a bit with far and !far and see what happens. Erik
One question re banking: can the 'main' bank be 64K? Everything I have seen says 'home bank' 32K, bank 1 32K. Erik
If you're referring to code banking, the bank size may be anything from 0 to 64K. Typically, you'll have a fixed common area which is stored in a 32K ROM (or something like that) and you'll have banking hardware that switches the upper 32K (or whatever's left). But, there's nothing that prevents you from using only 8K for the common area and 56K for the banked area. If the common area is TOO small, the compiler just merges it into each of the code banks. Using that, you could just have 64K banks and let the compiler use whatever it needed for the common area. Of course, that area would be duplicated in each code bank (but if you keep it small, that's not really an issue). But, that may reduce the amount of development work involved. Jon