If you're only writing to the output (not reading and writing), then I'm not sure there's much you can do besides:

- Write the output in at least 32-bit chunks (maybe even larger, e.g. STM); writing bytes will stall the write buffer sooner.
==> Yes, writing 32-bit chunks is better than writing bytes only. But how does STM help here? We know the ARM9's write buffer doesn't support write merging. Will STM make all the stores go to the same write buffer entry?
- Make sure you're only writing the output once.
==> Yes, I am sure most of the cases write only once.
- Write the output to consecutive ascending addresses (actually, that probably only helps if the output is already in the cache, which I'm guessing is not happening here).
==> Does the write order affect performance when the data is in the cache, or when it is not?
- Try to find out whether the memory timing is set as fast as possible in whatever memory controller you're using.
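To make the first three suggestions concrete, here is a rough C sketch that packs bytes into 32-bit words and stores each word exactly once, in ascending address order. The names `pack_le` and `store_words` are made up for illustration, little-endian byte order is assumed, and `dst` is assumed to be word-aligned. Grouping four stores together only gives the compiler a chance to emit an STM; whether it does, and whether that helps on a given ARM9 write buffer, would need checking in the disassembly and with measurements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine four bytes into one little-endian word. */
static uint32_t pack_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: write n_words words to dst, each exactly once,
 * in ascending address order. dst is assumed to be 4-byte aligned. */
static void store_words(uint32_t *dst, const uint8_t *src, size_t n_words)
{
    size_t i = 0;

    assert(((uintptr_t)dst & 3u) == 0);

    /* Build four words in registers first, then store them back to back.
     * A compiler may (or may not) merge these stores into a single STM. */
    for (; i + 4 <= n_words; i += 4) {
        uint32_t w0 = pack_le(src + 4 * i);
        uint32_t w1 = pack_le(src + 4 * i + 4);
        uint32_t w2 = pack_le(src + 4 * i + 8);
        uint32_t w3 = pack_le(src + 4 * i + 12);
        dst[i]     = w0;
        dst[i + 1] = w1;
        dst[i + 2] = w2;
        dst[i + 3] = w3;
    }

    /* Remaining words, still one 32-bit store each, still ascending. */
    for (; i < n_words; i++) {
        dst[i] = pack_le(src + 4 * i);
    }
}
```

The point of the sketch is only the store pattern: no byte stores, no address written twice, and no jumping around in the output buffer.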