The platform I am using is an ARM926EJ-S. The cache policy is write-back with read-allocate only (no write-allocate). According to the profiling results, the program I want to optimize incurs too many write misses (write buffer refill). Can anyone give me some guidelines or tricks to improve my program? Thanks.