How best to maximize cache-write utilization for GPU compute?

What are some best practices for keeping data in on-chip cache, rather than having it written out to RAM, when a GPU compute job only needs a small amount of data? For example, if I wanted to do 10M read/write operations on a contiguous 1024-byte array and then output, say, 1024 bytes, would the array be cached automatically, or are there things I should do to make caching more likely?
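
To make the scenario concrete, here is a rough sketch of the kind of job I mean. It assumes CUDA purely for illustration (the question is not tied to one API), and the names `iterate_in_shared`, `run_on_gpu`, `run_on_cpu`, and `kWords`, as well as the update rule itself, are made up for the example. The 1024-byte array is copied once into `__shared__` memory, which is on-chip, all the repeated reads and writes happen there, and only the final 1024 bytes are written back to device memory.

```cuda
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

constexpr int kWords = 256;  // 256 x 4-byte words = 1024 bytes

// Hypothetical kernel: one block of kWords threads keeps the whole array
// in on-chip shared memory for the duration of the loop.
__global__ void iterate_in_shared(const uint32_t* in, uint32_t* out, int iterations) {
    __shared__ uint32_t buf[kWords];
    const int i = threadIdx.x;

    buf[i] = in[i];            // single read from device memory
    __syncthreads();

    for (int it = 0; it < iterations; ++it) {
        const uint32_t neighbor = buf[(i + 1) % kWords];
        __syncthreads();       // every thread has read before anyone writes
        buf[i] = buf[i] * 1664525u + neighbor;  // arbitrary placeholder update
        __syncthreads();       // every write is visible before the next read
    }

    out[i] = buf[i];           // single write back to device memory
}

// Host wrapper. Assumes input.size() == kWords; error checking is omitted
// to keep the sketch short.
std::vector<uint32_t> run_on_gpu(const std::vector<uint32_t>& input, int iterations) {
    const size_t bytes = kWords * sizeof(uint32_t);
    uint32_t* d_in = nullptr;
    uint32_t* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, input.data(), bytes, cudaMemcpyHostToDevice);

    iterate_in_shared<<<1, kWords>>>(d_in, d_out, iterations);

    std::vector<uint32_t> result(kWords);
    // Blocking copy on the default stream, so it waits for the kernel.
    cudaMemcpy(result.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}

// CPU reference with the same update rule, useful for checking results.
std::vector<uint32_t> run_on_cpu(std::vector<uint32_t> buf, int iterations) {
    std::vector<uint32_t> next(kWords);
    for (int it = 0; it < iterations; ++it) {
        for (int i = 0; i < kWords; ++i) {
            next[i] = buf[i] * 1664525u + buf[(i + 1) % kWords];
        }
        buf.swap(next);
    }
    return buf;
}
```

With 256 elements updated per pass, calling `run_on_gpu(input, 40000)` gives roughly 10M element updates. In this sketch the placement in on-chip memory is explicit rather than left to the hardware cache, which is one possible approach; whether an automatic cache would do as well without it is exactly what I am asking.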