Hi
As part of my MSc in Scientific Computing at UCL, I've been doing some Linpack benchmarking of an 8-node Raspberry Pi 4 Model B cluster. I have posted some preliminary results in the README.md at:
github.com/.../picluster
These results are not the usual "Problem Size vs Gflops". I know in advance that I want to use a problem size utilising 80% of memory, so they are "NB vs Gflops". I'm trying to determine the optimum NB for a given problem size. I'm particularly interested in this because I want to make optimum use of the limited networking resources, whilst maintaining efficient load balancing.
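For anyone wanting to reproduce the setup, here is a minimal sketch of how a problem size N could be derived from the 80% memory target for each candidate NB. It assumes the usual rule of thumb that the N x N matrix of 8-byte doubles should fit in the chosen fraction of total memory, with N rounded down to a multiple of NB. The function name, the 4 GiB per node figure and the NB values are illustrative assumptions, not values taken from my runs.

```python
import math


def hpl_problem_size(total_mem_bytes, nb, mem_fraction=0.8, bytes_per_element=8):
    """Largest N that is a multiple of nb such that an N x N matrix of
    doubles fits within mem_fraction of total_mem_bytes."""
    usable_elements = int(total_mem_bytes * mem_fraction) // bytes_per_element
    n_max = math.isqrt(usable_elements)  # largest N with N*N <= usable_elements
    return (n_max // nb) * nb            # round down to a multiple of NB


if __name__ == "__main__":
    # Assumption: 8 nodes with 4 GiB each (hypothetical, adjust to the real cluster).
    total = 8 * 4 * 1024**3
    for nb in (64, 128, 192, 256):       # hypothetical NB candidates
        print(f"NB={nb:4d}  N={hpl_problem_size(total, nb)}")
```

Rounding N down to a multiple of NB keeps the last block row and column full, so N changes slightly from one NB to the next while memory use stays just under the 80% target.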
Are these results reasonable? Any suggestions for further investigation? All comments/feedback would be most welcome.
Please don't take too much notice of the multi-node results at the moment. I know I have some NET_RX softirq issues to resolve through NET_RX interrupt coalescing and larger NIC receive buffers. From some initial experimentation, I know I can gain an additional 10 Gflops across 8 nodes.
Kind regards