  • Deep Learning Episode 3: Supercomputer vs Pong
    I’ve always enjoyed playing games, but the buzz from writing programs that play games has repeatedly claimed months of my conscious thought at a time. I’m not sure that writing programs that write programs...
  • Deep Learning Episode 1: Optimizing DeepMind's A3C on Torch
    In February, a new paper from Google's DeepMind team appeared on arXiv. This one was interesting – they showed dramatically improved performance and training time of their Atari-playing Deep Q-Learning...
  • Deep Learning Episode 2: Scaling TensorFlow over multiple EC2 GPU nodes
    In episode one we optimized Torch A3C performance on the new Intel Xeon Phi (Knights Landing) CPU. Arm MAP and Performance Reports identified bottlenecks in our framework and sped up model training by...
  • Profiling and Tuning Linpack: A Step-by-Step Guide
    This year we're proud to be sponsoring the Student Cluster Competition at SC15. One of the key codes teams will have to optimize for their systems is the classic Linpack benchmark. I decided to have a...
  • Boosting OpenFOAM behavior with Arm Performance Reports
    OpenFOAM, developed by ESI-OpenCFD, is one of the most popular tools for developing CFD (Computational Fluid Dynamics) applications, along with ANSYS Fluent and CD-adapco STAR-CCM+. Most modules of OpenFOAM...