Karl Rupp, Institute for Microelectronics, Vienna University of Technology (TU Wien)

Download (PDF, 736KB)

This work studies the performance portability of OpenCL kernel implementations for common memory-bandwidth-limited linear algebra operations, both across different hardware generations of the same vendor and across vendors. Certain combinations of kernel implementations and work sizes are found to exhibit good performance across compute kernels, hardware generations, and, to a lesser degree, vendors. Consequently, optimizing a single kernel is often sufficient to obtain good performance for a large class of more complicated operations.

Karl Rupp is a postdoctoral researcher in computational microelectronics at the Vienna University of Technology and the main developer of the OpenCL-enabled linear algebra library ViennaCL. The main goal of Karl and his team is to provide highly convenient, yet fast and performance-portable implementations to practitioners and non-experts, so that they can make better use of the performance available through parallel hardware.