Heterogeneous Computing Using Modern C++ with OpenCL Devices

Pre-requisites for this Tutorial

Attendees are expected to have programming experience with C++ and a laptop either running Linux or having a VM manager installed such as VirtualBox. The required software will be provided on USB-sticks. This course is suitable for beginners, but is focused on intermediate to advanced parallel programming using C++.

Tutorial Overview

SYCL™ is a royalty-free, cross-platform C++ abstraction layer that builds on the underlying concepts, portability and efficiency of OpenCL™, while adding the ease of use and flexibility of modern C++11/14. For example, SYCL enables single-source development, in which C++ template functions are compiled for both host and device. Developers can use these templates to construct complex algorithms that use OpenCL acceleration, and then reuse them throughout their source code on different types of data.
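The template reuse described above can be illustrated with a minimal sketch in standard C++ only. It is not SYCL code: `Axpy` and `run_on_host` are hypothetical names introduced for illustration, and `run_on_host` is a plain host loop standing in for a device dispatch. The point is only that one templated operation is written once and then instantiated for several element types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Generic element-wise operation, written once as a template.
// (Hypothetical name, used here for illustration only.)
template <typename T>
struct Axpy {
    T a;
    T operator()(T x, T y) const { return a * x + y; }
};

// Hypothetical host-side stand-in for a device dispatch. This is NOT a
// SYCL call; it applies the kernel to each pair of elements in a loop.
template <typename T, typename Kernel>
std::vector<T> run_on_host(const std::vector<T>& x,
                           const std::vector<T>& y,
                           Kernel kernel) {
    assert(x.size() == y.size());
    std::vector<T> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = kernel(x[i], y[i]);
    }
    return out;
}
```

Calling `run_on_host(xf, yf, Axpy<float>{2.0f})` and `run_on_host(xi, yi, Axpy<int>{3})` reuses the same source for `float` and `int` data. In a single-source model, the same kind of template would also be compiled for the device.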

Using SYCL can simplify development and reduce the amount of code required for applications using OpenCL devices by over 50% compared to standard OpenCL code. The reduction comes from the use of template functions and a simplified, streamlined host API.

This hands-on session will provide an opportunity to gain experience with SYCL using ComputeCpp™ Community Edition, a free-to-use implementation of the SYCL 1.2 standard. Attendees will be shown how to set up ComputeCpp and use it to write their own SYCL code to run on supported GPUs and CPUs.

The Khronos™ Group maintains the OpenCL and SYCL standards, both designed to dispatch work from C and C++ to heterogeneous devices such as GPUs, integrated CPUs, DSPs and even FPGAs. The C++ Standard is also building towards similar support, starting with the C++17 Parallelism and Concurrency Technical Specifications. Drawing on early design experience from SYCL and other heterogeneous computing models, this work is expected to enable a single high-level, performance-portable programming standard for domains such as autonomous vehicles, computer vision, and neural networks.

SYCL is already able to dispatch to heterogeneous devices, and it implements the C++17 Parallel STL, augmenting it with the ability to dispatch to GPUs in addition to CPUs. This tutorial will demonstrate how to write parallel SYCL code and how to use the Khronos Group’s experimental Parallel STL implementation. The course outline is as follows:

  • Start with a basic SYCL program that shows how to submit work to a queue using a single task and a stream-like object, comparing CPU, SYCL and OpenCL versions
  • Demonstrate how to access data across host and GPUs using buffers and accessors, the importance of object lifetime, and basic parallel constructs
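The role of lifetime in the second outline item can be sketched with a host-only analogy in standard C++. The sketch assumes the usual SYCL behaviour in which a buffer created over host memory makes its results visible to the host again once the buffer is destroyed. `ScopedBuffer` and `double_all` are hypothetical names and are not part of the SYCL API; no device is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only analogy of a buffer. This is NOT the SYCL buffer
// class: it works on a private copy and writes back when destroyed.
template <typename T>
class ScopedBuffer {
public:
    explicit ScopedBuffer(std::vector<T>& host) : host_(host), copy_(host) {}
    ~ScopedBuffer() { host_ = copy_; }  // write results back at end of life
    ScopedBuffer(const ScopedBuffer&) = delete;
    ScopedBuffer& operator=(const ScopedBuffer&) = delete;

    // Accessor-like element access to the working copy.
    T& operator[](std::size_t i) { return copy_[i]; }
    std::size_t size() const { return copy_.size(); }

private:
    std::vector<T>& host_;  // original host storage
    std::vector<T> copy_;   // working copy, standing in for device memory
};

// Doubles every element through the buffer. Returns host[0] as observed
// while the buffer is still alive, i.e. before the write-back happens.
inline int double_all(std::vector<int>& host) {
    assert(!host.empty());
    int seen_inside_scope = 0;
    {
        ScopedBuffer<int> buf(host);
        for (std::size_t i = 0; i < buf.size(); ++i) {
            buf[i] *= 2;
        }
        seen_inside_scope = host[0];  // host storage not yet updated
    }  // buf is destroyed here, so the doubled values reach the host
    return seen_inside_scope;
}
```

For a host vector `{1, 2, 3}`, `double_all` returns the stale value `1`, and the vector holds `{2, 4, 6}` only after the inner scope closes. This is why the scope in which a buffer lives matters when reading results on the host.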


Presenters

Rod Burns, Codeplay

Ruyman Reyes, Codeplay