Haidl M, Gorlatch S
Research article (journal) | Peer reviewed
Programming many-core systems with accelerators (e.g., GPUs) remains a challenging task, even for expert programmers. In the current low-level approaches, OpenCL and CUDA, two distinct programming models are employed: the host code for the CPU is written in C/C++ with a restricted memory model, while the device code for the accelerator is written using a device-dependent model of CUDA or OpenCL. The programmer is responsible for explicitly specifying parallelism, memory transfers, and synchronization, and also for configuring the program and optimizing its performance for a particular many-core system. This leads to long, poorly structured, and error-prone code, often with suboptimal performance. We present PACXX, an alternative, unified programming approach for accelerators. In PACXX, both host and device programs are written in the same programming language: the newest C++14 standard with the Standard Template Library (STL), including all modern features: type inference (auto), variadic templates, generic lambda expressions, and the newly proposed parallel extensions of the STL. PACXX includes an easy-to-use and type-safe API for multi-stage programming which allows for aggressive runtime compiler optimizations. We implement PACXX by developing a custom compiler (based on the Clang and LLVM frameworks) and a runtime system that together perform memory management and synchronization automatically and transparently for the programmer. We evaluate our approach by comparing it to OpenCL regarding program size and target performance.
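To illustrate the single-source style the abstract describes, the following is a minimal sketch of a SAXPY written with a generic lambda and a parallel STL algorithm. It is an assumption for illustration only, not PACXX's actual API: the standard std::execution::par policy shown here is the C++17 descendant of the parallel STL extensions referenced in the abstract, and a PACXX-like compiler and runtime would be what offloads such a call to the accelerator and manages memory transfers transparently.

```cpp
// Illustrative sketch (not the paper's interface): host and "device" code
// are the same C++ -- a generic lambda passed to a parallel algorithm.
#include <algorithm>
#include <execution>
#include <vector>

int main() {
  const float a = 2.0f;
  std::vector<float> x(1 << 20, 1.0f), y(1 << 20, 2.0f);

  // y = a*x + y. A unified compiler in the PACXX spirit could compile this
  // lambda for an accelerator; the runtime would handle data movement and
  // synchronization transparently to the programmer.
  std::transform(std::execution::par, x.begin(), x.end(), y.begin(), y.begin(),
                 [a](auto xi, auto yi) { return a * xi + yi; });
  return 0;
}
```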
| Gorlatch, Sergei | Professorship of Practical Computer Science (Prof. Gorlatch) |
| Haidl, Michael | Professorship of Practical Computer Science (Prof. Gorlatch) |