Schulze, Richard; Gorlatch, Sergei; Rasch, Ari
Research article in online collection (conference) | Peer reviewed

Directive-based programming models have established themselves as an effective and productive paradigm for exploiting parallel architectures such as GPUs and CPUs. Widely used solutions such as OpenMP and OpenACC are popular due to their simplicity and broad applicability to general-purpose codebases. However, these approaches often struggle to deliver consistently high and portable performance, especially for reduction-intensive computations. We introduce a novel directive design grounded in the formalism of Multi-Dimensional Homomorphisms (MDH). Unlike existing directive-based methods, our MDH-based directive is explicitly crafted for data-parallel computations (such as tensor expressions), enabling superior and portable performance even on reduction-heavy workloads. At the same time, our approach preserves and often enhances programmer productivity, e.g., by leveraging Python as the host language. Our experiments across diverse workloads, including linear algebra, stencil computations, data mining, quantum chemistry, and deep learning, show that our approach not only surpasses state-of-the-art directive-based methods but also achieves speedups of up to 5× on CPUs and over 2× on GPUs compared to the highly optimized vendor libraries Intel MKL/oneMKL and NVIDIA cuBLAS/cuDNN.
| Gorlatch, Sergei | Professorship for Practical Computer Science (Prof. Gorlatch), Institute of Computer Science |
| Rasch, Ari | Professorship for Practical Computer Science (Prof. Gorlatch), Institute of Computer Science |
| Schulze, Richard Heinrich Hermann | Professorship for Practical Computer Science (Prof. Gorlatch), Institute of Computer Science |