Abstract
Accelerators can offer exceptional performance advantages. However, programmers need to spend considerable effort on acceleration without knowing how sustainable the employed programming models, languages, and tools are. To tackle this challenge, we propose and demonstrate a new runtime system called HTrOP that automatically generates and executes OpenCL code from sequential CPU code. HTrOP transforms suitable data-parallel loops into independent OpenCL work-items and handles the concrete calls to the target devices through a mix of library components and application-specific OpenCL host code. Computational hotspots are identified and can be offloaded to different resources (CPU, GPGPU, and Xeon Phi). We demonstrate the potential of HTrOP on a broad set of applications and improve performance by 4.3x on average.
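To make the loop-to-work-item idea concrete, the following is a minimal sketch in plain C. It is not HTrOP's generated code: the SAXPY loop and the names `saxpy_loop`, `saxpy_work_item`, and `saxpy_ndrange` are hypothetical and chosen only for illustration. The sketch assumes a loop whose iterations are independent, so each iteration can become one work-item indexed by a global ID (the role `get_global_id(0)` plays in an OpenCL kernel). A plain C driver stands in for the OpenCL runtime and deliberately runs the work-items in reverse order to show that the result does not depend on execution order.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential CPU version: a data-parallel loop with no
 * dependencies between iterations. */
void saxpy_loop(float a, const float *x, float *y, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* The body of one loop iteration, written as an independent work-item.
 * gid is a hypothetical stand-in for OpenCL's get_global_id(0). */
void saxpy_work_item(size_t gid, float a, const float *x, float *y) {
    y[gid] = a * x[gid] + y[gid];
}

/* Stand-in for launching n work-items on a device (an assumption of this
 * sketch, not an OpenCL call). A real device may run work-items in any
 * order or concurrently; running them in reverse here illustrates that
 * the result is order-independent. */
void saxpy_ndrange(float a, const float *x, float *y, size_t n) {
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t gid = n; gid-- > 0;) {
        saxpy_work_item(gid, a, x, y);
    }
}
```

Calling `saxpy_loop` and `saxpy_ndrange` on identical inputs yields identical outputs, which is the property that makes such a loop a candidate for offloading. In a real OpenCL setting, the work-item body would live in a kernel and the driver would be replaced by host code that enqueues the kernel on the chosen device.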
Index Terms
Automated code acceleration targeting heterogeneous OpenCL devices
Published in PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming






