Abstract
Massive spatial parallelism at low energy gives FPGAs the potential to be core components in large-scale high-performance computing (HPC) systems. In this paper, we present four major design steps that harness high-level synthesis (HLS) to implement scalable spatial FPGA algorithms. To aid productivity, we introduce the open-source library hlslib to complement HLS. We evaluate kernels designed with our approach on an FPGA accelerator board, demonstrating high performance and board utilization with enhanced programmer productivity. By following our guidelines, programmers can use HLS to develop efficient parallel algorithms for FPGAs, scaling their implementations with increased resources on future hardware.
Designing scalable FPGA architectures using high-level synthesis
PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming






