Abstract
The Square Kilometre Array (SKA) will be the world’s largest radio telescope array. Because of its large number of antennas, the volume of signal data to be processed is enormous. One important element of the SKA’s Central Signal Processor package is pulsar search. This article focuses on the FPGA-based acceleration of the Frequency-Domain Acceleration Search (FDAS) module, which is part of the SKA pulsar search engine. In this module, the frequency-domain input signals have to be processed by 85 finite impulse response (FIR) filters within a tight time limit, and this must be repeated for thousands of input arrays. Because of the large input length and FIR filter size, even high-end FPGA devices cannot parallelise the task completely. We start by investigating both a time-domain FIR filter (TDFIR) and a frequency-domain FIR filter (FDFIR) to tackle this task. We apply the overlap-add algorithm to split the coefficient array of the TDFIR and the overlap-save algorithm to split the input signals of the FDFIR. To enable rapid prototyping, we employ OpenCL, a high-level FPGA development technique. Performance and power consumption are evaluated using multiple FPGA devices simultaneously and compared with GPU results obtained by porting the FPGA-based OpenCL kernels. The experimental evaluation shows that the FDFIR solution is very competitive in terms of performance, with a clear energy consumption advantage over the GPU solution.
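The two splitting strategies named in the abstract can be illustrated with a small sketch. This is not the article's OpenCL kernel code: it is a dependency-free Python illustration under stated assumptions. The function names (`direct_fir`, `split_coeff_fir`, `overlap_save_fir`), the block and chunk sizes, and the naive DFT standing in for a hardware FFT are all hypothetical choices made for readability. `split_coeff_fir` shows one plausible reading of splitting the TDFIR coefficient array, where delayed partial outputs are added together. `overlap_save_fir` shows the standard overlap-save scheme of filtering overlapping input blocks in the frequency domain.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT; a stand-in for an FFT so the sketch needs no libraries."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
           for j in range(n)]
    return [v / n for v in out] if inverse else out

def direct_fir(x, h):
    """Reference time-domain FIR: y[n] = sum_k h[k] * x[n - k], same length as x."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def split_coeff_fir(x, h, taps=4):
    """Time-domain FIR with the coefficient array split into chunks of `taps`.

    Assumption: each chunk is treated as a small FIR filter, and its output is
    delayed by the chunk's tap offset before being added to the result.
    """
    y = [0] * len(x)
    for off in range(0, len(h), taps):
        part = direct_fir(x, h[off:off + taps])
        for n in range(off, len(x)):
            y[n] += part[n - off]
    return y

def overlap_save_fir(x, h, block=8):
    """Frequency-domain FIR that splits the input into overlapping blocks."""
    m = len(h)
    step = block - (m - 1)                   # new samples consumed per block
    assert step > 0, "block must be longer than the filter overlap"
    H = dft(list(h) + [0] * (block - m))     # filter spectrum, computed once
    padded = [0] * (m - 1) + list(x)         # zero history before the first sample
    y = []
    for start in range(0, len(x), step):
        seg = padded[start:start + block]
        seg += [0] * (block - len(seg))      # zero-pad the final partial block
        X = dft(seg)
        circ = dft([a * b for a, b in zip(X, H)], inverse=True)
        y.extend(circ[m - 1:])               # first m-1 outputs are aliased; discard
    return y[:len(x)]
```

Both functions should agree with `direct_fir` up to floating-point error on the same input. In the overlap-save version, every block has a fixed length, which is what makes a fixed-size frequency-domain pipeline feasible when the full input does not fit on the device at once.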
Index Terms
FPGA-based Acceleration of FT Convolution for Pulsar Search Using OpenCL