Abstract
Building a reusable, auto-tuning code generator from scratch is a challenging problem, requiring many careful design choices. We describe HSpiral, a Haskell compiler for signal transforms that builds on the foundational work of Spiral. Our design leverages many Haskell language features to ensure that our framework is reusable, flexible, and efficient. As well as describing the design of our system, we show how to extend it to support new classes of transforms, including the number-theoretic transform and a variant of the split-radix algorithm that results in reduced operation counts. We also show how to incorporate rewrite rules into our system to reproduce results from previous literature on code generation for the fast Fourier transform.
Although the Spiral project demonstrated significant advances in automatic code generation, it has not been widely used by other researchers. HSpiral is freely available under an MIT-style license, and we are actively working to turn it into a tool that both furthers our own research goals and serves as a foundation for other research groups' work in developing new implementations of signal transform algorithms.
A Haskell compiler for signal transforms
GPCE 2017: Proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences