research-article

A Self-Aware Tuning and Self-Aware Evaluation Method for Finite-Difference Applications in Reconfigurable Systems

Published: 04 July 2014

Abstract

Finite-difference methods are computationally intensive and are required by many applications. Parameters of a finite-difference algorithm, such as grid size, can be varied to generate a design space containing algorithm instances with different constant coefficients. An algorithm instance with specific coefficients can either be mapped onto general operators to construct static designs, or be implemented with constant-specific operators to form dynamic designs, which require runtime reconfiguration to update the algorithm coefficients. This article proposes a tuning method that explores the design space to optimise both the static and the dynamic designs, and an evaluation method that selects the design with maximum overall throughput, based on algorithm characteristics, design properties, available resources and runtime data size. For the benchmark applications, option pricing and Reverse-Time Migration (RTM), a reduction of over 50% in resource consumption is achieved for both static and dynamic designs, while meeting precision requirements. For a single hardware implementation, the RTM design optimised with the proposed approach is expected to run 1.8 times faster than the best published design. The tuned static designs run thousands of times faster than the dynamic designs for algorithms with small data sizes, while the tuned dynamic designs achieve up to 5.9 times speedup over the corresponding static designs for large-scale finite-difference algorithms.
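
The trade-off described in the abstract, where static designs win for small data sizes and dynamic designs win for large ones, can be illustrated with a toy timing model. The sketch below is not the article's evaluation method: it assumes a single fixed reconfiguration cost for the dynamic design and constant throughputs, and every name and number in it (`throughput_static`, `throughput_dynamic`, `reconfig_seconds`, the example values) is hypothetical.

```python
def static_time(n_points, throughput_static):
    """Seconds to process n_points with a static design (no reconfiguration)."""
    return n_points / throughput_static


def dynamic_time(n_points, throughput_dynamic, reconfig_seconds):
    """Seconds for a dynamic design: one assumed reconfiguration plus processing."""
    return reconfig_seconds + n_points / throughput_dynamic


def pick_design(n_points, throughput_static, throughput_dynamic, reconfig_seconds):
    """Return the design with the shorter total time under this toy model."""
    t_static = static_time(n_points, throughput_static)
    t_dynamic = dynamic_time(n_points, throughput_dynamic, reconfig_seconds)
    return "static" if t_static <= t_dynamic else "dynamic"


def break_even_points(throughput_static, throughput_dynamic, reconfig_seconds):
    """Data size at which both designs take equal time, or None if the
    dynamic design is not faster per point and so never catches up."""
    if throughput_dynamic <= throughput_static:
        return None
    # Solve n / Ts = R + n / Td for n.
    return (reconfig_seconds * throughput_static * throughput_dynamic
            / (throughput_dynamic - throughput_static))


if __name__ == "__main__":
    # Hypothetical values: points per second and seconds, not measured data.
    ts, td, r = 1e9, 2e9, 1.0
    for n in (1e6, 1e10):
        print(n, pick_design(n, ts, td, r))
    print("break-even size:", break_even_points(ts, td, r))
```

Under these assumptions the reconfiguration cost dominates for small inputs, so the static design is selected, while a sufficiently large input amortises that cost and favours the dynamic design.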

      • Published in

        ACM Transactions on Reconfigurable Technology and Systems, Volume 7, Issue 2
        June 2014
        199 pages
        ISSN: 1936-7406
        EISSN: 1936-7414
        DOI: 10.1145/2638850

        Copyright © 2014 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 4 July 2014
        • Accepted: 1 March 2014
        • Revised: 1 June 2013
        • Received: 1 January 2013

        Qualifiers

        • research-article
        • Research
        • Refereed
