Abstract
Probabilistic Sentential Decision Diagrams (PSDDs) provide efficient methods for modeling and reasoning with probability distributions in the presence of massive logical constraints. PSDDs can also be synthesized from graphical models such as Bayesian networks (BNs), thereby offering a new set of tools for performing inference on these models in time linear in the PSDD size. Despite these favorable characteristics, we found multiple challenges in accelerating PSDDs on FPGAs, including limited parallelism, data dependency, and small pipeline iterations. In this article, we propose several optimization techniques that address these issues with novel pipeline scheduling and parallelization schemes. We designed the PSDD kernel with a high-level synthesis (HLS) tool for ease of implementation and verified it on the Xilinx Alveo U250 board. Experimental results show that our methods improve performance by 2,200× over the baseline FPGA HLS implementation and by 20× over the multicore CPU implementation. The proposed design also outperforms state-of-the-art BN and Sum-Product Network (SPN) accelerators that store the graph information in memory.
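The linear-time inference mentioned above can be illustrated with a minimal software sketch. It is not the article's FPGA kernel: the flat node layout, the class names (`Literal`, `Top`, `Decision`), and the `evaluate` function are assumptions made for illustration only. The sketch assumes nodes are stored children-first, so a single pass over the array computes the probability of the evidence, visiting each element once.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# Hypothetical node layout for illustration; not taken from the article.
@dataclass
class Literal:
    var: int
    positive: bool          # True for X, False for not-X

@dataclass
class Top:
    var: int
    theta: float            # Pr(var = True)

@dataclass
class Decision:
    # Each element is (prime index, sub index, theta); thetas sum to 1.
    elements: List[Tuple[int, int, float]]

Node = Union[Literal, Top, Decision]

def evaluate(nodes: List[Node], evidence: Dict[int, bool]) -> float:
    """Return Pr(evidence), assuming nodes are ordered children-first.

    Variables absent from `evidence` are marginalized out.
    The last node in the list is taken to be the root.
    """
    values = [0.0] * len(nodes)
    for i, node in enumerate(nodes):
        if isinstance(node, Literal):
            observed = evidence.get(node.var)
            # An unobserved variable never contradicts a literal.
            consistent = observed is None or observed == node.positive
            values[i] = 1.0 if consistent else 0.0
        elif isinstance(node, Top):
            observed = evidence.get(node.var)
            if observed is None:
                values[i] = 1.0
            else:
                values[i] = node.theta if observed else 1.0 - node.theta
        else:
            # Decision node: weighted sum over (prime, sub) products.
            values[i] = sum(t * values[p] * values[s]
                            for p, s, t in node.elements)
    return values[-1]

# Toy PSDD over A (var 1) and B (var 2) under the constraint A implies B:
# Pr(A) = 0.6; if A then B is forced true; if not A then Pr(B) = 0.25.
toy = [
    Literal(1, True),                        # 0: A
    Literal(1, False),                       # 1: not A
    Literal(2, True),                        # 2: B
    Top(2, 0.25),                            # 3: B with Pr(B) = 0.25
    Decision([(0, 2, 0.6), (1, 3, 0.4)]),    # 4: root
]

p_a_and_not_b = evaluate(toy, {1: True, 2: False})  # violates the constraint
p_b = evaluate(toy, {2: True})                      # marginal over A
```

In this toy model the assignment that violates the constraint receives probability zero, and the marginal query reuses the same single pass. The loop-carried reads of `values[p]` and `values[s]` are one plausible way to picture the data dependencies the abstract refers to, although the article's actual kernel structure is not described in this section.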
Index Terms
FPGA Acceleration of Probabilistic Sentential Decision Diagrams with High-level Synthesis