Abstract
AMBA AXI is a popular bus protocol that is widely adopted as the medium for exchanging data in field-programmable gate array system-on-chips (FPGA SoCs). The AXI protocol does not specify how conflicting transactions are arbitrated, so the design of bus arbiters is left to the vendors that adopt AXI. Typically, round-robin arbitration is implemented to ensure fair access to the bus by the master nodes, as in the popular SoCs by Xilinx.
This paper addresses a critical issue that can arise when adopting the AXI protocol under round-robin arbitration, specifically in the presence of bus transactions with heterogeneous burst sizes. First, it is shown that a completely unfair bandwidth distribution can arise under some configurations, making it possible to arbitrarily decrease the bus bandwidth of a target master node. This issue poses serious performance, safety, and security concerns. Second, a low-latency (one clock cycle) module named AXI burst equalizer (ABE) is proposed to restore fairness. Our investigations and proposals are supported by implementations and tests on three modern SoCs. Experimental results are reported to confirm the existence of the issue and to assess the effectiveness of the ABE with bus traffic generators and hardware accelerators from Xilinx's IP library.
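The effect described above can be illustrated with a toy model. The following sketch is not the authors' implementation or the ABE design. It assumes an idealized arbiter that grants one whole burst per turn in fixed cyclic order, and masters that always have a transaction pending. The function names `round_robin_shares` and `equalize`, and the idea of modelling equalization as capping every burst at a common nominal length, are assumptions made for illustration only.

```python
from itertools import cycle


def round_robin_shares(burst_lens, grants=1200):
    """Return each master's share of the transferred data beats.

    Toy model (assumption): every master always has a pending
    transaction, and the arbiter grants exactly one burst per turn,
    visiting the masters in fixed cyclic order.
    """
    beats = [0] * len(burst_lens)
    order = cycle(range(len(burst_lens)))
    for _ in range(grants):
        master = next(order)
        # A grant lets the master transfer its whole burst.
        beats[master] += burst_lens[master]
    total = sum(beats)
    return [b / total for b in beats]


def equalize(burst_lens, nominal):
    """Hypothetical equalization: cap every burst at a nominal length.

    Under the always-pending assumption, splitting a long burst into
    nominal-size sub-bursts has the same effect per grant as capping it.
    """
    return [min(length, nominal) for length in burst_lens]


# Master 0 issues 16-beat bursts, master 1 issues 256-beat bursts.
unequal = round_robin_shares([16, 256])
equalized = round_robin_shares(equalize([16, 256], nominal=16))
```

In this model each master receives the same number of grants, yet master 0 obtains only 16/272 (about 5.9%) of the transferred beats, because the share is proportional to burst length. With both bursts capped at 16 beats, the shares become equal.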
Is Your Bus Arbiter Really Fair? Restoring Fairness in AXI Interconnects for FPGA SoCs