Abstract
Computer architects need to run cycle-accurate performance models of processors orders of magnitude faster. We discuss why the speedup achievable on traditional multicores is limited, and why FPGAs are a good vehicle for achieving a dramatic performance improvement over software models. This article introduces A-Port Networks, a simulation scheme designed to expose the fine-grained parallelism inherent in performance models and to exploit it efficiently using FPGAs.
Index Terms
A-Port Networks: Preserving the Timed Behavior of Synchronous Systems for Modeling on FPGAs