ABSTRACT
Building state-of-the-art processors is expensive and time consuming. Once a design is finalized and implemented, simulations are used to evaluate the functionality and performance of the system. The Sim-alpha processor simulator is one of the most important tools for performance evaluation. Enhancing processor simulators is a major research field, and many studies in this area are underway. Current open-source processor simulators do not account for the influences caused by multi-processing. In this study, we show that most processor simulations test only one program at a time on a virtual processor. The goal of the project was to demonstrate how processor simulators behave when external influences are incorporated. Hardware or software interrupts are events that alter the sequence of instructions executed by a processor. A context switch occurs when a multitasking operating system suspends the currently running process and starts executing another. Additional code was added to the Sim-alpha program to allow for context switches. Benchmarks were executed with and without time-slice context switches, and with different time slices. The results show that the miss rate decreases as the number of cycles between cache flushes increases. For example, flushing the cache every 150 cycles gives a cache miss rate of 48%, compared to 2% without flushing the cache. In real-life environments, processors must support multiple processes. We demonstrated that, with a simple change in the code, these simulators can have a more realistic workload. The effect of flushing the cache on the cache performance of processor simulators is significant. Current models do not account for this and may overestimate the performance gains of a particular processor design.
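The flush-on-time-slice idea described in the abstract can be illustrated with a minimal sketch. This is not the code added to Sim-alpha: the `DirectMappedCache` class, the `miss_rate` and `looping_trace` helpers, the cache geometry, and the one-access-per-cycle assumption are all hypothetical and chosen only to show the trend. It models a context switch as a full cache invalidation every `flush_interval` cycles and compares miss rates for different intervals.

```python
from typing import Iterable, Iterator, Optional


class DirectMappedCache:
    """Toy direct-mapped cache that tracks only tags, not data (hypothetical model)."""

    def __init__(self, num_lines: int = 64, line_size: int = 32) -> None:
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # None marks an invalid line

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, fill the line and return False."""
        block = address // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag
        return False

    def flush(self) -> None:
        """Invalidate every line, a crude stand-in for a context switch."""
        self.tags = [None] * self.num_lines


def miss_rate(addresses: Iterable[int], flush_interval: Optional[int] = None) -> float:
    """Replay an address trace, flushing every `flush_interval` cycles.

    Assumption: one memory access per cycle, so the access count doubles
    as the cycle count. `flush_interval=None` means the cache is never flushed.
    """
    cache = DirectMappedCache()
    accesses = misses = 0
    for cycle, address in enumerate(addresses):
        if flush_interval and cycle > 0 and cycle % flush_interval == 0:
            cache.flush()
        accesses += 1
        if not cache.access(address):
            misses += 1
    return misses / accesses if accesses else 0.0


def looping_trace(working_set_bytes: int = 1024, stride: int = 4,
                  passes: int = 50) -> Iterator[int]:
    """Synthetic trace: repeatedly sweep a working set that fits in the cache."""
    for _ in range(passes):
        for address in range(0, working_set_bytes, stride):
            yield address


if __name__ == "__main__":
    for interval in (None, 1500, 150):
        rate = miss_rate(looping_trace(), interval)
        print(f"flush interval {interval}: miss rate {rate:.4f}")
```

On this synthetic trace, shorter flush intervals give higher miss rates, which matches the direction of the reported result. The absolute numbers depend entirely on the assumed cache geometry and trace, so they should not be read as a reproduction of the 48% and 2% figures above.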
Index Terms
Evaluate the performance changes of processor simulator benchmarks When context switches are incorporated