A Machine Learning Methodology for Cache Memory Design Based on Dynamic Instructions

Published: 11 March 2020

Abstract

Cache memories are an essential component of modern processors and account for a large share of their power consumption. Their efficacy depends heavily on the memory demands of the software. Thus, finding the optimal cache for a particular program is not a trivial task and usually involves exhaustive simulation. In this article, we propose a machine learning-based methodology that predicts the optimal cache reconfiguration for any given application, based on its dynamic instructions. Our evaluation shows that our methodology reaches 91.1% accuracy. Moreover, an additional experiment shows that only a small portion of the dynamic instructions (10%) suffices to reach 89.71% accuracy.
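
To make the idea concrete, the following is a minimal sketch of how a dynamic instruction trace might be turned into a feature vector and mapped to a cache configuration. The abstract does not specify the features or the classifier, so everything here is an assumption for illustration: the opcode-histogram features, the 1-nearest-neighbour rule standing in for the learned model, the function names, and the cache labels such as `8KB_2way` are all hypothetical.

```python
from collections import Counter
from math import sqrt


def instruction_mix(trace, fraction=1.0):
    """Normalized opcode histogram over the first `fraction` of a dynamic trace."""
    n = max(1, int(len(trace) * fraction))
    counts = Counter(trace[:n])
    return {op: c / n for op, c in counts.items()}


def distance(a, b):
    """Euclidean distance between two sparse histograms."""
    keys = set(a) | set(b)
    return sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def predict_cache(features, labelled):
    """1-nearest-neighbour stand-in: label of the closest training profile."""
    return min(labelled, key=lambda item: distance(features, item[0]))[1]


# Hypothetical training profiles; in practice the labels would come from
# exhaustive simulation of each training program.
training = [
    ({"load": 0.5, "store": 0.3, "add": 0.2}, "8KB_2way"),
    ({"add": 0.6, "mul": 0.3, "load": 0.1}, "2KB_direct"),
]

# Hypothetical dynamic trace of executed opcodes.
trace = ["load", "add", "load", "store", "load",
         "store", "add", "load", "store", "load"]

full = instruction_mix(trace)                   # features from the whole trace
partial = instruction_mix(trace, fraction=0.1)  # features from the first 10%

print(predict_cache(full, training))
print(predict_cache(partial, training))
```

The `fraction` parameter mirrors the abstract's second experiment, in which only a portion of the dynamic instructions is used to build the features; the accuracy figures reported there apply to the authors' own model, not to this sketch.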

