
Introspective 3D chips

Published: 20 October 2006

Abstract

While the number of transistors on a chip increases exponentially over time, the productivity that can be realized from these systems has not kept pace. To deal with the complexity of modern systems, software developers are increasingly dependent on specialized development tools such as security profilers, memory leak identifiers, data flight recorders, and dynamic type analysis. Many of these tools require full-system data which covers multiple interacting threads, processes, and processors. Reducing the performance penalty and complexity of these software tools is critical to those developing next generation applications, and many researchers have proposed adding specialized hardware to assist in profiling and introspection. Unfortunately, while this additional hardware would be incredibly beneficial to developers, the cost of this hardware must be paid on every single die that is manufactured.

In this paper, we argue that a new way to attack this problem is with the addition of specialized analysis hardware built on separate active layers stacked vertically on the processor die using 3D IC technology. This provides a modular "snap-on" functionality that could be included with developer systems, and omitted from consumer systems to keep the cost impact to a minimum. In this paper we describe the advantage of using inter-die vias for introspection and we quantify the impact they can have in terms of the area, power, temperature, and routability of the resulting systems. We show that hardware stubs could be inserted into commodity processors at design time that would allow analysis layers to be bonded to development chips, and that these stubs would increase area and power by no more than 0.021 mm² and 0.9% respectively.



Published in

ACM SIGOPS Operating Systems Review, Volume 40, Issue 5
Proceedings of the 2006 ASPLOS Conference
December 2006, 425 pages
ISSN: 0163-5980
DOI: 10.1145/1168917

ASPLOS XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
October 2006, 440 pages
ISBN: 1595934510
DOI: 10.1145/1168857

Copyright © 2006 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
