
The hardness of data packing

Published: 11 January 2016

Abstract

A program can benefit from improved cache block utilization when contemporaneously accessed data elements are placed in the same memory block. This can reduce the program's memory block working set and thereby reduce the capacity miss rate. We formally define the problem of data packing for an arbitrary number of blocks in the cache and an arbitrary packing factor (the number of data objects fitting in a cache block), and study how well the optimal solution can be approximated for two dual problems. On the one hand, we show that the cache hit maximization problem is approximable within a constant factor for every fixed number of blocks in the cache. On the other hand, we show that unless P=NP, the cache miss minimization problem cannot be efficiently approximated.
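
As a rough illustration of the effect the abstract describes, the following sketch counts cache misses for two packings of the same access trace. It assumes a fully associative cache with LRU replacement, which the abstract does not specify. The function name `count_misses`, the trace, and both packings are hypothetical and chosen only for illustration; this is not the paper's formal model.

```python
from collections import OrderedDict


def count_misses(trace, packing, cache_blocks):
    """Count misses for an access trace under a given packing.

    Assumption (not from the abstract): the cache is fully associative
    with LRU replacement and holds `cache_blocks` memory blocks.
    `packing` maps each data object to the memory block it is placed in.
    """
    cache = OrderedDict()  # block ids, ordered from least to most recently used
    misses = 0
    for obj in trace:
        block = packing[obj]
        if block in cache:
            cache.move_to_end(block)  # hit: mark block as most recently used
        else:
            misses += 1
            if len(cache) >= cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = None
    return misses


# Hypothetical example: four objects, packing factor 2, cache of one block.
trace = ["a", "b", "a", "b", "c", "d", "c", "d"]
together = {"a": 0, "b": 0, "c": 1, "d": 1}  # co-accessed objects share a block
apart = {"a": 0, "c": 0, "b": 1, "d": 1}     # co-accessed objects are split

misses_together = count_misses(trace, together, cache_blocks=1)
misses_apart = count_misses(trace, apart, cache_blocks=1)
print(misses_together, misses_apart)  # prints "2 8"
```

With the first packing the trace touches each block once and then hits, so it incurs 2 misses and 6 hits. With the second packing every access evicts the block needed next, giving 8 misses and 0 hits. The same pair of numbers shows the two dual objectives from the abstract: maximizing hits and minimizing misses.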



Published in

ACM SIGPLAN Notices, Volume 51, Issue 1
POPL '16
January 2016
815 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2914770
Editor: Andy Gill

POPL '16: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
January 2016
815 pages
ISBN: 9781450335492
DOI: 10.1145/2837614

Copyright © 2016 ACM

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers: article
