Abstract
Many applications require specialized data structures not found in the standard libraries, but implementing new data structures by hand is tedious and error-prone. This paper presents a novel approach for synthesizing efficient implementations of complex collection data structures from high-level specifications that describe the desired retrieval operations. Our approach handles a wider range of data structures than previous work, including structures that maintain an order among their elements or have complex retrieval methods. We have prototyped our approach in a data structure synthesizer called Cozy. Four large, real-world case studies compare structures generated by Cozy against handwritten implementations in terms of correctness and performance. Structures synthesized by Cozy match the performance of handwritten data structures while avoiding human error.
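To make the idea of a retrieval specification concrete, here is a minimal Python sketch. It is not Cozy's input language or its generated output. The names `Entry`, `spec_lookup`, and `IndexedCollection` are hypothetical and chosen only for illustration. The sketch contrasts a declarative description of what a query returns with one hand-picked representation (a hash map of sorted buckets) that answers the same query faster. A synthesizer in the spirit of the abstract would search for such a representation automatically.

```python
import bisect
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Entry:
    # Hypothetical record type; ordering compares weight first, then key.
    weight: int
    key: str


def spec_lookup(entries, key, min_weight):
    """Reference semantics: states WHAT is retrieved, via a linear scan.

    Returns all entries with the given key and weight >= min_weight,
    ordered by weight.
    """
    return sorted(e for e in entries if e.key == key and e.weight >= min_weight)


class IndexedCollection:
    """One possible representation: hash map from key to a sorted bucket."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def add(self, entry):
        # Keep each bucket sorted so range queries can use binary search.
        bisect.insort(self._buckets[entry.key], entry)

    def lookup(self, key, min_weight):
        bucket = self._buckets.get(key, [])
        # The empty string sorts before every other string, so this probe
        # finds the first entry whose weight is >= min_weight.
        start = bisect.bisect_left(bucket, Entry(min_weight, ""))
        return bucket[start:]
```

The two functions are meant to agree on every input. The indexed version trades extra work on insertion for a lookup that avoids scanning unrelated keys.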
Fast synthesis of fast collections
PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation