research-article
Open Access

Fusing industry and academia at GitHub (experience report)

Published: 31 August 2022

Abstract

GitHub hosts hundreds of millions of code repositories written in hundreds of different programming languages. In addition to its hosting services, GitHub provides data and insights into code, such as vulnerability analysis and code navigation, with which users can improve and understand their software development process. GitHub has built Semantic, a program analysis tool capable of parsing and extracting detailed information from source code. The development of Semantic has relied extensively on the functional programming literature; this paper describes how connections to academic research inspired and informed the development of an industrial-scale program analysis toolkit.

Published in

Proceedings of the ACM on Programming Languages, Volume 6, Issue ICFP
August 2022, 959 pages
EISSN: 2475-1421
DOI: 10.1145/3554306

Copyright © 2022 Owner/Author

Publisher: Association for Computing Machinery, New York, NY, United States
