research-article

How scale affects structure in Java programs

Published: 23 October 2015

Abstract

Many internal software metrics and external quality attributes of Java programs correlate strongly with program size. This knowledge has been used pervasively in quantitative studies of software through practices such as normalization on size metrics. This paper reports size-related super- and sublinear effects that were not previously known. Findings obtained on a very large collection of Java programs (30,911 projects hosted at Google Code as of Summer 2011) unveil how certain characteristics of programs vary disproportionately with program size, sometimes even non-monotonically. Many of the specific parameters of the nonlinear relations are reported. This result gives further insight into the differences between "programming in the small" and "programming in the large." The reported findings carry important consequences for OO software metrics, and for software research in general: metrics that have been known to correlate with size can now be properly normalized so that all the information that is left in them is size-independent.
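To make the idea of size normalization under a nonlinear relation concrete, the following is a minimal sketch and not the paper's method. It assumes, purely for illustration, that a metric follows a power law in program size, metric ≈ a · size^b, where b > 1 would be superlinear and b < 1 sublinear. The function names `fit_power_law` and `normalize` and the sample data are hypothetical.

```python
import math


def fit_power_law(sizes, values):
    """Fit value = a * size**b by least squares on log-log data; return (a, b).

    Assumption for illustration only: the metric follows a power law in size.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # exponent: slope in log-log space
    a = math.exp(mean_y - b * mean_x)    # scale: exp of the intercept
    return a, b


def normalize(size, value, a, b):
    """Ratio of the observed metric to the value predicted from size alone."""
    return value / (a * size ** b)


# Hypothetical data: a metric that grows superlinearly (exponent 1.2) with size.
sizes = [10, 100, 1000, 10000]
values = [2.0 * s ** 1.2 for s in sizes]

a, b = fit_power_law(sizes, values)
ratio = normalize(500, 2.0 * 500 ** 1.2, a, b)
```

Dividing by the fitted prediction, rather than by raw size, is what removes the size dependence when the exponent differs from 1. Plain division by size would leave a residual trend of size^(b-1) in the normalized metric.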




Published in

ACM SIGPLAN Notices, Volume 50, Issue 10 (OOPSLA '15)
October 2015
953 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2858965
Editor: Andy Gill

OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
October 2015
953 pages
ISBN: 9781450336895
DOI: 10.1145/2814270

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
