Abstract
A dependence cluster is a set of program statements, all of which are mutually interdependent. This article reports a large-scale empirical study of dependence clusters in C program source code. The study reveals that large dependence clusters are surprisingly commonplace. Most of the 45 programs studied have dependence clusters that consume more than 10% of the whole program. Some even have clusters consuming 80% or more. The widespread existence of clusters has implications for source code analyses such as program comprehension, software maintenance, software testing, reverse engineering, reuse, and parallelization.
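To make the definition concrete, here is a minimal sketch that treats a program as a directed dependence graph and reports the maximal sets of statements that can all reach one another. It assumes that "mutually interdependent" means mutual reachability under transitive dependence (that is, a strongly connected component); the abstract does not say how the study itself detects clusters, so this is an illustration, not the authors' method. The names `dependence_clusters` and `largest_cluster_fraction`, and the toy graph, are hypothetical.

```python
from collections import defaultdict


def _reachable(start, graph):
    """Return every node reachable from `start`, including `start` itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def dependence_clusters(deps):
    """Return the sets of two or more mutually dependent statements.

    `deps` maps a statement id (an int here, an assumption) to the set of
    statements it depends on. Two statements are mutually dependent when
    each is reachable from the other.
    """
    # Collect every statement that appears as a source or a target.
    nodes = set(deps)
    for targets in deps.values():
        nodes.update(targets)

    # Build the reversed graph so we can ask "who depends on me?".
    reverse = defaultdict(set)
    for source, targets in deps.items():
        for target in targets:
            reverse[target].add(source)

    clusters = []
    assigned = set()
    for node in sorted(nodes):
        if node in assigned:
            continue
        # Forward reach intersected with backward reach is the set of
        # statements mutually dependent with `node`.
        component = _reachable(node, deps) & _reachable(node, reverse)
        assigned |= component
        if len(component) > 1:
            clusters.append(component)
    return clusters


def largest_cluster_fraction(deps):
    """Fraction of all statements that fall in the largest cluster."""
    nodes = set(deps)
    for targets in deps.values():
        nodes.update(targets)
    clusters = dependence_clusters(deps)
    if not nodes or not clusters:
        return 0.0
    return max(len(c) for c in clusters) / len(nodes)


# Toy program: statements 1, 2 and 3 form a dependence cycle,
# statement 4 depends on 1, and statement 5 depends on nothing.
example = {1: {2}, 2: {3}, 3: {1}, 4: {1}, 5: set()}
print(dependence_clusters(example))
print(largest_cluster_fraction(example))
```

In the toy graph, statements 1 to 3 form one cluster covering three of the five statements, which is the kind of "percentage of the whole program" figure the abstract reports. The reachability-per-node approach is quadratic in the worst case; a linear-time strongly connected components algorithm would be the usual choice for real programs.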
Index Terms
Dependence clusters in source code