A Preliminary Analysis of Software Metrics in Decentralised Applications

This study examines software metrics in decentralized applications (dApps) to analyze their structural and behavioral characteristics as they grow in complexity. Sixty dApps were categorized into Small (3 to 29 contracts), Medium (30 to 46 contracts), and Large (47 to 206 contracts) based on their contract count. Initial analysis showed a non-normal data distribution, leading to the use of Spearman's correlation method. Findings revealed that Medium dApps have strong correlations between metrics like 'Average Local Variables' and 'Maximum Local Variables', while Large dApps show higher correlations between 'Number of Functions' and 'State Variable Count', indicating more complex contract structures. The higher Coupling Between Objects (CBO) in Large dApps suggests increased interactions with other contracts or libraries, potentially elevating security risks. These insights are valuable for developers and stakeholders in the blockchain and IoT sectors, helping them understand how dApps evolve with increasing complexity and what this implies for the relationships among software metrics.


INTRODUCTION
Blockchain technology introduces new possibilities for improving trust and privacy in networked sensor systems across different areas. Decentralised applications (dApps) play a key role in this, providing a solid platform for developing solutions that effectively manage sensor data while ensuring data protection and incentivized sharing. At the core of dApps are smart contracts: self-executing contracts with the terms directly written into code, enabling automated transactions on the blockchain. The growing number of dApps, especially within the Internet of Things (IoT) and smart city domains, emphasizes the importance of understanding their software metrics to improve development, ensure security, and enhance performance.
This study provides an initial analysis of the software metrics of dApps, aiming to explore how these metrics vary with the size and complexity of the dApps. By categorizing 60 dApps (selected from the DAppScan repository [12]) into three distinct groups based on their contract count (Small, Medium, and Large), we aim to analyze the distribution and correlation of software metrics at both contract and function levels. Our analysis examines whether and how the distributions of these metrics, and the correlations among them, change with the dApp size. This preliminary analysis contributes to the understanding of dApps' structure and behavior, setting the stage for more in-depth future studies.

RELATED WORK
Several studies have previously explored software metrics within traditional applications to better understand their structure and behavior [4,5]. Focusing on object-oriented (OO) software, one of the pioneering efforts to address this concern is credited to Chidamber and Kemerer (CK), who proposed the widely recognized CK metrics suite for OO software systems [1]. Numerous empirical studies have since underscored significant correlations between certain CK metrics and bug-proneness [3,7,10]. Metrics defined on software graphs have also been explored, with findings correlating them to software quality [13]. Turning to the blockchain domain, Ibba et al. [8] recently developed a tool that applies Complex Network Analysis to dApps [2] in order to help identify vulnerabilities and optimise code.
Ortu et al. [9] compared Blockchain-Oriented Software (BOS) and traditional software using 10 metrics, finding significant differences in the distribution of Average Cyclomatic and Ratio Comment To Code metrics, and the Number of Statements metric.
Tonelli et al. [11] analyzed 85,000 Smart Contracts on the Ethereum blockchain to understand how their constraints are reflected in specific software metrics compared to traditional software. Findings showed that while Smart Contracts exhibit more restricted metric ranges, their lines of code follow a log-normal distribution akin to traditional software, hinting at some shared characteristics despite the unique constraints of blockchain environments.
Our study extends the investigation to decentralized applications on the Ethereum blockchain, categorizing them based on their contract count to explore the distribution and correlation of software metrics across different complexity levels. This analysis not only sheds light on how dApp size and complexity interact with various software metrics but also sets a foundation for future in-depth studies aimed at understanding dApps' evolution and potential optimization strategies.

METHODOLOGY
The metrics for this analysis were collected through a combination of our in-house analysis tools and Slither [6], a well-known static analysis framework for smart contracts. In analyzing the software metrics of decentralized applications (dApps), it is fundamental to select an appropriate method that accurately reflects the underlying relationships among the metrics. An initial assessment of the data revealed a non-normal distribution, which is a common occurrence in real-world data, especially in a relatively new and rapidly evolving domain like blockchain. Traditional correlation methods such as Pearson's correlation assume a linear relationship and a normal distribution of data, which could lead to misleading conclusions in our case.
Spearman's correlation, on the other hand, does not make any assumptions about the distribution of the data and measures the strength and direction of the monotonic relationship between variables. It evaluates the rank-order relationship between two variables, making it a more robust choice for this analysis. This method is well-suited for our dataset, allowing for a more accurate exploration of correlations between software metrics across different sizes and complexities of dApps.
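To make the rank-based idea concrete, the following minimal sketch computes Spearman's coefficient as Pearson's coefficient applied to ranks, with tied values receiving the average of their positions. It is an illustration only, not the tooling used in the study; the function names and the sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman's coefficient: Pearson's coefficient computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical metric values: monotonic but strongly non-linear.
functions_per_contract = [1, 2, 3, 4, 5]
state_variables = [1, 4, 9, 16, 1000]

rho = spearman(functions_per_contract, state_variables)
r = pearson(functions_per_contract, state_variables)
```

In this example the relationship is perfectly monotonic, so the rank-based coefficient reaches its maximum, while the single extreme value pulls Pearson's coefficient well below it.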
We proceeded to categorize the dApps into distinct groups based on their structural complexity, as represented by the number of contracts they contain. This categorization is fundamental to our analysis as it provides a systematic approach to understanding how the structural and behavioral characteristics of dApps vary with size and complexity, laying a solid foundation for the subsequent analysis.
The categorization of dApps into Small, Medium, and Large groups based on the number of contracts they contain is a practical approach derived from the characteristics of our dataset. The specific ranges (3 to 29, 30 to 46, 47 to 206) for these categories were selected to create a balanced division that allows for meaningful comparison and analysis across groups. However, this division is arbitrary and represents a significant limitation of the study. It is dictated by the distribution and the nature of the dApps available in our dataset rather than a universally accepted standard. This categorization helped in understanding the variation in software metrics among dApps of different sizes and complexities in the context of our dataset, though the defined ranges may not hold or be relevant for a different dataset or in a broader context. Future studies may benefit from a more standardized or universally accepted method of categorization, or from exploring alternative methods that might provide a more nuanced understanding of dApp complexity and size.
We categorized these applications based on their structural complexity, specifically focusing on the number of contracts they contain. This approach allows us to capture the nuanced differences in dApps, which can range from simple prototypes to highly complex ecosystems.
• Small dApps: This category includes dApps with a number of contracts ranging from 3 to 29. They are often simpler, either being in the prototype stage or targeting very specific use-cases. For example, the dApp "Async" has just one file, three contracts, and eight functions.
• Medium dApps: dApps falling into this category have a number of contracts ranging from 30 to 46. These dApps are more complex than those categorized as "Small" but not as intricate as the "Large" dApps. They often address broader use-cases and incorporate more complex functionalities. An example would be "AliumSwap" with 24 files, 30 contracts, and 240 functions.
• Large dApps: These are highly complex dApps that contain a number of contracts ranging from 47 to 206. They often serve diverse functions and may be part of a larger blockchain ecosystem. For instance, "Loopring" has 200 files, 206 contracts, and 1591 functions.
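The three categories above amount to a simple threshold rule on the contract count. The sketch below expresses that rule; the function name is hypothetical, and the cut-offs (29 and 46) are the dataset-specific boundaries stated in the text rather than a general standard. The lower and upper limits of 3 and 206 are the smallest and largest counts in the dataset, so they are not enforced here.

```python
SMALL_MAX = 29   # Small: 3 to 29 contracts
MEDIUM_MAX = 46  # Medium: 30 to 46 contracts; Large: 47 to 206


def categorize(contract_count: int) -> str:
    """Map a dApp's contract count to the size category used in this study."""
    if contract_count <= SMALL_MAX:
        return "Small"
    if contract_count <= MEDIUM_MAX:
        return "Medium"
    return "Large"


# Contract counts of the three example dApps named in the text.
examples = {"Async": 3, "AliumSwap": 30, "Loopring": 206}
categories = {name: categorize(count) for name, count in examples.items()}
```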
The primary metric driving this categorization is the Number of Contracts. It offers a quantitative measure of a dApp's complexity and potentially its functional diversity. By organizing the dApps according to these categories, this study aims for a systematic and structured approach to understanding how size and complexity relate to other structural and security metrics. This approach helps identify specific trends or patterns that may be unique to dApps of certain sizes, thereby adding depth and granularity to the study's findings. Figure 1 presents box plots of the number of contracts and functions for each category (Small, Medium, Large). The first plot shows the distribution of the number of contracts. The second plot shows the distribution of the number of functions. In both plots, the central line in each box indicates the median of the data, while the top and bottom edges of the box show the interquartile range. The "whiskers" extend to 1.5 times the interquartile range, and any data points beyond that are considered outliers.
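The box plot elements described above can be computed directly. The following sketch derives the median, the quartiles, the 1.5 × IQR fences, and the resulting outliers. The quartile convention (`method="inclusive"`) is an assumption, since plotting libraries differ in how they interpolate quartiles, and the sample values are hypothetical rather than taken from the dataset.

```python
import statistics


def box_plot_summary(values):
    """Return the quantities a box plot draws for one group of values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Points outside the fences are the ones drawn individually as outliers.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical values with one extreme observation.
sample_counts = [1, 2, 3, 4, 5, 6, 7, 8, 100]
summary = box_plot_summary(sample_counts)
```

Plotting libraries usually draw each whisker to the most extreme data point that still lies inside the fence, so the drawn whisker can be shorter than the fence itself.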
Before studying contract and function level metrics, it is fundamental to understand the distribution type for each metric. To assess the distribution type, histograms were plotted for each metric, both at the contract and function levels. These histograms were generated separately for Small (Fig. 2), Medium (Fig. 3), and Large dApps (Fig. 4) to observe any category-specific trends. Kernel density estimates were also plotted to provide a smooth, continuous representation of the data distribution. The histograms revealed that most metrics, irrespective of dApp category, exhibited a right-skewed distribution. Based on this observation, non-parametric correlation measures like Spearman's were selected for the subsequent correlation analysis.
To prepare for correlation analysis, the normality of metric distributions was assessed using both Shapiro-Wilk and Kolmogorov-Smirnov tests, confirming the non-normal nature of the data. This led to the selection of non-parametric correlation methods, namely Kendall's and Spearman's, to analyze the relationships between different metrics. The findings from these correlation analyses offered insights into the interplay between various contract and function attributes, providing a nuanced understanding of dApp characteristics across different sizes and complexities.

RESULTS
The following contract-related metrics were considered for this analysis:
• Inheritance Depth: Measures how many layers of inheritance a contract has. A higher depth could indicate a more complex contract structure. Most Small and Medium dApps tend to have a lower inheritance depth compared to Large dApps, which often employ multiple layers of inheritance for added functionality and modularity.
• CBO (Coupling Between Objects): Indicates the number of other contracts or libraries that a contract interacts with. Higher coupling may lead to increased complexity and potential risks. The Coupling Between Objects (Contracts in our ...
• Max Local Variables: The maximum number of local variables used in any single function within a contract. The metrics show occasional spikes in the number of maximum local variables in functions, especially in Large dApps, suggesting some functions may be doing more complex computations.
• No. of Functions: The total number of functions in a contract. This metric gives an idea of the contract's functionality and complexity. Large dApps clearly have a higher number of functions, providing more services or features. Small dApps, in contrast, are simpler and offer fewer functionalities.

Function level metrics
To further investigate the complexities of decentralized applications (dApps), the study also analyzes function-level metrics across Small, Medium, and Large dApps. These metrics offer a granular look into how individual functions within smart contracts are designed and implemented. The function-level metrics analyzed include:
• No. of Parameters: Indicates the number of parameters a function takes. A higher number could make the function more complex and harder to use. Functions in Small dApps tend to have fewer parameters, implying simpler interfaces. In contrast, Medium and Large dApps often have functions with more parameters, allowing for more complex interactions.

Evaluating the Metrics Distribution
Before proceeding with the correlation analysis of contract and function-level metrics across different categories of decentralized applications, it is crucial to understand the distribution type for each metric. This initial step is important as the type of distribution can significantly influence the choice of correlation measure used.
For instance, Pearson's correlation is most effective when the data is normally distributed, but may produce misleading results if the data is skewed or contains outliers. On the other hand, non-parametric measures like Spearman's and Kendall's correlation are more robust against such irregularities. The Shapiro-Wilk test, a widely accepted statistical test for normality, was applied to each of the contract and function-level metrics, separately for each dApp category: Small, Medium, and Large. The test outputs a p-value, where a value less than 0.05 typically suggests that the data does not follow a normal distribution. The p-values for all metrics across all categories were well below 0.05, with function-level metrics even yielding a reported p-value of zero. These findings indicate that the metric distributions are not normal, thereby making a compelling case for the use of non-parametric correlation measures like Spearman's or Kendall's for the analysis.
As an additional layer of robustness to the normality assessment, the Kolmogorov-Smirnov (KS) test was executed on each of the contract and function-level metrics, categorized by the size of the dApps: Small, Medium, and Large. The KS test compares the empirical distribution function of the sample data with the cumulative distribution function of a specified theoretical distribution, in this case the normal distribution. The p-values obtained for all metrics across each category were essentially zero, thereby rejecting the null hypothesis of normal distribution conclusively. These findings corroborate the results from the earlier Shapiro-Wilk test, reinforcing the decision to employ non-parametric correlation measures for the subsequent correlation analysis.
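The comparison performed by the KS test can be illustrated with a short sketch that computes the KS statistic D, the largest gap between the empirical distribution function and a normal cumulative distribution function. The text does not state how the parameters of the reference normal distribution were chosen, so this sketch assumes they are estimated from the sample itself. The function name and the synthetic data are hypothetical. Converting D into a p-value is left to a statistics library, and the Shapiro-Wilk test is not shown.

```python
import math
import statistics


def ks_statistic_vs_normal(values):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    xs = sorted(values)
    n = len(xs)
    # Assumption: the normal's mean and standard deviation come from the sample.
    fitted = statistics.NormalDist(statistics.fmean(xs), statistics.stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Synthetic data: evenly spaced normal quantiles, and a right-skewed
# transformation of the same values.
standard = statistics.NormalDist()
n = 50
symmetric = [standard.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [math.exp(1.5 * z) for z in symmetric]

d_symmetric = ks_statistic_vs_normal(symmetric)
d_skewed = ks_statistic_vs_normal(skewed)
```

The right-skewed sample yields a much larger D than the symmetric one, which is the kind of departure from normality that the tests above detect.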

Spearman's Correlations
In examining the Spearman's correlation matrices for contract metrics across Small (Fig. 5), Medium (Fig. 6), and Large dApps (Fig. 7), certain patterns emerge. For Small dApps, there are notable strong correlations between 'Inheritance Depth', 'State Variable Count', and 'Number of Functions', with values greater than 0.7. In contrast, Medium dApps display slightly more diversified strong correlations, with that between 'Inheritance Depth' and 'State Variable Count' approaching 0.8 and 'Number of Functions' also being significantly related to these metrics. However, Large dApps exhibit a dilution in the strength of these correlations, with the strongest link being between 'State Variable Count' and 'Inheritance Depth' at around 0.78. It is interesting to note that as the dApps grow in complexity (from Small to Large), the correlations between 'Avg Local Variables' and other metrics become more dispersed, suggesting that larger dApps might have a broader variance in their contract structures. This comparative analysis provides an insight into how contract interactions and structures evolve with the size and complexity of dApps.
In the evaluation of Spearman's correlation among function metrics within decentralized applications (dApps) of varying sizes, distinct patterns emerge. For Small dApps, there is a strong positive correlation between the number of parameters and the local variable count (0.83), suggesting that as functions increase their parameter count, they also tend to have more local variables. Medium dApps show a similar trend, with a slightly higher coefficient (0.85). In Large dApps, this correlation remains strong but decreases to 0.81. Interestingly, cyclomatic complexity exhibits a negative correlation with function calls for all dApp sizes: -0.38 for Small, -0.39 for Medium, and -0.42 for Large. This result should be investigated further. A negative correlation between cyclomatic complexity and function calls might initially seem counterintuitive because one might expect more complex functions to have more function calls. However, this negative correlation could be related to decomposition, e.g., developers may be breaking down complex logic into smaller, more manageable functions. This would mean fewer function calls within each complex function, as the logic is spread out. Higher cyclomatic complexity often involves more branching (if, else, switch, etc.). It is possible that in more complex functions, the logic is handled through conditional structures rather than function calls. It may also reflect a particular design philosophy or best practice that advises against making multiple function calls within complex functions to make the code easier to understand and maintain.
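As an analogy for how the two function-level metrics can move in opposite directions, the sketch below counts an approximate cyclomatic complexity (one plus the number of decision points) and the number of call expressions for Python functions, using Python's `ast` module. This is not the Solidity measurement performed with Slither in the study. The set of node types counted as decision points, the function name, and the two sample functions are assumptions made for illustration.

```python
import ast

# Node types treated as decision points in this simplified count.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def function_metrics(source: str) -> dict:
    """Map each function name to (approx. cyclomatic complexity, call count)."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            inner = list(ast.walk(node))
            branches = sum(isinstance(n, BRANCH_NODES) for n in inner)
            # Each extra operand of `and` / `or` adds one more path.
            bool_paths = sum(
                len(n.values) - 1 for n in inner if isinstance(n, ast.BoolOp)
            )
            calls = sum(isinstance(n, ast.Call) for n in inner)
            metrics[node.name] = (1 + branches + bool_paths, calls)
    return metrics


# Branch-heavy version: all logic is handled with conditionals, no calls.
MONOLITHIC = '''
def fee(amount, tier, is_member):
    if tier == 1:
        rate = 5
    elif tier == 2:
        rate = 3
    else:
        rate = 1
    if is_member:
        rate = rate - 1
    return amount * rate
'''

# Delegating version: the same logic is pushed into (hypothetical) helpers.
DECOMPOSED = '''
def fee(amount, tier, is_member):
    return amount * discount(base_rate(tier), is_member)
'''

monolithic_metrics = function_metrics(MONOLITHIC)["fee"]
decomposed_metrics = function_metrics(DECOMPOSED)["fee"]
```

The branch-heavy function has the higher complexity and no calls, while the delegating function has minimal complexity and more calls. A codebase mixing both styles would show the kind of negative rank correlation reported above.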
It is also noteworthy that the correlation between the number of parameters and function calls is fairly consistent across dApp sizes, ranging from 0.22 to 0.24. Overall, these findings provide insights into the evolution of function design patterns as dApps scale.
The findings from our analysis carry implications for the development, security, and performance optimization of decentralized applications (dApps).
• Development Complexity: Our analysis reveals a clear correlation between the size of dApps and certain software metrics, which reflects an increase in development complexity as dApps scale. Understanding these correlations can help developers anticipate the challenges they may face as their dApps grow, enabling better planning and resource allocation.
• Security Considerations: The higher Coupling Between Objects (CBO) in Large dApps suggests more interactions with other contracts or libraries, which could potentially introduce security risks. Moreover, the increased inheritance depth in larger dApps might also lead to a more complex contract structure, requiring more rigorous security auditing and testing to ensure robustness against potential threats.
• Performance Optimization: The analysis of function-level metrics provides insights into how individual functions within smart contracts are designed and implemented across different dApp sizes. The correlation between the number of parameters and the local variable count, for instance, could have implications for the performance and gas costs in Ethereum-based dApps. Understanding these patterns can help developers optimize their code to ensure efficient resource utilization, especially in larger dApps with more complex structures.
• Modular Design: The higher frequency of function calls in Large dApps indicates a higher degree of modularity, which is essential for managing complexity in software development. This modularity might aid in isolating issues, enhancing maintainability, and promoting reusable code.
The analysis undertaken in this study aligns with an exploratory approach, aimed at uncovering initial insights and trends concerning the software metrics of decentralized applications (dApps) across varying sizes and complexities.

THREATS TO VALIDITY
The validity of this study is subject to several threats that need acknowledgment. The sample of 60 dApps selected from the DAppScan repository may not be representative of the broader spectrum of dApps, potentially introducing a selection bias if the repository lacks diversity or has a specific focus. The categorization of dApps into Small, Medium, and Large based on the number of contracts is somewhat arbitrary and derived from the dataset on hand. This categorization may not capture the true essence of complexity and size, thereby potentially oversimplifying the heterogeneity of dApps. The choice of metrics for analysis, although based on software engineering principles, may not encompass all relevant aspects of dApp complexity and functionality, and there might be other metrics not considered that could provide additional or alternative insights. The data exhibited a non-normal distribution, which led to the use of non-parametric correlation measures like Spearman's correlation. While these measures are robust against certain irregularities, they may not capture all relationships or nuances present in the data. The findings may have limited generalisability beyond the specific set of dApps analyzed, and the rapid evolution of blockchain technology and dApp development practices may impact the relevance and applicability of the findings over time. As an exploratory study, the analysis aims to present initial insights and trends rather than confirm predefined hypotheses. The findings should be interpreted as preliminary, necessitating further confirmatory analyses to establish stronger causal or correlational relationships. Lastly, the accuracy and precision of the tools and methods used to collect and analyze the metrics can also pose a threat to validity. Any inconsistencies or errors in measurement could potentially affect the reliability and reproducibility of the findings. Through acknowledging these threats to validity, we aim to provide a transparent account of the limitations inherent in our study and lay the groundwork for further research that can build upon, validate, or refine the preliminary results presented.

CONCLUSIONS
This preliminary study analysed software metrics of dApps on the Ethereum blockchain, categorizing them into Small, Medium, and Large based on contract count. The analysis showed that as dApps scale, certain metrics such as Inheritance Depth and Coupling Between Objects (CBO) increase, indicating more complex contract structures and interactions with other contracts or libraries. Larger dApps not only have more contracts but also more complex contracts, which could potentially introduce security risks.
On the function level, a consistent relationship was found between the number of parameters and local variable count across all dApp sizes. Additionally, a negative correlation between cyclomatic complexity and function calls was observed, suggesting a possible trend of decomposing complex logic into smaller, more manageable functions in larger dApps.
These findings are important for developers and stakeholders in the blockchain and IoT sectors as they provide a clearer understanding of how dApps evolve with increasing complexity. This information can be valuable for better planning, security auditing, and performance optimization in dApp development. The results also provide a basis for more in-depth future studies on software metrics in decentralized applications.

Figure 1: Box plots of the number of contracts and functions for each category (Small, Medium, Large).

Figure 4: Contract Metrics for Large dApps

Table 1: Summary of Small dApps

Table 2: Summary of Medium dApps

Table 3: Summary of Large dApps

... indicator of how much temporary storage a contract uses. Across all categories, the average number of local variables tends to be moderate, indicating a balance in the use of temporary storage for function computations.