Abstract
We address the issue of measuring distribution fairness in Internet-scale networks. This problem has several interesting instances in different applications, ranging from assessing the distribution of load between network nodes for load balancing purposes, to measuring node utilization for optimal resource exploitation, and to guiding autonomous decisions of nodes in networks built on market-based economic principles. Although some metrics have been proposed, particularly for assessing load balancing algorithms, they fall short. We first study the appropriateness of various known and previously proposed statistical metrics for measuring distribution fairness, and we put forward a number of characteristics that an appropriate metric must have. We propose the Gini coefficient (G) for this task and study its appropriateness in comparison with the other metrics. Our study reveals as most appropriate the metrics of G, the fairness index (FI), and the coefficient of variation (CV), in this order. Second, we develop three distributed sampling algorithms to estimate metrics online efficiently, accurately, and scalably. One of these algorithms (2-PRWS) is based on two effective optimizations of a basic algorithm, and the other two (the sequential sampling algorithm, LBS-HL, and the clustered sampling one, EBSS) are novel, developed especially to estimate G. Third, we show how these metrics, and especially G, can be readily utilized online by higher-level algorithms, which can now know when best to intervene to correct unfair distributions (in particular, load imbalances). We conclude with a comprehensive experimental study that comparatively evaluates both the proposed estimation algorithms and the three most appropriate metrics (G, CV, and FI). Specifically, the evaluation quantifies the efficiency (in terms of the number of messages and a latency indicator), precision, and accuracy achieved by the proposed algorithms when estimating the competing fairness metrics.
The central conclusion is that the proposed metric, G, can be estimated with few messages and low latency, regardless of the skew of the underlying distribution.
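To make the three metrics named in the abstract concrete, the following is a minimal sketch that computes G, FI, and CV over a complete list of per-node loads. It uses the standard textbook definitions (the sorted-rank form of the Gini coefficient, Jain's fairness index, and the population coefficient of variation); the function names and the sample load vectors are illustrative assumptions, and the sketch does not reproduce the paper's distributed sampling estimators (2-PRWS, LBS-HL, EBSS), which work from samples rather than from the full population.

```python
import math


def gini(loads):
    """Gini coefficient via the sorted-rank formula (textbook definition).

    0 means a perfectly even distribution; values approach 1 as the
    distribution concentrates on a few nodes.
    """
    xs = sorted(loads)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # assumption: treat an empty or all-zero system as fair
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


def fairness_index(loads):
    """Jain's fairness index: 1 when all loads are equal, 1/n at worst."""
    n = len(loads)
    sum_sq = sum(x * x for x in loads)
    if n == 0 or sum_sq == 0:
        return 1.0  # assumption: treat an empty or all-zero system as fair
    return sum(loads) ** 2 / (n * sum_sq)


def coefficient_of_variation(loads):
    """Population standard deviation divided by the mean."""
    n = len(loads)
    if n == 0:
        return 0.0
    mean = sum(loads) / n
    if mean == 0:
        return 0.0
    variance = sum((x - mean) ** 2 for x in loads) / n
    return math.sqrt(variance) / mean


if __name__ == "__main__":
    balanced = [5, 5, 5, 5]   # hypothetical per-node loads
    skewed = [0, 0, 0, 10]    # all load on one node
    for name, loads in (("balanced", balanced), ("skewed", skewed)):
        print(name, gini(loads), fairness_index(loads),
              coefficient_of_variation(loads))
```

Note that the metrics point in different directions: G and CV grow with unfairness, while FI shrinks, and only G and FI are bounded, which makes them easier to threshold when deciding whether to trigger rebalancing.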