Distribution fairness in Internet-scale networks

Published: 14 October 2009

Abstract

We address the issue of measuring distribution fairness in Internet-scale networks. This problem has several interesting instances in different applications, ranging from assessing the distribution of load between network nodes for load balancing purposes, to measuring node utilization for optimal resource exploitation, to guiding autonomous decisions of nodes in networks built on market-based economic principles. Although some metrics have been proposed, particularly for assessing load balancing algorithms, they fall short. First, we study the appropriateness of various known and previously proposed statistical metrics for measuring distribution fairness, and we put forward a number of characteristics that an appropriate metric must have. We propose the Gini coefficient (G) for this task and study its appropriateness in comparison with the other metrics. Our study reveals that the most appropriate metrics are G, the fairness index (FI), and the coefficient of variation (CV), in this order. Second, we develop three distributed sampling algorithms to estimate metrics online efficiently, accurately, and scalably. One of these algorithms (2-PRWS) is based on two effective optimizations of a basic algorithm, and the other two (the sequential sampling algorithm, LBS-HL, and the clustered sampling one, EBSS) are novel, developed especially to estimate G. Third, we show how these metrics, and especially G, can be readily utilized online by higher-level algorithms, which can now know when best to intervene to correct unfair distributions (in particular, load imbalances). We conclude with a comprehensive experimental evaluation that compares both the proposed estimation algorithms and the three most appropriate metrics (G, CV, and FI). Specifically, the evaluation quantifies the efficiency (in terms of the number of messages and a latency indicator), precision, and accuracy achieved by the proposed algorithms when estimating the competing fairness metrics.
The central conclusion is that the proposed metric, G, can be estimated with a small number of messages and latency, regardless of the skew of the underlying distribution.
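To make the three metrics concrete, the following is a minimal Python sketch of how G, FI, and CV could be computed over a list of per-node loads. It is an illustration under stated assumptions, not the authors' implementation: FI is assumed to be Jain's fairness index, G uses the standard sorted-rank formula, CV uses the population standard deviation, and all function names are hypothetical. The last helper estimates G from a plain uniform random sample of nodes; it stands in for the idea of sample-based estimation only and is not 2-PRWS, LBS-HL, or EBSS.

```python
import math
import random


def gini(loads):
    """Gini coefficient G via the sorted-rank formula (0 = perfectly even)."""
    xs = sorted(loads)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum_i(i * x_i) / (n * sum_i x_i) - (n + 1) / n, ranks i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def fairness_index(loads):
    """Jain's fairness index (assumed meaning of FI): 1 = perfectly even."""
    n = len(loads)
    sum_sq = sum(x * x for x in loads)
    if n == 0 or sum_sq == 0:
        return 1.0
    return sum(loads) ** 2 / (n * sum_sq)


def coefficient_of_variation(loads):
    """Population standard deviation divided by the mean (0 = perfectly even)."""
    n = len(loads)
    if n == 0:
        return 0.0
    mean = sum(loads) / n
    if mean == 0:
        return 0.0
    variance = sum((x - mean) ** 2 for x in loads) / n
    return math.sqrt(variance) / mean


def estimate_gini_from_sample(loads, sample_size, rng):
    """Hypothetical helper: G computed on a uniform random sample of nodes.

    This is plain simple random sampling without replacement, used only to
    illustrate estimating G from a subset of nodes.
    """
    sample = rng.sample(list(loads), min(sample_size, len(loads)))
    return gini(sample)


if __name__ == "__main__":
    # Zipf-like loads over 10,000 hypothetical nodes (illustrative data only).
    loads = [1000.0 / rank for rank in range(1, 10001)]
    rng = random.Random(0)
    print("G  (all nodes):", gini(loads))
    print("G  (200-node sample):", estimate_gini_from_sample(loads, 200, rng))
    print("FI (all nodes):", fairness_index(loads))
    print("CV (all nodes):", coefficient_of_variation(loads))
```

Note the different scales: G and CV are 0 for a perfectly even distribution and grow with unfairness, whereas FI is 1 when even and falls toward 1/n when a single node holds all the load.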
