Abstract
The emergence of programmable network devices and the increasing data traffic of datacenters motivate the idea of in-network computation. By offloading compute operations onto intermediate networking devices (e.g., switches, network accelerators, middleboxes), one can (1) serve network requests on the fly with low latency; (2) reduce datacenter traffic and mitigate network congestion; and (3) save energy by running servers in a low-power mode. However, enabling in-network computation inside a datacenter is challenging, because (1) existing switch technology does not provide general computing capabilities, and (2) commodity datacenter networks are complex (e.g., hierarchical fat-tree topologies, multipath communication).
In this paper, as a step towards in-network computing, we present IncBricks, an in-network caching fabric with basic computing primitives. IncBricks is a hardware-software co-designed system that supports caching in the network using a programmable network middlebox. As a key-value store accelerator, our prototype lowers request latency by over 30% and doubles throughput for 1024-byte values in a common cluster configuration. Our results demonstrate that in-network computing is effective, and that efficient datacenter network request processing is possible if the computation is carefully split across the different programmable computing elements in a datacenter, including programmable switches, network accelerators, and end hosts.
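To make the idea of an in-network key-value cache concrete, the following is a minimal sketch of an on-path cache placed between clients and a backend store. It is not the IncBricks implementation: the class names `BackendStore` and `OnPathCache`, the `GET`/`SET`/`DELETE` operation strings, and the invalidate-on-write policy are all assumptions chosen for illustration.

```python
from typing import Dict, Optional


class BackendStore:
    """Hypothetical stand-in for the end-host key-value server."""

    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}
        self.requests_served = 0  # counts requests that reached the server

    def handle(self, op: str, key: str, value: Optional[bytes] = None) -> Optional[bytes]:
        self.requests_served += 1
        if op == "GET":
            return self.data.get(key)
        if op == "SET":
            self.data[key] = value
            return value
        if op == "DELETE":
            self.data.pop(key, None)
            return None
        raise ValueError(f"unknown op: {op}")


class OnPathCache:
    """Toy model of a cache on the network path to the store (assumed design)."""

    def __init__(self, backend: BackendStore) -> None:
        self.backend = backend
        self.cache: Dict[str, bytes] = {}
        self.hits = 0

    def handle(self, op: str, key: str, value: Optional[bytes] = None) -> Optional[bytes]:
        if op == "GET":
            if key in self.cache:
                # Hit: reply from the network; the server never sees the request.
                self.hits += 1
                return self.cache[key]
            # Miss: forward to the server and cache the reply on its way back.
            reply = self.backend.handle("GET", key)
            if reply is not None:
                self.cache[key] = reply
            return reply
        # Writes: drop any cached copy first, then forward to the server.
        self.cache.pop(key, None)
        return self.backend.handle(op, key, value)


store = BackendStore()
net = OnPathCache(store)
net.handle("SET", "k", b"v1")
first = net.handle("GET", "k")   # miss, served by the backend
second = net.handle("GET", "k")  # hit, served by the on-path cache
```

In this toy model, the second `GET` is answered without reaching the server, which is the mechanism by which an on-path cache can cut both request latency and server-bound traffic. A real system would also need to handle multipath routing and cache coherence across devices, which this sketch ignores.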
IncBricks: Toward In-Network Computation with an In-Network Cache
ASPLOS '17: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems