Improving Fault Tolerance and Performance of Data Center Networks
Author | : Vincent Liu |
Total Pages | : 96 |
Release | : 2016 |
OCLC | : 981509536 |
Data center networks are a key component of the explosive growth of cloud computing, enabling the use of tens to hundreds of thousands of co-located servers for large-scale computing and services. As applications and data sets continue to grow rapidly, the challenge for data center networks is to keep pace by providing enough bandwidth while also lowering costs, increasing flexibility, and maintaining reliability. My thesis is that a key part of the answer is the network's wiring topology: topology has foundational cross-layer effects, and a small amount of intentional asymmetry in the topology can help data center networks meet that challenge. I present two complementary innovations that demonstrate this. The first, F10, is a co-design of the network topology and failover protocols that provides efficient, near-instantaneous, fine-grained, and localized recovery and rebalancing for common-case network failures. My results show that, following network link and switch failures, F10 has 1/7th the packet loss of current schemes. The second innovation, Subways, proposes and evaluates a new method of adding network capacity by connecting multiple network links per server in an overlapping topology. Using a simulation-based methodology, my work shows that Subways offers substantial performance benefits for popular application workloads: up to a 3.1x speedup in MapReduce and a 2.5x throughput improvement in memcache for a fixed average request latency, relative to an equivalent-bandwidth network that differs only in its wiring.