TY - GEN
T1 - On mitigating TCP Incast in Data Center Networks
AU - Zhang, Yan
AU - Ansari, Nirwan
PY - 2011
Y1 - 2011
N2 - TCP Incast, also known as TCP throughput collapse, is a term used to describe a link capacity under-utilization phenomenon in certain many-to-one communication patterns, typically in many datacenter applications. Prior works attribute the main root cause of TCP Incast to packet drops at the congestion switch that result in TCP timeouts. Congestion control algorithms have been developed to reduce or eliminate packet drops at the congestion switch. In this paper, the performance of Quantized Congestion Notification (QCN) with respect to the TCP Incast problem during data access from clustered servers in datacenters is investigated. QCN can effectively control link rates very rapidly in a datacenter environment. However, it performs poorly when TCP Incast is observed. To explain this low link utilization, we examine the rate fluctuation of different flows within one synchronous reading request, and find that the poor TCP throughput with QCN is due to the rate unfairness among different flows. Therefore, an enhanced QCN congestion control algorithm, called fair Quantized Congestion Notification (FQCN), is proposed to improve the fairness of multiple flows sharing one bottleneck link. We evaluate the performance of FQCN against that of QCN in terms of fairness and convergence with four simultaneous and eight staggered source flows. Compared to QCN, fairness is improved greatly and the queue length at the bottleneck link converges to the equilibrium queue length very quickly. The effects of FQCN on TCP throughput collapse are also investigated. Simulation results show that FQCN significantly enhances TCP throughput performance in a TCP Incast setup.
KW - Data Center Networks (DCN)
KW - Quantized Congestion Notification (QCN)
KW - TCP Incast
KW - TCP throughput collapse
KW - congestion control
KW - fairness
UR - http://www.scopus.com/inward/record.url?scp=79960869483&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79960869483&partnerID=8YFLogxK
U2 - 10.1109/INFCOM.2011.5935217
DO - 10.1109/INFCOM.2011.5935217
M3 - Conference contribution
AN - SCOPUS:79960869483
SN - 9781424499212
T3 - Proceedings - IEEE INFOCOM
SP - 51
EP - 55
BT - 2011 Proceedings IEEE INFOCOM
T2 - IEEE INFOCOM 2011
Y2 - 10 April 2011 through 15 April 2011
ER -