Sub-RTT Congestion Control for Inter-Datacenter Networks
| Published in | Proceedings - International Conference on Network Protocols, pp. 1-6 |
|---|---|
| Main Authors | , , , , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 22.09.2025 |
| Subjects | |
| ISSN | 2643-3303 |
| DOI | 10.1109/ICNP65844.2025.11192335 |
| Summary: | With the explosive growth in the scale and complexity of large language models (LLMs), there is an urgent need to extend training and inference workloads from a single data center to multiple data centers. This shift introduces new challenges for network transport protocols. To address them, we propose SRCC (Sub-RTT Congestion Control), a congestion control method designed for inter-datacenter networks. SRCC introduces a flowset-based mechanism along with shared node tables, which make Datacenter Interconnect (DCI) switches aware of the path status of each flow. By leveraging information shared among different flows, SRCC can accurately adjust the sending rate at a sub-RTT timescale, thereby significantly improving network performance. Building on this approach, we design detailed mechanisms to address three challenges: (1) applying in-band network telemetry (INT) in wide-area networks; (2) acquiring INT information with low overhead; and (3) achieving precise congestion window adjustments under sub-RTT perception. We conducted large-scale simulations using NS3, and the experimental results show that our scheme reduces the average flow completion time (FCT) slowdown by 44.17% and 53.86% compared to HPCC and DCTCP, respectively. |
|---|---|
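
The summary reports results as reductions in average FCT slowdown. As a rough illustration of how such a metric is commonly computed, the sketch below normalizes each flow's completion time by an ideal completion time and then compares two averages. It is not taken from the paper: the ideal-FCT model (base RTT plus serialization time on an otherwise idle path) and all names (`ideal_fct`, `average_slowdown`, `relative_reduction`) are assumptions made for this example.

```python
from statistics import mean


def ideal_fct(size_bytes, link_bps, base_rtt_s):
    # Assumed model: one base RTT plus serialization time on an idle path.
    return base_rtt_s + (size_bytes * 8) / link_bps


def fct_slowdown(actual_fct_s, ideal_fct_s):
    # Slowdown is the measured FCT divided by the ideal FCT (1.0 is the best case).
    if ideal_fct_s <= 0:
        raise ValueError("ideal FCT must be positive")
    return actual_fct_s / ideal_fct_s


def average_slowdown(flows, link_bps, base_rtt_s):
    # flows: iterable of (size_bytes, measured_fct_seconds) pairs (hypothetical format).
    return mean(
        fct_slowdown(actual, ideal_fct(size, link_bps, base_rtt_s))
        for size, actual in flows
    )


def relative_reduction(baseline, candidate):
    # Percentage by which `candidate` lowers the metric relative to `baseline`.
    return (baseline - candidate) / baseline * 100.0
```

Under this reading, a reported 44.17% reduction relative to HPCC would mean `relative_reduction(hpcc_avg, srcc_avg)` is about 44.17, where both averages are measured over the same set of flows.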