SONiC AI Summit 2025 Recap

August 17, 2025

On August 9, 2025, SONiC AI Summit 2025 was held in Beijing, China. The conference brought together leading global public cloud service providers such as Alibaba Cloud, Microsoft, Tencent, and ByteDance, along with chip and ODM manufacturers such as NVIDIA, Broadcom, Cisco, and Accton. The participants engaged in in-depth discussions on network architecture innovation in the AI era, sharing cutting-edge technical practices and industry trends.

Key Technical Highlights

  • AI Cloud Infrastructure: Large-scale deployment practices and network optimization strategies.
  • Chip-Network Synergy: Deep integration of AI-specific chip capabilities with network architecture.
  • Protocol Innovation: Protocols like SRv6, SyncMesh, and SUE are driving network intelligence.
  • Ecosystem Building: Open architecture and industry collaboration, with Alibaba Cloud introducing its UPN architecture, to accelerate the universal application of AI computing.

Alibaba Cloud: HPN Architecture Evolution, Defining Network Standards for the AI Large Model Era

Dennis Cai (Vice President of Alibaba Cloud, Head of Infrastructure Network R&D) delivered a keynote speech titled “Network Architecture Evolution in the Era of AI Large Models.” He provided an in-depth analysis of the design philosophy behind Alibaba Cloud’s HPN (High Performance Networking) architecture, technical breakthroughs in its end-network synergy system, and the future evolution and ecosystem development of the HPN architecture.

During the speech, he unveiled the next-generation UPN (Ultra Performance Network) architecture. Based on cutting-edge single-tier ETH+ and optical interconnect technology, this architecture adopts a fully decoupled design for the first time. It eliminates the risk of global paralysis caused by a single switch node failure, which is a common vulnerability in traditional “mini-computer-like” architectures. UPN not only offers higher reliability and scalability, but can also be deployed directly on general-purpose hardware, significantly reducing construction complexity and overall cost. It represents a new paradigm for data center Scale Up networks.

Tencent Cloud: Xingmai 3.0, Building a High-Performance Network Foundation

Yachen Wang (Vice President of Tencent Cloud, General Manager of Tencent Network) gave a keynote speech titled “Tencent’s New Generation High-Performance Network – Xingmai 3.0.” He explained how the architecture achieves a larger network scale with fewer layers and provides stronger single-node computing power through Scale Up, thereby optimizing both training and inference.

Microsoft: IPv6 and AI Back-end Network Innovation

Guohan Lu (Microsoft Partner Software Engineer) delivered a keynote speech titled “Key Technologies for Supporting Next-Generation Large-Scale AI Back-end Networks.” He revealed the next-generation AI back-end network architecture and demonstrated key features based on IPv6 Segment Routing (SRv6), High-Frequency Streaming Telemetry (HFST), and Trimming technology, driving the intelligent upgrade of data center networks.

ByteDance: SyncMesh Protocol Enables AI Network Load Balancing

Yongcan Wang (Head of ByteDance’s White Box Team) gave a keynote speech titled “AI Network Practice of Global Load Balancing (GLB) Based on the SyncMesh Protocol.” He introduced the AI network’s GLB solution, which uses the SyncMesh control plane protocol to achieve sub-microsecond link oscillation convergence and bandwidth-symmetric load balancing, solving the challenges of AI computing power scheduling.

Broadcom: Scale Up Ethernet Drives Memory-Semantic AI Clusters

Mohan Kalkunte (Broadcom Vice President of Architecture and Technology) gave a keynote speech titled “Scale Up Ethernet: Driving Memory-Semantic AI Clusters.” He introduced the Scale Up Ethernet (SUE) framework, designed specifically for memory-semantic architectures, providing a low-latency, high-throughput transport solution for AI clusters.

Cisco: A New Paradigm for a Unified Ethernet AI Architecture

Eli Stein (Cisco Vice President of Product and Marketing) delivered a keynote speech titled “Towards a Unified Ethernet-based AI Network Architecture.” He discussed the trend of integrating Scale Up and Scale Out network architectures and presented the core value of Ethernet as a unified, efficient AI infrastructure that is reshaping data center interconnect standards.

Accton: ODM Leads a New Era of Open Infrastructure in the AI and Cloud Age

Jun Shi (Accton CEO & President) delivered a keynote speech titled “ODM: The New System Supplier of Open Infrastructure in the Cloud Computing and AI Era.” He explained how Accton is becoming a key driver of network innovation in the AI and cloud era through three core strategies.

SONiC Scale Up Working Group

Microsoft Partner Engineer Guohan Lu, Alibaba Cloud Senior Staff Engineer Eddie Ruan, and Alibaba Cloud Engineer Yuqing Zhao jointly introduced the SONiC Scale Up Working Group. The group aims to support Scale Up networks on the SONiC system and promote the standardization and large-scale deployment of next-generation AI network architectures. The three speakers detailed the working group’s targets and roadmap, the current status of the Scale Up architecture specification document, and key highlights of that document. At the end, they invited Tencent and ByteDance to join the working group. Microsoft, Alibaba Cloud, Tencent, and ByteDance will work closely together in this working group to advance Ethernet-based Scale Up AI infrastructure.

Roundtable Discussion: The Future Choice for AI Network Architecture

In the second half of the conference, experts from Alibaba Cloud, Microsoft, NVIDIA, Broadcom, Cisco, and Accton engaged in a lively debate on the following hot topics:

  • The Balancing Act of Scale Up vs. Scale Out:
    • How to balance vertical and horizontal scaling in AI infrastructure?
    • What trade-offs and synergies are expected to emerge?
  • Priorities and Future Planning for AI Infrastructure:
    • In the rapidly evolving AI field, what are the most critical goals for the 2024 product roadmap?
    • How will technological breakthroughs and ecosystem collaboration jointly drive industry progress?

Conclusion

This conference not only showcased the latest technological breakthroughs in the field of AI networks, but also provided a clear path for building efficient, flexible, and scalable AI infrastructure through the integration of industry, academia, and research. In the future, with the continuous evolution of technologies such as Ethernet, IPv6, and chip technologies, AI networks will enter a new era of unification and intelligence.