Domestic AI inference resource pool
Build group-level AI infrastructure across compute, network, and storage, enabling unified management, scheduling, and service delivery.
Domestic inference resource pool
Manufacturing group / Enterprise AI platform · A large manufacturing group · 2026
The manufacturing group spans R&D, production, quality, supply chain, equipment operations, and digital operations, and needed a unified AI infrastructure capability.
The customer planned to build a domestic AI inference resource pool for headquarters, factories, departments, and production systems.
The pool needed to support models such as ChatGLM, Baichuan2, LLaMA 34B, and LLaMA2 13B/70B, while integrating with the group cloud management center and AI platform.
How Ouryun delivers the domestic inference resource pool for a large manufacturing group
The solution was designed across compute, network, and storage to build a high-performance domestic AI resource pool for large-model inference.
KUNLUN compute cluster
Multiple KUNLUN servers form a dense inference cluster, with each server providing 6 PFLOPS of FP16 compute and 1,024 GB of HBM.
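As a rough sizing check on the per-server HBM figure, the following minimal sketch estimates the FP16 weight footprint of the models whose parameter counts are named in this case study. It assumes 2 bytes per parameter, reads the "1024G" HBM figure as 1,024 GB, and ignores KV cache, activations, and runtime overhead, so real headroom will be smaller. The function and variable names are illustrative and not part of any Ouryun tooling.

```python
# Back-of-the-envelope FP16 weight sizing (assumption: 2 bytes per parameter).
BYTES_PER_FP16_PARAM = 2
NODE_HBM_GB = 1024  # per-server HBM from the case study, assumed to mean GB

# Parameter counts in billions, read directly from the model names in the text.
MODELS_B_PARAMS = {"LLaMA 34B": 34, "LLaMA2 13B": 13, "LLaMA2 70B": 70}


def fp16_weight_gb(params_billion: float) -> float:
    """Approximate FP16 weight size in GB (1 GB = 1e9 bytes), weights only."""
    # billions of parameters x bytes per parameter = gigabytes
    return params_billion * BYTES_PER_FP16_PARAM


def hbm_fraction(params_billion: float, node_hbm_gb: float = NODE_HBM_GB) -> float:
    """Share of one server's HBM taken by the FP16 weights alone."""
    return fp16_weight_gb(params_billion) / node_hbm_gb


for name, size in MODELS_B_PARAMS.items():
    print(f"{name}: ~{fp16_weight_gb(size):.0f} GB of weights, "
          f"{hbm_fraction(size):.1%} of one server's HBM")
```

Under these assumptions even the 70B model's weights take only a fraction of one server's HBM, which is consistent with hosting several models per server, although the actual placement policy is not described in the source.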
High-speed interconnect
A 392 GB/s intra-node chip interconnect and 16 × 200GE RoCE inter-node links support large-model workloads.
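To put the two interconnect figures on the same scale, the short calculation below converts the 16 × 200GE links into gigabytes per second. It is a line-rate estimate only: it assumes 200GE means 200 Gbit/s per link and ignores Ethernet and RoCE protocol overhead. The constant and function names are illustrative.

```python
# Line-rate comparison of inter-node and intra-node bandwidth (no protocol overhead).
INTER_NODE_LINKS = 16        # 200GE RoCE links, from the case study
LINK_GBIT_PER_S = 200        # assumption: 200GE = 200 Gbit/s per link
INTRA_NODE_GBYTE_PER_S = 392  # intra-node chip interconnect, from the case study


def aggregate_gbyte_per_s(links: int, gbit_per_link: float) -> float:
    """Aggregate line rate in GB/s: total Gbit/s divided by 8 bits per byte."""
    return links * gbit_per_link / 8


inter_node = aggregate_gbyte_per_s(INTER_NODE_LINKS, LINK_GBIT_PER_S)
print(f"Inter-node aggregate line rate: {inter_node:.0f} GB/s")
print(f"Intra-node chip interconnect:   {INTRA_NODE_GBYTE_PER_S} GB/s")
```

At line rate the aggregate inter-node figure is of the same order as the intra-node interconnect; achieved throughput depends on overheads and traffic patterns that the source does not quantify.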
Multi-plane networking
Business, sample, and parameter planes are separated, with a 200GE non-blocking RoCE network for parameter traffic.
Spine-Leaf and M-LAG
A two-layer Spine-Leaf topology with M-LAG redundancy improves east-west throughput and reliability.
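A common way to check whether a Spine-Leaf fabric is non-blocking is to compare each leaf switch's server-facing bandwidth with its spine-facing bandwidth. The sketch below shows that check. The port counts are hypothetical, since the case study does not give switch models or port layouts, and the `LeafSwitch` class is an illustration rather than a vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeafSwitch:
    """Hypothetical leaf switch; port counts are NOT taken from the case study."""
    downlink_ports: int   # ports facing servers
    uplink_ports: int     # ports facing spine switches
    port_gbps: int = 200  # 200GE ports, matching the parameter network speed

    def oversubscription(self) -> float:
        """Ratio of server-facing to spine-facing bandwidth."""
        downlink = self.downlink_ports * self.port_gbps
        uplink = self.uplink_ports * self.port_gbps
        return downlink / uplink

    def is_non_blocking(self) -> bool:
        """Non-blocking when uplink bandwidth is at least the downlink bandwidth."""
        return self.oversubscription() <= 1.0


balanced = LeafSwitch(downlink_ports=32, uplink_ports=32)
oversubscribed = LeafSwitch(downlink_ports=48, uplink_ports=16)
print(balanced.oversubscription(), balanced.is_non_blocking())
print(oversubscribed.oversubscription(), oversubscribed.is_non_blocking())
```

A 1:1 ratio is what "non-blocking" usually implies for the parameter plane; planes with lighter traffic are often allowed a higher ratio, though the source does not say whether that is the case here.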
Distributed storage pool
Distributed storage balances high I/O and bandwidth for model files, samples, inference data, production data, and platform runtime data.
From connect to ship, in four steps
Compute pool
Deploy domestic inference servers as a centralized compute pool.
Network
Build separated network planes for low-latency, high-throughput communication.
Storage
Deploy distributed storage for models, samples, inference, and production data.
Management
Integrate with hybrid-cloud management and the group AI platform for unified management and scheduling.
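To illustrate what unified scheduling over a shared pool can look like, the sketch below places model deployments onto servers with a first-fit-decreasing heuristic keyed on HBM. This is an assumed, simplified policy and not the scheduler used in the project. The `Node` class, the `place` function, and the memory requests in the example are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One inference server in the pool (hypothetical model of a server)."""
    name: str
    hbm_gb: float = 1024.0  # per-server HBM from the case study, assumed GB
    used_gb: float = 0.0
    placed: list = field(default_factory=list)

    def free_gb(self) -> float:
        return self.hbm_gb - self.used_gb


def place(requests, nodes):
    """First-fit decreasing: largest request first, first node with enough free HBM.

    requests: list of (model_name, required_gb) tuples.
    Returns a dict mapping model_name to a node name, or None if nothing fits.
    """
    result = {}
    for model, need_gb in sorted(requests, key=lambda r: r[1], reverse=True):
        target = next((n for n in nodes if n.free_gb() >= need_gb), None)
        if target is not None:
            target.used_gb += need_gb
            target.placed.append(model)
        result[model] = target.name if target is not None else None
    return result


# Hypothetical memory requests (weights plus an assumed serving overhead).
pool = [Node("server-1"), Node("server-2")]
demo = place([("model-a", 600), ("model-b", 500), ("model-c", 400)], pool)
print(demo)
```

A production scheduler would also weigh compute load, tenant quotas, and network locality; the point here is only that a central pool lets placement decisions be made against group-wide capacity rather than per department.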
Customer profile
A unified domestic AI inference resource pool for a large manufacturing group, supporting multiple large models and group-level AI platform operations.
Impact
Numbers that prove value:
17.5 PFLOPS of FP16 AI compute
200GE RoCE parameter network
1,024 GB of HBM per node
Unified group-level scheduling