
Domestic AI inference resource pool

Build group-level AI infrastructure across compute, network, and storage, enabling unified management, scheduling, and service delivery.

Manufacturing group / Enterprise AI platform · Domestic compute

Context

Domestic inference resource pool

Manufacturing group / Enterprise AI platform · A large manufacturing group · 2026

The manufacturing group spans R&D, production, quality, supply chain, equipment operations, and digital operations, and needed a unified AI infrastructure capability.

The customer planned to build a domestic AI inference resource pool for headquarters, factories, departments, and production systems.

The pool needed to support models such as ChatGLM, Baichuan2, LLaMA 34B, and LLaMA2 13B/70B, while integrating with the group cloud management center and AI platform.

Solution

How Ouryun ships the domestic inference resource pool for a large manufacturing group

The solution was designed across compute, network, and storage to build a high-performance domestic AI resource pool for large-model inference.

KUNLUN compute cluster

Multiple KUNLUN servers form a dense inference cluster, with each server providing 6 PFLOPS of FP16 compute and 1024 GB of HBM.
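To put the per-node HBM figure in context, the sketch below estimates the FP16 weight footprint of the LLaMA-family models named in the requirements and how many weight-only copies would fit on one server. It is a rough sizing aid, not the project's capacity plan: it assumes 2 bytes per parameter, reads "1024G" as decimal gigabytes, and ignores KV cache, activations, and any sharding of weights across chips.

```python
# Rough FP16 weight-footprint check against per-node HBM (illustrative only).
BYTES_PER_PARAM_FP16 = 2   # FP16 stores each parameter in 2 bytes
NODE_HBM_GB = 1024         # per-node HBM from the text, read as decimal GB (assumption)

# Parameter counts in billions, taken from the model names in the text.
MODELS_B_PARAMS = {"LLaMA2 13B": 13, "LLaMA 34B": 34, "LLaMA2 70B": 70}


def weight_gb(params_billions: float) -> float:
    """FP16 weight size in decimal GB: 1e9 params * 2 bytes = 2 GB per billion."""
    return params_billions * BYTES_PER_PARAM_FP16


def max_replicas(params_billions: float, hbm_gb: float = NODE_HBM_GB) -> int:
    """Upper bound on weight-only copies per node; ignores KV cache and activations."""
    return int(hbm_gb // weight_gb(params_billions))


for name, size in MODELS_B_PARAMS.items():
    print(f"{name}: {weight_gb(size):.0f} GB weights, "
          f"at most {max_replicas(size)} weight-only copies per node")
```

Real serving capacity will be lower than these bounds, since each running replica also needs memory for its KV cache and runtime buffers.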


High-speed interconnect

A 392 GB/s intra-node chip interconnect and 16 × 200GE RoCE inter-node links per server support large-model workloads.
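The two bandwidth figures use different units, so a quick conversion helps compare them. The sketch below converts the 16 × 200GE links to gigabytes per second at raw line rate in one direction. It ignores Ethernet and RoCE protocol overhead, and it assumes the 392 GB/s figure is directly comparable, which the text does not confirm.

```python
# Compare aggregate inter-node line rate with the intra-node interconnect figure.
LINKS_PER_NODE = 16      # 200GE RoCE links per server, from the text
LINK_GBIT_S = 200        # line rate of one link in Gbit/s
INTRA_NODE_GB_S = 392    # intra-node chip interconnect in GB/s, from the text


def aggregate_inter_node_gb_s(links: int = LINKS_PER_NODE,
                              link_gbit_s: int = LINK_GBIT_S) -> float:
    """Raw one-direction line rate in GB/s (8 bits per byte, no protocol overhead)."""
    return links * link_gbit_s / 8


inter = aggregate_inter_node_gb_s()
print(f"inter-node: {inter:.0f} GB/s, intra-node: {INTRA_NODE_GB_S} GB/s, "
      f"ratio: {inter / INTRA_NODE_GB_S:.2f}")
```

At line rate the inter-node links total 3200 Gbit/s, or 400 GB/s, which is in the same range as the quoted intra-node figure.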


Multi-plane networking

Business, sample, and parameter planes are separated, with a 200GE non-blocking RoCE network for parameter traffic.


Spine-Leaf and M-LAG

A two-layer Spine-Leaf topology with M-LAG redundancy improves east-west throughput and reliability.
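Whether a Spine-Leaf fabric is non-blocking comes down to the oversubscription ratio at each leaf: total server-facing bandwidth divided by total spine-facing bandwidth. The sketch below shows that check. The `LeafSwitch` class and the port counts are hypothetical and do not describe the switches used in this project.

```python
from dataclasses import dataclass


@dataclass
class LeafSwitch:
    """Hypothetical leaf switch description used only for this illustration."""
    downlink_ports: int    # ports facing servers
    downlink_gbit_s: int   # speed of each server-facing port
    uplink_ports: int      # ports facing spine switches
    uplink_gbit_s: int     # speed of each spine-facing port

    def oversubscription(self) -> float:
        """Server-facing bandwidth divided by spine-facing bandwidth."""
        down = self.downlink_ports * self.downlink_gbit_s
        up = self.uplink_ports * self.uplink_gbit_s
        return down / up

    def is_non_blocking(self) -> bool:
        """A ratio of 1.0 or less means uplinks can carry all downlink traffic."""
        return self.oversubscription() <= 1.0


# Assumed port counts: 16 x 200GE down and 16 x 200GE up gives a 1:1 ratio.
leaf = LeafSwitch(downlink_ports=16, downlink_gbit_s=200,
                  uplink_ports=16, uplink_gbit_s=200)
print(leaf.oversubscription(), leaf.is_non_blocking())
```

A leaf with twice as much downlink as uplink bandwidth would report a ratio of 2.0 and would not qualify as non-blocking.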


Distributed storage pool

Distributed storage balances high I/O and bandwidth for model files, samples, inference data, production data, and platform runtime data.


Delivery path

From connect to ship, in four steps

Compute pool

Deploy domestic inference servers as a centralized compute pool.


Network

Build separated network planes for low-latency, high-throughput communication.


Storage

Deploy distributed storage for models, samples, inference, and production data.


Management

Integrate with hybrid-cloud management and the group AI platform for unified management and scheduling.
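As one way to picture unified scheduling over a shared pool, the sketch below places model deployments onto nodes by free HBM using a first-fit-decreasing heuristic. This is an illustration of the general idea only. The `Node` class, the `schedule` function, and the request sizes are hypothetical, and the source does not describe the scheduling policy that the group AI platform actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical view of one inference server in the pool."""
    name: str
    free_hbm_gb: float = 1024   # per-node HBM from the text, read as GB (assumption)
    placed: list = field(default_factory=list)


def schedule(requests, nodes):
    """First-fit decreasing: largest requests first, each on the first node that fits.

    `requests` is a list of (model_name, hbm_needed_gb) tuples.
    Returns the names of requests that could not be placed.
    """
    unplaced = []
    for model, need_gb in sorted(requests, key=lambda r: r[1], reverse=True):
        for node in nodes:
            if node.free_hbm_gb >= need_gb:
                node.free_hbm_gb -= need_gb
                node.placed.append(model)
                break
        else:
            # No node had enough free HBM for this request.
            unplaced.append(model)
    return unplaced


pool = [Node("node-1"), Node("node-2")]
# Hypothetical HBM requirements per deployment, in GB.
left_over = schedule([("a", 700), ("b", 600), ("c", 300), ("d", 500)], pool)
for node in pool:
    print(node.name, node.placed, node.free_hbm_gb)
print("unplaced:", left_over)
```

A production scheduler would also weigh compute load, network locality, and tenant quotas; memory fit is only the simplest constraint to show.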


Full project record

Customer profile

A unified domestic AI inference resource pool for a large manufacturing group, supporting multiple large models and group-level AI platform operations.

Impact

17.5 PFLOPS FP16 AI compute; 200GE RoCE parameter network; 1024 GB per-node HBM memory; unified group-level scheduling

Impact

Numbers that prove value

17.5 PFLOPS

FP16 AI compute

200GE

RoCE parameter network

1024 GB

Per-node HBM memory

Unified scheduling

Unified group-level scheduling

