How Many 400G OSFP SiPh LPOs Are in the Huawei AI CloudMatrix 384 Super-node?
On May 14, 2025, the "2025 Chip and Optical Forum", hosted by HiSilicon and organized by ICC, was held at the Crowne Plaza Wuhan Optics Valley. The conference focused on intelligent optical interconnection technology, sharing cutting-edge achievements and exploring industry trends. At the forum, Huawei Cloud, HiSilicon, iFlytek, and others all mentioned the AI CloudMatrix 384 computing-power super-node.
What Is the AI CloudMatrix 384 Super-node?
The Huawei CloudMatrix 384 super-node is a key technological breakthrough in Huawei's AI computing infrastructure, designed mainly to solve the communication-efficiency problem of large-scale AI clusters. It was released on April 10, 2025, and has been deployed at scale in the Wuhu Data Center. The "384" means the super-node contains 384 computing chips, namely 384 Ascend 910C chips; the system is referred to as CM384.
As artificial intelligence (AI) has become the key force driving industry transformation, moving AI out of the laboratory and into industry has become a question the era demands an answer to. The CloudMatrix 384 super-node is Huawei Cloud's answer.
What Makes CM384 Different?
The CloudMatrix 384 super-node uses 6,912 x 400G OSFP silicon photonic (SiPh) Linear-drive Pluggable Optics (LPO) modules and 3,168 optical fibers to connect 384 Ascend 910C computing chips in a fully meshed interconnection architecture. Unlike the all-electrical interconnect adopted by NVIDIA's NVL72 super-node, Huawei fully exploits the high bandwidth, low latency, and longer reach of optical transceiver technology, breaking through the physical limits of traditional electrical links and achieving 1.7 times the computing power and 3.6 times the HBM capacity of NVL72. The ratio of computing chips to optical transceivers reaches 1:18.
How Many 400G OSFP SiPh LPOs Does the AI CloudMatrix 384 Super-node Use?
3,072 x 400G OSFP SiPh LPOs Are Deployed in Computing Servers
iFlytek's presentation cited SemiAnalysis's analysis of CM384; we extracted the optical module information from that data.
The CM384 super-node contains 48 computing servers (chassis). Each server carries 8 x 910C computing chips, 56 x 400G silicon photonic LPOs for scale-up, and 8 x 400G silicon photonic LPOs for scale-out. Across the 48 servers, that is 2,688 scale-up 400G SiPh LPOs and 384 scale-out 400G SiPh LPOs, for a total of 3,072 x 400G OSFP silicon photonic LPO modules. The table below also lists the QSFP112 200G optical modules.
Table 1 - 400G OSFP SiPh LPOs and other components deployed in computing server chassis for CM384 super-node
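As a quick sanity check, a minimal Python sketch of the server-side tally using the per-chassis figures from Table 1:

```python
# Server-side 400G OSFP SiPh LPO tally for the CM384 super-node,
# using the per-chassis figures cited above (Table 1).
SERVERS = 48                  # computing server chassis per super-node
SCALE_UP_PER_SERVER = 56      # 400G SiPh LPOs used for scale-up per chassis
SCALE_OUT_PER_SERVER = 8      # 400G SiPh LPOs used for scale-out per chassis

scale_up = SERVERS * SCALE_UP_PER_SERVER      # 2,688
scale_out = SERVERS * SCALE_OUT_PER_SERVER    # 384
print(scale_up, scale_out, scale_up + scale_out)   # 2688 384 3072
```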
3,840 x 400G OSFP SiPh LPOs Are Deployed in Scale-up and Scale-out Switches
The CM384 super-node also contains scale-up and scale-out switches. The scale-up switches carry 2,688 scale-up 400G OSFP SiPh LPOs and the scale-out switches carry 1,152 scale-out 400G OSFP SiPh LPOs, for a total of 3,840 x 400G OSFP silicon photonic modules.
Table 2 - 400G OSFP SiPh LPOs and other components deployed in scale-up and out switches for CM384 super-node
A Total of 6,912 x 400G OSFP SiPh LPOs Are Deployed in the CloudMatrix 384 Super-node
Figure 1 - CM384, with 6,912 x 400G OSFP silicon photonic LPO optical modules, a chip-to-module ratio of 1:18
Adding up all the 400G OSFP optical modules involved, the CM384 super-node includes 384 x 910C computing chips and 6,912 x 400G silicon photonic LPO optical modules (3,072 in the servers plus 3,840 in the switches). That is 18 400G OSFP SiPh optical modules per computing chip, a chip-to-module ratio of 1:18.
Table 3 - 384 x 910C computing chips to 6,912 x 400G OSFP SiPh LPOs = 1:18
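A minimal Python sketch that reproduces the overall tally and the 1:18 ratio from the server-side and switch-side counts above:

```python
# Overall 400G OSFP SiPh LPO count and chip-to-module ratio for CM384,
# using the server-side and switch-side figures cited above.
server_side = 2688 + 384      # LPOs in the 48 computing servers  -> 3,072
switch_side = 2688 + 1152     # LPOs in the scale-up/out switches -> 3,840
total_lpos = server_side + switch_side    # 6,912
chips = 384                               # Ascend 910C computing chips

print(total_lpos)              # 6912
print(total_lpos // chips)     # 18 -> a chip-to-module ratio of 1:18
```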
Optical Transceivers Account for 68.2% of Failures in iFlytek's Ten-Thousand-Card Cluster
The above concerns a single super-node built from 48 computing server chassis (384 chips). iFlytek also provided failure-rate data from its ten-thousand-card-scale cluster, which has been running for one year. The absolute values on the axes were hidden, so let's look at the relative data.
Figure 2 - After one year of operation of iFlytek's ten-thousand-card-scale cluster, optical modules show the highest failure rate.
Figure 3 - Optical module failures account for 68.2% of all failures, the primary failure source.
What iFlytek provides is operational data from a ten-thousand-card-scale cluster. Huawei Cloud has estimated the impact of optical transceiver reliability on training in an even larger network.
If the computing cluster is expanded further from the ten-thousand-card level to the hundred-thousand-card level, the number of 400G SiPh LPO optical modules required for a full configuration reaches roughly 2.36 million (2,359,296). At the failure rate observed on the existing ten-thousand-card cluster, there would be seven link flaps (transient disconnections) per hour, and each flap forces the training run to be extended, increasing the training cost.
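A back-of-the-envelope check of that module count, assuming the 1:18 chip-to-module ratio above and a power-of-two build-out of 131,072 (128K) chips at the hundred-thousand-card level; the exact chip count is our assumption, not stated in the talk:

```python
# Rough check of the hundred-thousand-card module count.
# ASSUMPTION: a fully configured build-out of 131,072 (2**17) chips;
# the talk only gives the "hundred-thousand-card" scale, not this figure.
MODULES_PER_CHIP = 18          # the 1:18 chip-to-module ratio from CM384
chips = 131_072                # assumed full configuration
print(chips * MODULES_PER_CHIP)   # 2359296 -> about 2.36 million 400G SiPh LPOs
```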
Contamination at Optical Connections Is the Primary Cause of the High Transceiver Failure Rate
According to Huawei's statistics, failures caused by the transceiver itself are relatively rare; failures caused by contaminated optical connections are the primary contributor. By using optical time-domain reflectometry (OTDR) to locate Fresnel reflection peaks, the contaminated position of an active connector can be pinpointed, which can reduce the failure rate by 70% to 80%.
Figure 4 - Optical module failures caused by dirty connections account for 64.7%.
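As a rough illustration of the idea (not Huawei's implementation), locating a strong Fresnel reflection peak in an OTDR trace and converting its round-trip delay to a distance along the fiber might look like the sketch below; the sample trace and the group index are assumptions:

```python
# Illustrative OTDR peak location: the event distance follows from the
# round-trip delay of the reflected pulse, d = c * t / (2 * n_group).
C = 299_792_458.0        # speed of light in vacuum, m/s
N_GROUP = 1.468          # assumed group index of standard single-mode fiber

def event_distance_m(round_trip_time_s: float) -> float:
    """Distance to a reflective event from its round-trip delay."""
    return C * round_trip_time_s / (2 * N_GROUP)

# Assumed toy trace: (time in nanoseconds, reflected power in dB above backscatter)
trace = [(10, 0.2), (55, 0.3), (120, 14.8), (180, 0.1)]

# A strong Fresnel peak (e.g. a dirty or open connector) stands well above the
# Rayleigh backscatter level; pick the strongest sample as the suspected event.
t_ns, peak_db = max(trace, key=lambda sample: sample[1])
print(f"Suspected connector event ~{event_distance_m(t_ns * 1e-9):.1f} m away, {peak_db} dB peak")
```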
In an even larger-scale computing network, the reliability of optical transceivers is vital.
A Case Study - QSFPTEK Transceivers' Failure Rate in a Three-Year Medium-Sized Data Center Deployment
The optical module failure rate discussed at the forum above is the proportion of failures attributable to optical modules in ultra-large-scale cloud training networks. Now let us look at the failure breakdown of various components in enterprise medium-sized data center deployments.
QSFPTEK has 10+ years of R&D and industry-leading experience supporting global enterprise success with optical transceivers. QSFPTEK has fulfilled 31,000+ orders across 1,000+ successful projects and is favored by 300+ SMBs/Telcos/MNOs/DCs in over 200 countries and regions. According to average statistics collected from our medium-sized data center clients over three years, QSFPTEK 100G/200G optical transceivers account for 19.4% of component failures, and 40G and other lower-speed optical modules account for 16.3%.
Figure 5 - QSFPTEK 100G/200G high-speed optical module failures account for 19.4% of component failures in a three-year medium-sized data center deployment
QSFPTEK Reduces Your 100G Optics Expense by 18.57% to 69.82%
The table below lists the per-piece prices of the main 100G transceiver models from QSFPTEK, FS.COM, Fluxlight, and Naddod, some of the leading compatible optical transceiver brands in Google search results. As the table shows, QSFPTEK has a significant price advantage.
If you take the lowest and highest prices from the three brands other than QSFPTEK as the benchmarks and calculate the discount of QSFPTEK's price against them, you will find that QSFPTEK can save you at least 18.57% and up to 69.82% on your 100G module expenses, as sketched after the table below.
Table 4 - The 100G optics price per piece by QSFPTEK, FS.COM, Fluxlight and Naddod
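A minimal sketch of that savings calculation; the prices below are placeholders chosen only to roughly reproduce the 18.57% / 69.82% range quoted above, not the actual figures from Table 4:

```python
# Savings of the QSFPTEK price relative to a competitor benchmark:
# savings % = (benchmark - qsfptek) / benchmark * 100
def savings_pct(qsfptek_price: float, benchmark_price: float) -> float:
    return (benchmark_price - qsfptek_price) / benchmark_price * 100

# Placeholder prices for one hypothetical 100G model (USD per piece);
# substitute the real per-piece prices from Table 4.
qsfptek = 20.0
competitors = [24.56, 39.0, 66.26]

lowest, highest = min(competitors), max(competitors)
print(f"{savings_pct(qsfptek, lowest):.2f}%")   # savings vs the lowest competitor price
print(f"{savings_pct(qsfptek, highest):.2f}%")  # savings vs the highest competitor price
```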
While cost is important, the reliability of optical transceivers is equally vital. Every QSFPTEK module undergoes a complete testing process in our lab, from a standardized production line and rigorous performance testing to on-site compatibility testing. Welcome to check out our quality assurance program.