That's 300GB/s slower than my old Mac Studio (M1 Ultra). Memory speeds in 2025 remain thoroughly unimpressive outside of high-end GPUs and fully integrated systems.
The server systems have that much memory bandwidth per socket. Also, that generation supports DDR5-6400, but they were using DDR5-5200. Using the faster stuff (12 channels at 6400MT/s) gets you 614GB/s per socket, i.e. a dual-socket system with DDR5-6400 is >1200GB/s. And in those systems that's just for the CPU; a GPU/accelerator gets its own.
The M1 Ultra doesn't have 800GB/s because it's "integrated", it simply has 16 channels of DDR5-6400, which it could have whether it was soldered or not. And none of the more recent Apple chips have any more than that.
It's the GPUs that use integrated memory, i.e. GDDR or HBM. That actually gets you somewhere -- the RTX 5090 has 1.8TB/s with GDDR7, the MI300X has 5.3TB/s with HBM3. But that stuff is also more expensive, which limits how much of it you get, e.g. the MI300X has 192GB of HBM3, whereas normal servers support 6TB per socket.
And it's the same problem with Apple even though there's no great reason for it to be. The 2019 Intel Xeon Mac Pro supported 1.5TB of RAM -- still in slots -- but the newer ones barely reach a third of that at the top end.
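For what it's worth, every bandwidth figure in this subthread falls out of the same arithmetic: bus width in bytes times transfer rate. A rough sketch (the GPU bus widths and data rates are my own assumptions from public spec sheets, not something stated above):

```python
# Peak theoretical memory bandwidth: (bus width in bytes) * (transfer rate).
def peak_gb_per_s(bus_bits: int, mega_transfers_per_s: float) -> float:
    return (bus_bits / 8) * mega_transfers_per_s / 1000  # GB/s

print(peak_gb_per_s(12 * 64, 5200))   # ~499  -- 12-channel server socket, DDR5-5200
print(peak_gb_per_s(12 * 64, 6400))   # ~614  -- same socket with DDR5-6400
print(peak_gb_per_s(16 * 64, 6400))   # ~819  -- M1 Ultra's "800GB/s", treating its 1024-bit bus as 16x64 at 6400MT/s
print(peak_gb_per_s(512, 28000))      # ~1792 -- RTX 5090, assuming a 512-bit GDDR7 bus at 28GT/s
print(peak_gb_per_s(8 * 1024, 5200))  # ~5325 -- MI300X, assuming 8 HBM3 stacks (1024-bit each) at ~5.2GT/s
```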
> The M1 Ultra doesn't have 800GB/s because it's "integrated", it simply has 16 channels of DDR5-6400, which it could have whether it was soldered or not.
The M1 Ultra has LPDDR5, not DDR5. And the M1 Ultra was running its memory at 6400MT/s about two and a half years before any EPYC or Xeon parts supported that speed -- due in part to the fact that the memory on an M1 Ultra is soldered down. And as far as I can tell, neither Intel nor AMD has shipped a CPU socket supporting 16 channels of DRAM; they're having enough trouble with 12 channels per socket, which often means you need the full width of a 19-inch rack just for DIMM slots.
LPDDR5 is "low power DDR5". The difference between that and ordinary DDR5 isn't that it's faster, it's that it runs at a lower voltage to save power in battery-operated devices. DDR5-6400 DIMMs were available for desktop systems around the same time as Apple. Servers are more conservative about timings for reliability reasons, the same as they use ECC memory and Apple doesn't. Moreover, while Apple was soldering their memory, Dell was shipping systems using CAMM with LPDDR5 that isn't soldered, and there are now systems from multiple vendors with CAMM2 and LPDDR5X.
Existing servers typically have 12 channels per socket, but they also have two DIMMs per channel, so you could double the number of channels per socket without taking up any more space for slots. You could also use CAMM which takes up less space.
They don't currently use more than 12 channels per socket, even though they could, because that's enough not to be a constraint for most common workloads, more channels increase cost, and people with workloads that need more can get systems with more sockets. Apple only uses more because the same memory also feeds the GPU, which is often constrained by memory bandwidth.
> Existing servers typically have 12 channels per socket, but they also have two DIMMs per channel, so you could double the number of channels per socket without taking up any more space for slots. You could also use CAMM which takes up less space.
Usually this comes at a pretty sizable hit to the memory speed available. For example, STH notes that their Zen 5 ASRock Rack EPYC4000D4U goes from DDR5-5600 down to DDR5-3600 with the second slot populated, roughly a 36% drop in peak throughput.
https://www.servethehome.com/amd-epyc-4005-grado-is-great-an...
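(Minimal sketch of where that percentage comes from, just the ratio of the two rated speeds:)

```python
# Relative loss in peak bandwidth when 2DPC forces DDR5-5600 down to DDR5-3600.
one_dimm_per_channel, two_dimms_per_channel = 5600, 3600
drop = 1 - two_dimms_per_channel / one_dimm_per_channel
print(f"{drop:.1%}")  # 35.7%
```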
It comes with a drop in performance because there are then two sticks on the same channel. Having the same number of slots and twice as many channels would be a way around that.
(It's also because of servers being ultra-cautious again. The desktop boards say the same thing in the manual but then don't enforce it in the BIOS, and people run two sticks per channel at full speed all over the place.)