4 February 2024 · 9 min read

How Apple Silicon actually works.

Unified memory, performance and efficiency cores, the Neural Engine, the media engine. An honest walk through what Apple's chips do differently — and why it matters for the kind of work we do.

Three and a half years into Apple Silicon, a lot of people on our team still describe it as 'fast Intel'. It isn't. The architecture is genuinely different from anything else in the laptop market, and the differences explain why some workloads fly and some still struggle.

Unified memory

On a laptop with a discrete GPU, the CPU has its own memory and the GPU has its own memory. Sending data between them means a comparatively slow copy across a bus. On Apple Silicon, there's one pool of memory and both the CPU and GPU see it. No copying. No marshalling. The GPU can read what the CPU just wrote, in place.

This is why M-series Macs are great at workloads that toggle between CPU and GPU repeatedly — video processing, certain ML pipelines, real-time rendering with computed inputs. It's also why memory capacity matters more than memory speed: you can't add a discrete GPU with its own VRAM later, so the unified pool has to be big enough on day one.
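To put a rough number on what "no copying" saves, here is a back-of-the-envelope sketch. The 16 GB/s bus figure and the two transfers per frame are assumptions picked for illustration, not measurements of any particular machine, and `copy_overhead_ms` is a hypothetical helper.

```python
def copy_overhead_ms(frame_bytes: int, transfers_per_frame: int, bus_gb_per_s: float) -> float:
    """Milliseconds per frame spent moving data across a CPU<->GPU bus."""
    bytes_moved = frame_bytes * transfers_per_frame
    return bytes_moved / (bus_gb_per_s * 1e9) * 1000.0


# One uncompressed 4K RGBA frame: 3840 x 2160 pixels, 4 bytes per pixel.
FRAME_BYTES = 3840 * 2160 * 4

# Assumption: one upload and one readback per frame over a 16 GB/s bus.
discrete_ms = copy_overhead_ms(FRAME_BYTES, transfers_per_frame=2, bus_gb_per_s=16.0)

# Time available per frame at 60 fps.
budget_ms = 1000.0 / 60

print(f"copy cost per frame: {discrete_ms:.2f} ms of a {budget_ms:.2f} ms budget")
```

Under those assumptions, roughly a quarter of each 60 fps frame goes to moving bytes. With a shared pool the transfer term drops out, although real workloads still pay for memory bandwidth and synchronisation.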

The blocks of an Apple Silicon SoC

Block	What it is	When it earns its keep
Performance cores	Fast, power-hungry CPU cores	Build tools, hot single-threaded paths
Efficiency cores	Slower, low-power CPU cores	Background tasks, sleep-state workloads
Integrated GPU	Shares unified memory with CPU	Compositing, OpenGL/Metal, Final Cut renders
Neural Engine	16-core ML accelerator (32-core on Ultra chips)	On-device CV, voice, computational photography
Media engine	Dedicated video encode/decode silicon	H.264, HEVC, ProRes — at near-zero CPU cost
Secure Enclave	Separate processor for keys + biometrics	Face ID, Touch ID, Keychain

The discrete blocks inside an Apple Silicon SoC.

Performance and efficiency cores

Apple's CPUs have two kinds of cores — Performance (fast, power-hungry) and Efficiency (slow, sips power). The OS scheduler decides which workload goes where. Background tasks land on E-cores, foreground stuff lands on P-cores. The user mostly never notices, except that the battery lasts.

For developers this has a quiet implication: 'cores' on an M2 Max are not directly comparable to 'cores' on a Xeon. Single-threaded performance per P-core is among the highest in the industry. But if your workload only spawns one thread, all those extra cores are sitting idle.
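Amdahl's law gives a quick way to see how much those extra cores can help. The sketch below treats every core as identical, which is a simplifying assumption that a P-core and E-core mix does not satisfy, so read the results as an upper bound rather than a prediction.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when `parallel_fraction` of the work scales across `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# A purely single-threaded job, a half-parallel job, and a mostly parallel job.
for fraction in (0.0, 0.5, 0.9):
    print(f"parallel fraction {fraction:.1f}: {amdahl_speedup(fraction, 8):.2f}x on 8 cores")
```

A parallel fraction of 0.0 is the single-threaded case from the paragraph above: the speedup stays at 1x however many cores the chip has.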

The Neural Engine

A dedicated ML accelerator (16 cores on most M-series chips) that runs trained models at very low power. It's invisible most of the time, but when you use Photos search, FaceTime portrait blur, or dictation, the Neural Engine is typically doing the work. Core ML models in third-party apps can use it too, although Core ML decides per model whether to run on the Neural Engine, the GPU, or the CPU.

For our work, the Neural Engine is mostly relevant for on-device CV pipelines. A real-time object detection model that would melt a laptop CPU can run at 30 fps with the fans off on the Neural Engine.

The media engine

Dedicated silicon for encoding and decoding video. H.264, HEVC, and ProRes are handled by purpose-built hardware, not the GPU. This is a large part of why Final Cut Pro runs so much faster on Apple Silicon than Premiere does (Adobe is now catching up): the media engine takes over work that on Intel Macs largely fell to the CPU and GPU.
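Outside Apple's own apps, one common route to the hardware encoders is ffmpeg's VideoToolbox encoders. ffmpeg is not something this note has relied on so far, so treat the following as a sketch under that assumption: it only builds the command line, the helper name and default bitrate are hypothetical, and whether a given encode actually lands on dedicated silicon is decided by VideoToolbox, not by the flag.

```python
import shlex

# Encoder names exposed by ffmpeg builds compiled with VideoToolbox support.
VIDEOTOOLBOX_ENCODERS = {
    "h264": "h264_videotoolbox",
    "hevc": "hevc_videotoolbox",
}


def videotoolbox_command(src: str, dst: str, codec: str = "hevc", bitrate: str = "8M") -> list[str]:
    """Build (but do not run) an ffmpeg command that requests a VideoToolbox encoder."""
    if codec not in VIDEOTOOLBOX_ENCODERS:
        raise ValueError(f"unsupported codec: {codec!r}")
    return ["ffmpeg", "-i", src, "-c:v", VIDEOTOOLBOX_ENCODERS[codec], "-b:v", bitrate, dst]


# Print the command so it can be inspected before running it by hand.
print(shlex.join(videotoolbox_command("input.mov", "output.mp4")))
```

Comparing CPU usage between this and a software encoder such as `libx265` on the same clip is a simple way to see the offload for yourself.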


What it means for the work we do

01 Anything memory-bandwidth-bound (build tools, large compiles) is dramatically faster.
02 Video and audio workflows run on the media engine, leaving CPU/GPU free.
03 On-device CV and ML — installation work, retail analytics — runs cleanly on the Neural Engine and doesn't need a separate GPU.
04 x86-only workloads still translate well via Rosetta 2 but you're not getting full architecture benefits.
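On the Rosetta 2 point, it helps to know when a process is being translated in the first place. macOS exposes this through the `sysctl.proc_translated` key; the sketch below reads it from Python. The function names are hypothetical, and on anything that isn't macOS the check returns `None` instead of guessing.

```python
import platform
import subprocess
from typing import Optional


def interpret_proc_translated(returncode: int, stdout: str) -> Optional[bool]:
    """Map `sysctl -n sysctl.proc_translated` output to True, False, or None (unknown)."""
    if returncode != 0:
        return None  # Key missing, for example on an Intel Mac.
    value = stdout.strip()
    if value == "1":
        return True
    if value == "0":
        return False
    return None


def running_under_rosetta() -> Optional[bool]:
    """True if this process is translated by Rosetta 2, False if native, None if unknown."""
    if platform.system() != "Darwin":
        return None
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return None
    return interpret_proc_translated(result.returncode, result.stdout)


print(f"machine={platform.machine()} rosetta={running_under_rosetta()}")
```

A Python interpreter installed as an x86 build will report itself as translated even on an M-series Mac, which is often the first clue that a toolchain is not running natively.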

The takeaway

Apple Silicon isn't 'an ARM laptop chip'. It's a deliberately heterogeneous SoC where the OS, the apps, and the silicon all assume each other. The further you push into Apple's own software stack (Final Cut, Logic, Xcode, Core ML), the more the architecture rewards you. The further you stay in cross-platform tooling (Docker, Linux VMs, x86 apps), the less of the benefit you actually see.
