How Apple Silicon actually works.
Unified memory, performance and efficiency cores, the Neural Engine, the media engine. An honest walk through what Apple's chips do differently — and why it matters for the kind of work we do.
Three and a half years into Apple Silicon, a lot of people on our team still describe it as 'fast Intel'. It isn't. The architecture is genuinely different from anything else in the laptop market, and the differences explain why some workloads fly and some still struggle.
Unified memory
On a laptop with a discrete GPU, the CPU has its memory and the GPU has its memory. Sending data between them is a slow copy across a bus. On Apple Silicon, there's one pool of memory and both the CPU and GPU see it. No copying and no marshalling between pools. The GPU can read what the CPU just wrote, in place.
This is why M-series Macs are great at workloads that toggle between CPU and GPU repeatedly — video processing, certain ML pipelines, real-time rendering with computed inputs. It's also why memory capacity matters more than memory speed: you can't add a discrete GPU with its own VRAM later, so the unified pool has to be big enough on day one.
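To put a rough number on the copy cost, here is a back-of-envelope sketch. The function name, the 4K frame workload, and the 16 GB/s bus figure are all illustrative assumptions, not measurements. The model also ignores latency, synchronisation, and any overlap of copy with compute.

```python
def copy_overhead_seconds(payload_bytes: int, round_trips: int,
                          bus_bytes_per_second: float) -> float:
    """Time spent purely on moving data when every round trip copies the
    payload to the GPU and the result back (two copies per round trip)."""
    if bus_bytes_per_second <= 0:
        raise ValueError("bus bandwidth must be positive")
    copies = 2 * round_trips
    return copies * payload_bytes / bus_bytes_per_second


# Hypothetical workload: one 4K RGBA frame (3840 x 2160 pixels, 4 bytes each)
# handed to the GPU and back 60 times per second.
FRAME_BYTES = 3840 * 2160 * 4
ASSUMED_BUS_BPS = 16e9  # illustrative bus bandwidth, not a measured number

discrete = copy_overhead_seconds(FRAME_BYTES, 60, ASSUMED_BUS_BPS)
unified = 0.0  # shared pool: this simple model has no bus copy to pay for

print(f"copy overhead per second of video: {discrete * 1000:.1f} ms "
      f"(separate pools) vs {unified:.1f} ms (unified)")
```

Under these assumptions roughly a quarter of every second goes to moving bytes rather than processing them, which is the overhead a unified pool avoids.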
The blocks of an Apple Silicon SoC
| Block | What it is | When it earns its keep |
|---|---|---|
| Performance cores | Fast, power-hungry CPU cores | Build tools, hot single-threaded paths |
| Efficiency cores | Slower, low-power CPU cores | Background tasks, sleep-state workloads |
| Integrated GPU | Shares unified memory with CPU | Compositing, OpenGL/Metal, Final Cut renders |
| Neural Engine | ML accelerator (16 cores on most chips) | On-device CV, voice, computational photography |
| Media engine | Dedicated video encode/decode silicon | H.264, HEVC, ProRes — at near-zero CPU cost |
| Secure Enclave | Separate processor for keys + biometrics | Face ID, Touch ID, Keychain |
Performance and efficiency cores
Apple's CPUs have two kinds of cores — Performance (fast, power-hungry) and Efficiency (slow, sips power). The OS scheduler decides which workload goes where. Background tasks land on E-cores, foreground stuff lands on P-cores. The user mostly never notices, except that the battery lasts.
For developers this has a quiet implication: 'cores' on an M2 Max are not directly comparable to 'cores' on a Xeon. Single-threaded performance per P-core is among the highest in the industry. But if your workload only spawns one thread, all those extra cores are sitting idle.
The Neural Engine
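A dedicated ML accelerator (16 cores on most M-series chips, 32 on the Ultra variants) that runs trained models at very low power. It's invisible most of the time, but when you use Photos search, FaceTime portrait blur, dictation, or a Core ML model in an app, the Neural Engine is often doing the work. Core ML decides per model whether to run on the CPU, the GPU, or the Neural Engine, so not every model lands there.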
For our work, the Neural Engine is mostly relevant for on-device CV pipelines. A real-time object detection model that would melt a laptop CPU runs at 30fps with the fans off on a Neural Engine.
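The 30fps figure translates into a hard per-frame time budget, and it is worth checking a pipeline against it before blaming the hardware. A minimal sketch, where the stage timings are hypothetical placeholders rather than measurements from any real model:

```python
def frame_budget_ms(fps: float) -> float:
    """Milliseconds available per frame at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def sustains(fps: float, stage_latencies_ms: list) -> bool:
    """True if the summed per-frame stage latencies fit inside the budget."""
    return sum(stage_latencies_ms) <= frame_budget_ms(fps)


# Hypothetical per-frame timings in ms: preprocess, inference, postprocess.
stages = [4.0, 18.0, 3.0]

print(round(frame_budget_ms(30), 1))  # 33.3
print(sustains(30, stages))           # True: 25 ms fits in 33.3 ms
print(sustains(60, stages))           # False: 25 ms exceeds 16.7 ms
```

The point of the arithmetic is that inference is only one stage. Capture, resizing, and drawing results all spend the same budget.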
The media engine
Dedicated silicon for encoding and decoding video. H.264, HEVC, and ProRes (ProRes from the M1 Pro onward) are handled by purpose-built hardware, not the GPU. This is a large part of why Final Cut Pro has run so much faster on Apple Silicon than Premiere (Adobe is now catching up): the media engine takes on work, ProRes in particular, that fell to the CPU and GPU on most Intel Macs.
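Outside Apple's own apps, the usual way to reach hardware encode is through VideoToolbox, for example via ffmpeg's `h264_videotoolbox` and `hevc_videotoolbox` encoders. The sketch below only builds the command line. The helper name, file names, and bitrate are placeholders, and whether a given ffmpeg build includes these encoders is an assumption to check with `ffmpeg -encoders`.

```python
import shlex

# ffmpeg encoder names that go through Apple's VideoToolbox framework.
VIDEOTOOLBOX_ENCODERS = {
    "h264": "h264_videotoolbox",
    "hevc": "hevc_videotoolbox",
}


def hardware_encode_command(src: str, dst: str, codec: str = "hevc",
                            bitrate: str = "8M") -> list:
    """Build (but do not run) an ffmpeg argv that requests hardware encode."""
    try:
        encoder = VIDEOTOOLBOX_ENCODERS[codec]
    except KeyError:
        raise ValueError(f"unsupported codec: {codec!r}") from None
    return ["ffmpeg", "-i", src, "-c:v", encoder, "-b:v", bitrate, dst]


cmd = hardware_encode_command("input.mov", "output.mp4")
print(shlex.join(cmd))
```

Pass the list to `subprocess.run(cmd, check=True)` to execute it. Swapping the encoder for a software one such as `libx265` is a quick way to compare CPU load on the same clip.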
What it means for the work we do
- Memory-bound work (build tools, large compiles) is often dramatically faster.
- Video workflows offload encode and decode to the media engine, leaving the CPU and GPU free for effects and audio.
- On-device CV and ML (installation work, retail analytics) runs cleanly on the Neural Engine and doesn't need a separate GPU.
- x86-only workloads still translate well via Rosetta 2, but you're not getting the full benefit of the architecture.
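A quick way to find out whether a tool is running translated is the `sysctl.proc_translated` key, which macOS sets to 1 for processes under Rosetta 2. A minimal sketch, with a function name of our own. Note that it reports on the `sysctl` child process, which we assume inherits the architecture of the Python process that launched it.

```python
import platform
import subprocess


def running_under_rosetta() -> bool:
    """Best-effort check for Rosetta 2 translation; False when unknown."""
    if platform.system() != "Darwin":
        return False
    try:
        out = subprocess.run(["sysctl", "-n", "sysctl.proc_translated"],
                             capture_output=True, text=True,
                             check=True, timeout=5)
    except (OSError, subprocess.SubprocessError):
        # Key missing (for example on an Intel Mac) or sysctl unavailable.
        return False
    return out.stdout.strip() == "1"


print("machine:", platform.machine(), "translated:", running_under_rosetta())
```

If this prints `x86_64` and `True` on an M-series Mac, the interpreter itself is an Intel build, and everything it launches is likely paying the translation cost too.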
The takeaway
Apple Silicon isn't 'an ARM laptop chip'. It's a deliberately heterogeneous SoC where the OS, the apps, and the silicon all assume each other. The further you push into Apple's own software stack (Final Cut, Logic, Xcode, Core ML), the more the architecture rewards you. The further you stay in cross-platform tooling (Docker, Linux VMs, x86 apps), the less of the benefit you actually see.