Microsoft bets on on-device AI with Copilot+ PCs

Microsoft launched Copilot+ PCs on June 18th, the first Windows machines required to include a neural processing unit (NPU) rated at 40 TOPS or more. These machines need a Qualcomm Snapdragon X Elite, or an equivalent Intel or AMD chip, with a dedicated NPU to run the AI features that set Copilot+ apart from regular Windows 11 machines.

A neural processing unit is designed specifically for the matrix multiplications that dominate neural network workloads. A CPU can perform the same computation, but it uses more power and takes longer. A GPU is faster, but it is built for high-throughput graphics and general parallel compute, and it draws significant power. An NPU, in contrast, is optimized for low-power inference, which makes always-on AI features feasible on a laptop battery.
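To make "matrix multiplication" concrete, here is a minimal sketch of a single dense neural network layer computed as a matrix-vector product, with a count of the multiply-accumulate (MAC) operations involved. The function name dense_layer and the weights and inputs are made up for illustration and do not come from any real model or from Microsoft's software stack.

```python
# Illustrative only: a tiny dense layer computed as a matrix-vector product.
# The weights and inputs are made-up numbers, not taken from any real model.

def dense_layer(weights, inputs):
    """Return (outputs, mac_count) where outputs[i] = sum_j weights[i][j] * inputs[j]."""
    outputs = []
    mac_count = 0  # number of multiply-accumulate operations performed
    for row in weights:
        acc = 0
        for w, x in zip(row, inputs):
            acc += w * x      # one multiply plus one add = one MAC
            mac_count += 1
        outputs.append(acc)
    return outputs, mac_count


weights = [
    [1, 0, -1],
    [2, 1, 0],
]
inputs = [3, 4, 5]

outputs, macs = dense_layer(weights, inputs)
print(outputs)  # [-2, 10]
print(macs)     # 6, i.e. rows * columns
```

The MAC count grows with rows times columns, so a model with billions of weights needs billions of these operations per step. That repetitive, regular arithmetic is the kind of work an NPU is built to do at low power.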

The 40 TOPS requirement means the NPU can perform up to 40 trillion operations per second. For perspective, Microsoft's Phi-3 Mini, a 3.8 billion parameter language model, can run in real time on 40 TOPS hardware. This capability supports Copilot-style assistants, live captions, image generation, and other AI features without needing a cloud connection.
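A rough back-of-envelope calculation shows how the 40 TOPS figure relates to a model of Phi-3 Mini's size. The sketch below assumes about two operations (one multiply, one add) per parameter for each generated token. That is a common rule of thumb, not a figure published by Microsoft, and the function name is hypothetical.

```python
# Back-of-envelope only. The "2 operations per parameter per token" rule of
# thumb is an assumption, not a figure from Microsoft or any chip vendor.

NPU_OPS_PER_SECOND = 40e12   # 40 TOPS, the Copilot+ minimum
MODEL_PARAMETERS = 3.8e9     # Phi-3 Mini parameter count


def compute_bound_tokens_per_second(ops_per_second, parameters, ops_per_param=2):
    """Upper bound on tokens per second if raw compute were the only limit."""
    ops_per_token = parameters * ops_per_param
    return ops_per_second / ops_per_token


ceiling = compute_bound_tokens_per_second(NPU_OPS_PER_SECOND, MODEL_PARAMETERS)
print(f"{ceiling:,.0f} tokens/s")  # 5,263 tokens/s
```

This is a theoretical ceiling, not a prediction. Peak TOPS is rarely sustained, and language model generation is usually limited by memory bandwidth rather than arithmetic, so real throughput will be far lower. The point is only that 40 TOPS leaves ample compute headroom for a model of this size.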

To reach this level of performance, Microsoft worked with chip manufacturers to integrate the NPU into the system-on-chip design. The Qualcomm Snapdragon X Elite, for example, has a dedicated NPU rated at up to 45 TOPS, exceeding the 40 TOPS requirement. That collaboration let Microsoft tune its software stack for the NPU's capabilities.

The Recall feature, a highlight of Copilot+, was pulled from launch after security researchers showed that its local database of screenshots was stored unencrypted and was readable by any process running as the current user. Microsoft delayed Recall to rework its security architecture and plans to release it later this year as an opt-in feature with encryption and biometric protection.

The NPU in Copilot+ PCs is also designed to be power efficient. For instance, running the Phi-3 Mini model on a 40 TOPS NPU consumes around 2-3 watts, compared to 10-15 watts for the same model running on a GPU. This efficiency is crucial for battery-powered devices, because it lets users run AI features without significant battery drain.
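To put those wattages in battery terms, the following sketch converts a constant power draw into the share of a battery used per hour. The wattage ranges are the ones quoted above. The 50 Wh battery capacity and the function name are assumptions chosen for illustration, not specifications of any particular Copilot+ PC.

```python
# Illustrative arithmetic. The wattages are the ranges quoted in the article;
# the 50 Wh battery capacity is an assumed, hypothetical value.

BATTERY_WH = 50.0  # hypothetical laptop battery capacity in watt-hours


def battery_share_per_hour(watts, battery_wh=BATTERY_WH):
    """Percent of the battery consumed by one hour at a constant power draw."""
    return watts / battery_wh * 100.0


for label, low_w, high_w in [("NPU", 2, 3), ("GPU", 10, 15)]:
    low = battery_share_per_hour(low_w)
    high = battery_share_per_hour(high_w)
    print(f"{label}: {low:.0f}% to {high:.0f}% of battery per hour")
```

Under these assumptions the NPU workload costs roughly 4% to 6% of the battery per hour, against 20% to 30% for the GPU. That gap is what separates a feature that can stay on all day from one that cannot.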

The controversy surrounding Recall overshadowed the rest of the Copilot+ capabilities, which are shipping and functional. These include live captions with real-time translation, Cocreator in Paint for AI image generation, and enhanced Windows Search with semantic understanding. The underlying hardware platform appears sound.

Microsoft is making a bet similar to Apple's with Apple Intelligence and Google's with its Tensor chips: that the next generation of computing features will run locally, not in the cloud. Cloud AI is expensive at scale and introduces latency and privacy concerns that on-device models avoid. By getting NPUs into mainstream PCs now, Microsoft ensures the hardware is in place when the software ecosystem matures.

For enterprise developers and IT administrators, Copilot+ PCs introduce a new hardware tier to manage and a new set of Group Policy controls for AI features. Understanding these changes is crucial before the hardware refresh cycle begins.

Copilot+ PCs also raise questions about how their AI features will fit into existing workflows and how they will be managed within enterprise environments. This will likely be an area of focus for Microsoft and its partners in the coming months.