Last week, Apple acquired a little-known Seattle startup, Xnor.ai, that specializes in “edge-based” artificial intelligence — meaning AI that occurs on users’ devices rather than up in big cloud computing clusters.
Xnor.ai’s association with Wyze home security cameras (its AI technology powered Wyze’s relatively recent on-device people-detection feature) led to speculation that Apple acquired the company to improve HomeKit Secure Video. That analysis may be somewhat superficial, though, since Xnor.ai’s expertise runs much, much deeper than simple people detection.
In fact, Xnor.ai made Forbes’ list of America’s 50 Most Promising Artificial Intelligence Companies, and with both its expertise and its edge AI technology so closely aligned with Apple’s privacy goals, an acquisition seems to have been all but inevitable.
Apple has already worked quite hard to put as much of its machine learning technology as it can right on the silicon of its iOS devices. Features like the object and face recognition added to the Photos app in iOS 10 back in 2016 differentiated Apple’s privacy model from rivals like Google and Facebook by pushing photographic analysis down to its A-series chips rather than relying on cloud servers. When Apple introduced a “Neural Engine” in its A11 chip the following year, it seemed even more obvious that this was the way forward for the company.
The Missing Piece of Apple’s AI
However, there’s one area Apple hasn’t been able to bring fully on-device, and that’s Siri. While the company makes a valiant effort to perform as much processing as it can locally (the “Hey Siri” wake-up phrase, for example, is detected entirely on the device, without relying on Apple’s servers even to recognize the owner’s voice), actual Siri commands still require a round trip to Apple’s servers.
Not only does this have potentially serious privacy implications, as we found out last summer, but it’s also remarkably inefficient from a user-experience point of view. Even the simplest commands, like “Stop,” can produce awkward pauses while Siri sends the request up to Apple’s cloud just to figure out what the user said.
So, as Macworld’s Michael Simon points out, the acquisition of Xnor.ai may be what finally brings Siri’s smarts directly onto Apple’s devices, from the iPhone to the HomePod. As Simon explains, part of the magic of Xnor.ai’s people-detection technology was that it worked on a budget camera with privacy in mind. That’s the company’s claim to fame: making machine learning algorithms so efficient that they can run on some of the lowest tiers of hardware.
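Xnor.ai’s name itself hints at the trick behind that efficiency: the company grew out of XNOR-Net research, which showed that if a network’s weights and activations are constrained to +1/-1, expensive multiply-accumulate operations collapse into bitwise XNOR plus a popcount, which even very cheap hardware can do quickly. Here’s a toy Python sketch of that idea (illustrative values only, not Xnor.ai’s actual code):

```python
def binarize(values):
    """Map real values to +1/-1 signs, packed as a bit mask (1 bit per value)."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits

def xnor_dot(a_bits, b_bits, n):
    """Dot product of two +1/-1 vectors stored as n-bit masks."""
    # XNOR marks the positions where the two signs agree.
    agree = ~(a_bits ^ b_bits) & ((1 << n) - 1)
    matches = bin(agree).count("1")  # popcount
    # Each agreement contributes +1, each disagreement -1.
    return matches - (n - matches)

activations = [0.7, -1.2, 0.3, -0.5]
weights = [1.1, -0.4, -0.9, -2.0]

a = binarize(activations)  # signs: + - + -
w = binarize(weights)      # signs: + - - -
print(xnor_dot(a, w, 4))   # signs agree at 3 of 4 positions -> 3 - 1 = 2
```

The real networks quantize whole convolutional layers this way (with scaling factors to recover accuracy), but the core savings are the same: bit operations in place of floating-point math.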
In other words, while Apple has had to beef up its A-series chips and even add a Neural Engine to handle much of its artificial intelligence, Xnor.ai’s technology could allow it to do far more with far less. In fact, combine the leap in power expected from this year’s A14 chip with Xnor.ai’s capabilities, and the sky could very well be the limit.
One of Apple’s challenges with Siri is that the voice assistant has to be ubiquitous, meaning it has to run on everything from the A8 chip in Apple’s HomePod to the A13 chip in the iPhone 11 Pro Max — and arguably on quite a few older devices too. While Apple could technically use edge AI for some of the voice assistant’s features only on newer iPhones that are equipped with Neural Engines, that would likely result in an extremely fragmented and inconsistent experience for users — something that Apple almost certainly wants to avoid.
However, since Xnor.ai could do sophisticated people detection on cameras as modest as Wyze’s, it’s not hard to imagine what its technology could pull off on the comparatively better-equipped A8 chip in the HomePod, not to mention the more powerful chips Apple continues to crank out year after year.
This could allow Apple to build a version of Siri that wouldn’t need to talk to the cloud at all (except, of course, when you actually ask it to look something up) and could even make voice recognition significantly more accurate. After all, as Simon points out, if Xnor.ai could tell the difference between people and pets on a bare-bones security camera, it shouldn’t have a hard time differentiating voices on a powerful A-series chip.
In fact, Simon contends that it’s Apple’s stance on privacy that has held it back from doing more with Siri, especially compared to rivals like Google Assistant and Alexa, which have no such compunctions about collecting and processing user data in the cloud. If Siri’s voice recognition, and even its requests, could be handled entirely on the device, without relaying anything through Apple’s servers, Siri could become a whole lot smarter and more useful as an assistant, monitoring your actions and responding contextually without any risk to your privacy.