Apple continues to push the limits of what its machine learning technology can do. One of the less publicized new features in iOS 14 goes to show just how useful artificial intelligence can actually be in improving accessibility for users with impaired vision.
While it’s easy to see how this could have been overlooked among all of the other major features that Apple has delivered in iOS 14, it turns out that Apple has expanded its VoiceOver accessibility feature to provide narrated descriptions of images.
In other words, your iPhone or iPad can now describe to you what’s in any image that you might otherwise be looking at, and it can do it with a surprising level of accuracy.
In fact, it’s quite amazing when you consider that not only is the entire thing being done by machine learning, but it’s being accomplished entirely on the iPhone or iPad itself, using the power of Apple’s own A-series chips.
Unlike most other photo analysis services, Apple doesn’t analyze your photos in the cloud at all. In fact, it’s been able to handle face and object detection in the Photos app since iOS 10 was released back in 2016. It was doing it entirely on the device’s A10 chips even back then, a year before introducing the Neural Engine component in the A11.
So it’s probably not a big surprise that more advanced chips can do a lot more. A Reddit user (via iPhone in Canada) highlighted the new accessibility feature, saying it “blew their mind” how accurate it was. It’s not yet clear whether a minimum iPhone model is required to use the feature, but the Reddit thread suggests that it might require an A12-equipped iPhone XS or later, which would make sense since the analysis is clearly being done in real time.
We tried it out on some of our own photos and found it similarly impressive, offering up descriptions such as the following:
“A clear container with a variety of fruits and eggs on a brown surface.”
“A plate of meat and vegetables next to a glass of wine.”
“A burger and salad on a white plate next to a glass of water on a wooden surface.”
“A train on a track at night.”
Each description also began with the time, date, and location where the photo was taken. If a known face was detected in the photo, VoiceOver would add “Maybe…” followed by the name of the person.
Other descriptions included the following:
“A child holding a knife and sitting at a table with a cake on it.”
“A group of people playing musical instruments on a stage.”
“A person wearing glasses and posing for a photo in front of a dresser.”
“Chopped vegetables in a pan on a stove.”
“A child sitting on a woman’s lap that might be a hospital.”
“A person holding a baby and posing with another person on a grassy field.”
This feature can also be used with any photo you’re looking at on the iPhone, not just those in Apple’s own Photos app. For example, tapping any photo on Facebook with VoiceOver enabled will offer up the same sort of description, prefixed by “On Facebook.” This means that Apple isn’t relying on pre-analyzed data to provide these descriptions; it’s actually analyzing the photos in real time.
How to Enable It
As long as you have a relatively recent iPhone model running iOS 14, you should be able to try this out for yourself just by enabling the feature and then looking at some photos. Here’s how to do it:
- Open the iPhone Settings app.
- Tap Accessibility.
- Tap VoiceOver.
- Tap VoiceOver Recognition.
- Tap Image Descriptions.
- Tap the switch to enable Image Descriptions.
- Wait for the content to finish downloading.
- Open a photo on your iPhone.
- Bring up Siri and say, “Turn on VoiceOver.”
- Tap on the photo to hear VoiceOver’s description.
- Tell Siri to “Turn off VoiceOver” when you’re done.
While you can enable VoiceOver while you’re in the Accessibility settings, it’s easier to enable it on demand using Siri, since navigating with VoiceOver turned on can be a nuisance. Sadly, asking Siri to “describe a photo” when VoiceOver is disabled doesn’t actually work; Siri doesn’t respond at all and simply closes the photo instead.
Note that in the Image Descriptions settings, you can also specify what VoiceOver should do when encountering “sensitive content,” as well as disable the feature in specific apps where you may not want it enabled.