Microsoft Edge can now auto-generate image labels for Narrator and other screen readers
A picture is worth a thousand words. Microsoft is taking the old adage quite seriously: it has introduced a new feature in Edge, called Automatic Image Descriptions, to assist people with visual impairments.
Before we go into how the technology works, let us first take a look at how images are used by websites. When blogs publish articles, writers attach screenshots to the posts and set an attribute for each image. The attribute is called alt text. It acts as a caption or description that search engines recognize. When a user searches for keywords that fit the image's alt text, the search engine will highlight the appropriate image among the results.
Automatic Image Descriptions in Microsoft Edge
Screen readers such as Narrator in Windows 10 and 11 are commonly used by people with visual impairments. These programs use text-to-speech to help users understand what is displayed on the screen, and to select and execute various options.
Microsoft Edge works with Narrator to read the text content on web pages, and helps users navigate websites, links, etc. When the browser loads a page that contains images, Narrator checks whether each picture has alt text assigned to it, and if it does, the text is read aloud.
According to Microsoft, many websites don't include alt text for images. This means their descriptions are blank, so the screen reader skips them entirely, and the user misses out on useful information that the pictures could contain.
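To make the problem concrete, here is a minimal Python sketch that scans a snippet of HTML and lists the images that give a screen reader nothing to read. It is only an illustration, not how Edge or Narrator work internally. The names MissingAltFinder and images_without_alt and the sample page are made up for this example. The sketch treats an empty alt value the same as a missing one, although web authors sometimes use an empty value on purpose for purely decorative images.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # A missing alt attribute (None) or a blank value leaves the
        # screen reader with nothing to read aloud.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src") or "")


def images_without_alt(html):
    """Return the src values of images that lack alt text (hypothetical helper)."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Sample page invented for this example.
page = """
<img src="chart.png" alt="Bar chart of monthly sales">
<img src="screenshot.png">
<img src="photo.jpg" alt="">
"""

print(images_without_alt(page))  # ['screenshot.png', 'photo.jpg']
```

Only the first image in the sample carries a description, so the other two are the kind that a screen reader would pass over silently.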
This is where the new Automatic Image Descriptions feature in Microsoft Edge comes into play. It combines image recognition with text-to-speech. When Microsoft Edge detects that an image does not have alt text, it sends the image to a machine learning service powered by the Computer Vision API in Azure Cognitive Services.
The service analyzes the content of the image, creates a description in one of the supported languages, and returns it to the browser for Narrator to read aloud. It can also use optical character recognition (OCR) to detect text inside images, with support for 120 languages. Automatic Image Descriptions supports common image formats such as JPEG, GIF, PNG and WebP, to name a few.
There are some exceptions, i.e. images that will not be described or read aloud to the user. These include images that the website marks as decorative, images smaller than 50 x 50 pixels, very large pictures, and photos that may contain gory or adult content.
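The exceptions above can be pictured as a simple filter. The following Python sketch is an assumption-laden illustration, not Microsoft's implementation: ImageInfo, should_describe and every field name are hypothetical, and the upper size limit is invented because the article only says "very large". It also assumes that an image counts as too small when either side is under 50 pixels.

```python
from dataclasses import dataclass
from typing import Optional

MIN_SIDE_PX = 50    # from the article: images smaller than 50 x 50 are skipped
MAX_SIDE_PX = 4000  # hypothetical value; the article only says "very large"


@dataclass
class ImageInfo:
    """Hypothetical summary of what is known about one image on a page."""
    width: int
    height: int
    alt: Optional[str] = None         # existing alt text, if any
    marked_decorative: bool = False   # the site says the image needs no description
    flagged_sensitive: bool = False   # gory or adult content


def should_describe(image):
    """Return True if the image qualifies for an automatic description."""
    if image.alt and image.alt.strip():
        return False  # the site already supplied alt text
    if image.marked_decorative:
        return False
    # Assumption: too small if either side is below the minimum.
    if image.width < MIN_SIDE_PX or image.height < MIN_SIDE_PX:
        return False
    if image.width > MAX_SIDE_PX or image.height > MAX_SIDE_PX:
        return False
    if image.flagged_sensitive:
        return False
    return True


print(should_describe(ImageInfo(width=800, height=600)))  # True
print(should_describe(ImageInfo(width=32, height=32)))    # False
```

In reality the sensitive-content check would depend on the analysis itself, so it could only happen after the image has been examined. The sketch folds it into one function purely to keep the list of rules in a single place.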
How to enable Automatic Image Descriptions in Microsoft Edge?
Enable Windows Narrator by using the hotkey Ctrl + Win + Enter, and the screen reader will read out the image descriptions for you as you browse the internet using Microsoft Edge. You can toggle the feature from the browser's context menu.
It is truly amazing how accessibility features like Live Captions and color blindness filters can help people with disabilities. Automatic Image Descriptions is an excellent addition to the arsenal.