‘Envision Glasses’ – AI-Powered Smart Glasses

Envision Glasses combine Google Glass Enterprise Edition 2 hardware with Envision’s award-winning AI technology. The glasses enable users to read all kinds of information by using voice commands or various swipe and tap gestures on the touchpad located on the right-hand side of the glasses.

The glasses use artificial intelligence to understand the world around you and speak the visual information back to you.

Lightweight, with a camera and direct speaker, Envision Glasses speak out text and environmental information, recognize faces, light, and colors, and let you share that information.

With an unobtrusive, intuitive design, Envision Glasses excel at all kinds of text recognition, even handwriting, in over 60 languages.


Envision’s software uses artificial intelligence to extract information from images
and then speaks that information out loud so the user has a greater understanding of
the environment around them. Blind and low-vision users can ‘see’ to read
documents at home or work, view labels while shopping, easily recognize their
friends, find personal belongings at home, use public transport, video call anyone in
real time, and more.

Three key features of the Envision Glasses stand out as highlights:

Text Recognition:
This is the most important and most used feature of the Envision Glasses. It
enables users to take images of text around them and have it spoken out to them.
Some things that set this feature apart are:
● Different modes for instant reading (for short pieces of text), scan reading
(for longer and dense texts) and batch scan (for reading multiple pages at
once).
● Ability to read text in over 60 languages (including non-Latin scripts like
Arabic, Japanese, etc.) with Automatic Language Detection.
● Ability to export the text to your phone for reading later, editing or sharing to
other apps.

Video Calling:
This is also a highly used feature and an offering that is unique to Envision
Glasses. It enables users to make a video call directly from the glasses to a friend
or family member whenever they need assistance that the current AI tools can’t
provide. The person on the other end of the call sees everything from the
perspective of the user and can offer audio guidance remotely. We are discovering
new and exciting use cases for this every day.

App Platform:
The other big thing that sets the Envision Glasses apart is that they are essentially
built as a platform. This means any third-party Android-based application that offers
services complementary to Envision’s can be easily integrated into the glasses.
We recently integrated the currency recognition app CashReader as an example.
We are in talks with several other apps that provide different kinds of value to blind
and visually impaired users, ranging from recognition to navigation.


These are just the highlights. In addition to them, Envision Glasses have an
impressive list of features that is constantly growing and improving. You can read
about all the features here:

This category contains all the features that enable users to read all kinds of text.
Instant Text
Enables users to read short pieces of text. This feature works with a video feed,
where the camera constantly detects the text that is in the camera frame and
speaks it out. This works both offline (for languages with Latin-based scripts) and
online (for all languages). This feature is ideal for reading room numbers, station
signs, book covers, etc.
Scan Text
Enables users to read dense and complex pieces of text. This feature works by
taking a picture, which is then processed, and the output is provided in paragraphs.
Users are provided with audio cues indicating how much text is visible in the frame,
which helps with orientation.
Once the output is ready, users can scroll through the paragraphs at their own
pace and read only the parts of the text that they are interested in.
They can also export this text to the Envision App on their phone, where they can
read it again or copy it to other apps. This is ideal for reading official
documents, letters, menu cards, etc.
Batch Scan
Enables users to read multiple pages of text at once. This feature works in a similar
fashion to Scan Text but allows users to take multiple images consecutively. The
resulting text can be read with ease by scrolling through it, or exported to
the phone. This is ideal for reading large documents, magazines, books, etc.

Call an Ally
Enables users to make a video call from the glasses to a friend or family member.
Users can add their close friends and family members as Allies through
the app. The friends and family can then receive the video call through a free
companion app called Envision Ally, just as they would receive a FaceTime or
WhatsApp call. This is ideal for situations where the AI functions are not entirely
helpful, like venturing into a new location or shopping at a mall.


This category contains all the general recognition and identification features of Envision.
Describe Scene
Enables users to get a simple description of what is around them. This feature
works by taking a picture with the glasses, which is then processed and a
description is spoken out. The descriptions are not always 100% accurate, but
they are sufficient to get an understanding of what’s going on in front of you. It’s
ideal for exploring new places and environments.
Detect Colors
Enables users to detect the color of what is in front of them. It simply speaks out
the color of whatever is in the center of the camera frame. It’s ideal for picking out
the right clothing or separating your laundry.
Recognise Cash
Enables users to recognise banknotes that they hold up. It works by detecting when
banknotes are in the frame and immediately speaking their value out loud. By
default, the currency of your region is selected. However, users can choose to
recognise over 100 different currencies, which is handy when you’re travelling.
It’s ideal for when you’re out shopping and need to either pay with or accept banknotes.

This category contains all the features that help people scan for and find things
around them.
Find People
Enables users to detect whether people are around them and also recognise their
friends, colleagues and family. The feature works like a scanner, where a video feed
looks for any faces in the camera frame. It gives an audio beep when a face is
detected and speaks out the name when it recognises the face. For Envision Glasses
to recognise faces, the faces need to be taught using the Teach Faces function in
the Envision App. This is ideal for finding friends or colleagues in a public space.

Find Objects
Enables users to scan for and find specific objects in their surroundings. This feature
has a list of pre-trained objects that users can scroll through and select. Once an
object has been selected, the glasses will provide an audio beep as soon as they
detect that object in the camera frame. This can be used to look for objects such as
coffee cups, trash cans, etc.

Enables users to know in real time what objects or people are around them. This
feature is a combination of Find People and Find Objects, but instead of looking
for specific objects, it just speaks out everything that comes into the camera frame.
This is useful for discovering new places or exploring an environment.