How Screen-Reader Users Type on and Control Mobile Devices
By Tanner Kohler
July 25, 2023
“I can do everything you do as a sighted person, but it takes me a little bit longer and it takes me an alternative way of finding it and getting to it. […] The biggest thing we try to help blind people understand is [that] they can do it, but it's a very different way of doing it.”

— Screen-reader user who is completely blind

A major goal of UX designers is to make things easy for users. For the most part, we’ve made a lot of progress. However, we still have a lot to do to make things easy for people who depend on assistive technology. Users who rely on screen readers to use smartphones have learned that almost anything they want to accomplish will have an enormous interaction cost. Unlike many sighted users who quickly give up or move on when things take extra effort, many screen-reader users have accepted that everything inevitably takes extra time and patience.

Learning to use a smartphone is much harder for individuals who are blind or have low vision than for individuals with full vision. There are three major reasons for this extra difficulty:

Technology assumes users can see. Humans rely on sight more than any other sense. Visual information is extremely rich and detailed compared to what people hear, smell, taste, or touch. Thus, it’s no surprise that most technology has been designed around sight. The largest drain on a smartphone battery is the enormous, bright screen — not the speakers or the haptic feedback.

Many visual impairments come later in life, when it’s harder to learn new skills. While countless people have been blind since birth, many screen-reader users lost their sight later in life — some suddenly and some gradually. Brain plasticity decreases as a person ages, making it more and more difficult to learn new skills.

Users with visual impairments must teach themselves. In most cases, screen-reader users must figure out how it all works on their own. They do not have the benefit of constantly observing how others use technology — as many young, sighted users do. One elderly study participant who is gradually losing his sight said, “you get [a smartphone] and you really have to [...] train yourself, and sometimes it takes forever unless you're in a situation where you can sit down and have training.”

The goal of this article is to shed light on how screen-reader users accomplish some of the most basic actions on smartphones: typing and navigating. While perfect solutions to these challenges do not yet exist, the first step in the design-thinking process is to empathize. We hope to help you deepen your empathy with these insights.

Our Research
To better understand the experience of using mobile screen readers, we recently conducted qualitative usability tests and contextual-inquiry sessions with participants who had varying levels of sight — including some who were fully blind. We visited these people in their homes or personal offices and gave them tasks to perform on their own mobile devices. The following are some insights that emerged from those sessions.

Typing on a Touchscreen Is Hard
To stay oriented in the world, users with visual impairments rely heavily on tactile information gathered through their hands. In fact, one of our study participants shared that he collects 3D models of buildings as he travels in order to feel them and get a sense of their structure. Multiple users in our study mentioned that they miss the days when smartphones had physical buttons for this reason.

This reliance on touch makes a computer the preferred tool for many online tasks because physical keyboards are easier to use than smartphones’ small touchscreen keyboards. Touchscreen keyboards are difficult to use because they lack reference points (other than the edges of the screen) to help users locate the keys they need. In contrast, on physical keyboards, users can feel the individual keys as separate buttons, which serve as landmarks that keep them oriented. Users can also leave their hands in one place and use a specific finger for each designated key. This is why any user (sighted or not) can learn to type quickly, without looking, on a physical keyboard.

On a touchscreen, the only way for screen-reader users to know whether they have tapped the correct key is by hearing the letter spoken out loud by the screen reader.

One of our participants demonstrates what it is like to type on a digital keyboard after years of practice. (The participant graciously gave his consent for us to include his name in this recording.)

However, the affordances of mobile devices are still valuable enough that users and designers are constantly working to overcome the challenge of typing. Our study participants relied on four main methods for inputting information that allowed them to avoid the touchscreen keyboard:

Dictation
Phone-based voice assistants like Siri and Google Assistant
A braille display
A digital braille keyboard
Each method is described below, ordered by user preference.

Dictation
Screen-reader users almost always prefer dictating (speaking) information over typing it. This is because dictation has a significantly lower interaction cost than any of the available keyboard options, and it does not require braille literacy.

Android and iOS both provide dictation on their touchscreen keyboards.
Users most often go straight for the dictation button on a touchscreen keyboard, before even attempting to type. (Android, left; iOS, right)
The greatest challenge of dictation is that it still results in many mistakes. Moreover, it is difficult to check the transcript for accuracy and to correct mistakes. Even if a user moves the screen-reader focus back to the drafted text to listen and check for accuracy, it is hard to precisely place the cursor near a mistake and edit the text. Unless there is a lot of text already drafted, users will just delete everything and start over instead of attempting to edit the text.

Additionally, many languages contain homophones (words that sound the same but are spelled differently and have different meanings — for example: right, rite, wright, and write). Simply having the screen reader read back what was typed is not enough for a user to recognize when the wrong word has been transcribed. Unfortunately, homophones can lead to unnoticed typos in texts, emails, or internet searches — even when screen-reader users are double-checking. As a result, screen-reader users may have to deal with inaccurate search results, and others may unfairly judge them as illiterate.

The participant didn't notice he had made a typo.
This participant, who is fully blind, meant to type app into the search field to look for the App Store on his device. He accidentally typed only pp without realizing it and had to read four separate search results to figure out that he must have made a typo.

However, users in our study were willing to put up with the challenges of dictation to avoid the difficult process of typing on a touchscreen keyboard, which often took much longer and resulted in as many mistakes. This was especially true when they were writing longer passages such as an email.

Voice Assistants
Dictation is available only when the screen reader’s focus is positioned in an open-text field and the keyboard appears. In other situations, users often turn to a voice assistant like Siri or Google Assistant to complete basic tasks that involve typing so they can avoid opening and navigating through an app or website. For example, screen-reader users are likely to ask a voice assistant to send a text message to a contact (Hey Siri, send a message to [name of contact]) rather than opening the Messages app, finding the relevant conversation, getting the screen reader to focus in the open-text field, and hitting the dictation button.

Android and iOS provide similar voice assistant features, such as sending a message.
Screen-reader users prefer to use a voice assistant like Google Assistant (left) or Siri (right) to perform typing tasks such as sending a text or email.

When we asked study participants to send us an email during our sessions, several revealed that they had already created a contact with our information after only one email exchange. As one user stated, “Siri comes in real handy when you're doing things with text and email, as long as you have [the person] in your contacts. I guess if you don't, you’ve got a problem. You gotta put it in.” When questioned about having many contacts, he responded, “[they’re] easy to delete.”

Braille Display
A braille display is a physical device that serves as both an input and an output channel for another device. Modern braille displays generally connect to computers, smartphones, or tablets via Bluetooth. They allow users to interact with an interface without a keyboard or mouse, by controlling the focus of their screen reader with physical buttons. Users can type in braille using the display’s eight physical keys, which give them more precise control than the touchscreen keyboard or dictation because of the physical reference points. The braille display also acts as an output channel: as the screen reader’s focus moves across the screen, the display ‘translates’ the words on the screen into the braille alphabet and presents them to the user by raising and lowering a set of mechanical braille pins.

A braille display.
One participant’s Bluetooth braille display

A participant typing on a braille display
Controlling the device and typing on the braille display with the 8 buttons

A participant reading on-screen text using the braille display pins
Reading on the braille display with the mechanical pins

Braille displays give screen-reader users a way to silently interact with devices: users can turn off the audible screen-reader output and read what it is telling them with their fingers on the braille pins. Even though these devices are most commonly used with computers, they can also control mobile devices such as smartphones, as multiple participants in our study demonstrated. Users find them particularly helpful when they are in quiet or professional environments, or when they need to do a lot of typing.

Silent example of one participant in our study navigating his smartphone by swiping and reading the output of the screen reader on the white braille pins.
However, a braille display, which is about the same size as a small Bluetooth computer keyboard, is inconvenient to carry around. This is another reason why screen-reader users rely so heavily on dictation for typing on a mobile device.

Additionally, because braille displays can easily cost more than a smartphone or computer, not all users who might like to use one will own one. Moreover, many blind or low-vision individuals are not familiar with the braille alphabet. Designers should not assume that users will rely on these devices and should create designs that are easy to use without them.

Digital Braille Keyboard
Users who can type in braille can also enable the on-screen braille keyboard on their smartphones. This keyboard consists of numbered dots that mimic the buttons on a physical braille display. Users are most likely to use this input method when they want to type quickly and be more precise than dictation will allow, but still want to avoid the traditional touchscreen keyboard.

Six digital braille keys displayed in two horizontal lines.
Upright mode on Android (TalkBack). The user holds the phone with two hands, with the screen facing away from their body. They curl their fingers so the tips of three fingers from one hand rest on the dots on one side and three fingers from the other hand rest on the dots on the other side.

A user typing on a digital braille keyboard in "upright mode"
Upright mode on iOS
A digital braille keyboard displaying six buttons
Tabletop mode on Android. The user lays the phone in front of them with the screen facing up and, like in the upright mode, places three fingers from one hand on the dots on one side and three fingers from the other hand on the dots on the other side.
A user typing on a digital braille keyboard with the device sitting on a table
Tabletop mode on iOS

One study participant using the tabletop mode and upright mode while typing on a digital braille keyboard

Users can rely on their spatial memory while using a physical or digital braille keyboard because there is a designated finger for each button. The user does not need to be very precise, because the tap will register as long as the finger lands close to the button. The screen reader announces which letters or symbols have been typed as the user presses combinations of the on-screen keys; this is the only way for users to know in real time whether they have spelled something correctly. However, this method of typing takes over the entire screen, requires both hands, and forces the user to completely change how they hold the device, which is a lot of work. This is yet another reason why dictation is the preferred method for most text input.

Commonly Used Gestures for Controlling the Screen Reader on a Mobile Device
It’s easier to use a screen reader with a computer than with a smartphone — as many of the participants in our study acknowledged. While the mouse is mostly useless when using a screen reader, a physical keyboard gives the user far more power than a touchscreen does. Keyboards afford hundreds of custom commands (i.e., accelerators) that enable users to complete direct actions without having to navigate through an interface to find the page element corresponding to that action. In many cases, a keyboard allows screen-reader users to break out of the suffocating sequence of the code and directly access what they want. For example, a user could press the custom keyboard command associated with sending an email rather than having to navigate through the interface to find the Send button.
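
For teams building mobile apps, one way to provide this kind of accelerator is to support hardware-keyboard shortcuts. The sketch below is a minimal, hypothetical iOS example using UIKit’s UIKeyCommand; the Command-Return shortcut and the sendEmail method are placeholder choices for illustration, not something prescribed by our study.

import UIKit

class ComposeViewController: UIViewController {
    // Expose a hardware-keyboard accelerator (Command-Return) so that a
    // screen-reader user with a physical keyboard can send the draft
    // directly, without navigating to the Send button.
    override var keyCommands: [UIKeyCommand]? {
        [UIKeyCommand(title: "Send",
                      action: #selector(sendEmail),
                      input: "\r",
                      modifierFlags: .command)]
    }

    @objc func sendEmail() {
        // Hypothetical send logic for this sketch.
    }
}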

However, on a touchscreen device users must use gestures to control their screen readers. While screen readers make use of many unique gestures, there are far fewer unique combinations than there are ways to combine keys on a physical keyboard for direct commands. Unfortunately, the many touchscreen gestures can be hard to remember — particularly for new screen-reader users. Additionally, screen-reader gestures do not have any specific signifiers and are not related in any way to the actions they stand for.

Various operating systems provide comprehensive documentation of the gestures available to control screen readers on mobile devices (for example, VoiceOver on iOS and TalkBack on Android), and many allow for some limited customizations. Here we present some insight into which actions users found most useful, and, in some cases, the gestures that trigger these actions. Familiarity with this vocabulary of gestures can help you anticipate some of the expectations screen-reader users bring with them when they open your website or app on their phones.

Swiping Left and Right
Swiping is the most basic way in which users explore designs. Once the screen reader’s focus lands on something that has a strong information scent, the user can double-tap to select it. Swiping perfectly embodies the sequential nature of a screen reader. Any design can become more accessible if the designers ensure that what users come across as they swipe is clear and that the sequence in which elements are read makes sense.
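
On iOS, for instance, developers can explicitly control the order in which VoiceOver’s swipe navigation visits elements. The sketch below is illustrative only: the view names are hypothetical, and it assumes a UIKit layout whose visual order differs from the intended reading order.

import UIKit

class ProductCard: UIView {
    // Hypothetical subviews whose visual layout places the price before the title.
    let priceLabel = UILabel()
    let titleLabel = UILabel()
    let buyButton = UIButton(type: .system)

    func configureAccessibility() {
        // VoiceOver normally follows the layout order. Setting
        // accessibilityElements makes swiping read the title first, then the
        // price, then the action, matching the logical sequence of the content.
        accessibilityElements = [titleLabel, priceLabel, buyButton]
    }
}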

Dragging
Dragging a finger across the screen will cause the screen reader to announce everything along the finger’s path. When users have an idea of where something is on the screen, they sometimes drag their finger in that direction.

Dragging breaks the sequential organization in the code and gives users direct access. Moreover, with dragging, the size and the location of the various page elements along the dragging path matter (as predicted by Fitts’s law), with bigger and closer elements being easier to acquire. (In contrast, when swiping sequentially through the code, the size and location of a page element make no difference to a screen-reader user.)
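
For app teams, one implication is that small controls become easier to acquire by touch exploration if the area the screen reader associates with them is enlarged. The following is a minimal UIKit sketch under that assumption; the 22-point padding is an arbitrary, illustrative value.

import UIKit

class SmallIconButton: UIButton {
    // Report a larger accessibility frame (in screen coordinates) so that a
    // finger dragged across the screen lands on this small icon more easily.
    override var accessibilityFrame: CGRect {
        get {
            let expanded = bounds.insetBy(dx: -22, dy: -22)   // illustrative padding
            return UIAccessibility.convertToScreenCoordinates(expanded, in: self)
        }
        set { super.accessibilityFrame = newValue }
    }
}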

We saw all kinds of users drag rather than swipe:

Novice screen-reader users who were overwhelmed by the task of swiping through many page elements
Partially sighted users who had a vague sense of what was displayed on the screen
Expert screen-reader users who were looking for something specific that they knew was there but couldn’t find by swiping

Tapping Directly
When users know where something is located on the screen (because they have accessed it many times in the past), they might tap on that part of the screen to directly access it. (Note that the tapping action will cause the screen reader to read that element rather than to select it. To select it, users would have to double-tap.) However, simply experiencing the page through a screen reader will not teach the user where something is on the screen. Screen-reader users might learn where something is and start tapping it directly in the following cases:

A sighted person has shown them where to tap.
An item (such as a search bar) comes up first in the sequence and they guess it is located near the top left of the screen.
The user has customized the location of an element (such as apps on the phone homescreen).
The user has learned where something is by dragging or just tapping around.
Continuous Progression
Users often have the screen reader continuously read all page elements so that they don’t need to constantly swipe to move through a page. This is most common on content-heavy pages (such as articles), or when the user wants to explore everything on a new page. When users enable this continuous flow, they can sit back and simply listen for a while.

Continuous progression is a slower exploration method than swiping through every page element because swiping enables the user to skip ahead if something seems unrelated to their task. Most screen readers allow users to begin this continuous progression from the current location of the focus or from the beginning (top left) of a page. Users can stop this continuous progression at any time by tapping with two fingers.

Stopping the Screen Reader
Users can stop the screen reader at any time. This is very important if they suddenly need to silence their phone. During our sessions, users frequently silenced the screen reader to focus on the conversation they were having with another person. This is particularly important for the think-aloud method employed in usability-testing sessions because the participant doesn’t have to compete with the screen reader to share insights.

When users cut the screen reader off mid-sentence it is often because they have shifted their attention to something else. In some cases, they will purposefully pick things up right where they left off. But in other cases, they will have forgotten exactly what they were hearing by the time they come back.

Screen-Reader Controls
Because the number of convenient gestures (everything from swiping with one finger to triple-tapping with four fingers) is limited, screen readers allow users to repurpose these common gestures, and to adjust other settings (such as speaking speed, or whether swiping moves between headings or links), on the go by changing the mode the screen reader is in. These modes can be accessed through a menu that is available at any time while the screen reader is running. iOS calls this menu the rotor and displays it when the user makes a twisting motion with two fingers anywhere on the screen. Android calls it the reading controls, activated by swiping down and then right in an “L” shape.
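
Apps can also extend this menu with their own navigation categories. The sketch below is a hypothetical UIKit example: the “Results” rotor name and the resultViews list are placeholders, and the code simply shows how a custom rotor item could let VoiceOver users jump between search results.

import UIKit

class SearchResultsViewController: UIViewController {
    var resultViews: [UIView] = []   // hypothetical list of result rows

    func configureResultsRotor() {
        // A custom rotor category that moves VoiceOver focus to the next or
        // previous result row, skipping everything in between.
        let rotor = UIAccessibilityCustomRotor(name: "Results") { [weak self] predicate in
            guard let self = self, !self.resultViews.isEmpty else { return nil }
            let current = predicate.currentItem.targetElement as? UIView
            let currentIndex = current.flatMap { self.resultViews.firstIndex(of: $0) } ?? -1
            let nextIndex = predicate.searchDirection == .next ? currentIndex + 1 : currentIndex - 1
            guard self.resultViews.indices.contains(nextIndex) else { return nil }
            return UIAccessibilityCustomRotorItemResult(targetElement: self.resultViews[nextIndex],
                                                        targetRange: nil)
        }
        view.accessibilityCustomRotors = [rotor]
    }
}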

Conclusion
Typing is a difficult task for screen-reader users, so they generally prefer to dictate whenever possible. Designers should not assume that screen-reader users will make use of the same touchscreen keyboard that sighted users rely on — that is, in fact, their least favorite input method. Controlling a screen reader on a smartphone is more difficult than on a computer, but users still learn to do it because mobile devices offer so many benefits in their lives.
