Wednesday, March 20, 2019

How to Make Accessible HTML5 Games that Work with Screen Readers, Part One

This is the first of a three-part series.  Links to the other two articles will be added to this page as they become available.



This year, for the 2019 Seven Day Roguelike Game Jam, I decided to do something a little different.  Instead of focusing on new game mechanics, I focused on widening the audience for roguelikes by attempting to make a screen-reader-friendly roguelike.

The result is Battle Weary, an HTML5 roguelike that is playable with a screen reader.  (It has been tested with VoiceOver, appears to work with NVDA, and will hopefully work with others.)

History

When I embarked on this project, I did a lot of searching online for examples of HTML5 games that work with VoiceOver, and for advice on how to build them, and I came up with very little.  I was unable to find a single web-accessible game that seemed to work well with screen readers, and I found almost no practical information on making HTML5 games work with them.

I did find a small amount of advice on making web applications accessible, but it turns out that the vast majority of it operates under the assumption that what is being presented is, basically, an elaborate web form.

To a certain extent, one could consider an online HTML5 game to be a very complex web widget, so I tried writing a series of "hello world" implementations of an interactive item that would respond to keyboard commands and act like a game, using the standard WAI-ARIA recommendations for creating web applications.
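To give a concrete sense of what that looked like, here is a minimal sketch of the kind of "hello world" I was building.  The element names, labels, and messages are purely illustrative:

<!-- A "hello world" game widget built the standard WAI-ARIA way:
     every control is a focusable, labeled element that the screen
     reader can step to, describe, and activate. -->
<div role="application" aria-label="Hello World Game">
  <button id="attack">Attack</button>
  <button id="defend">Defend</button>
  <div id="status" role="status">You face a goblin.</div>
</div>
<script>
  // Each control is its own targetable object; the user tabs to it,
  // hears its role and label, and then activates it.  The role="status"
  // element is an implicit live region that announces its changes.
  document.getElementById("attack").addEventListener("click", function () {
    document.getElementById("status").textContent = "You attack the goblin.";
  });
  document.getElementById("defend").addEventListener("click", function () {
    document.getElementById("status").textContent = "You raise your shield.";
  });
</script>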

The result was, to say the least, disappointing.

In a traditional web application, the screen reader exposes affordances to help the user navigate from item to item, learn each item's role, and activate it.  But the sheer verbosity of the result was very confusing.  Users accustomed to screen readers would probably not find it so, but it still destroyed immersion.  A player would spend far more time navigating the UI than playing the game.

The larger problem, though, was the core model by which web applications are exposed to the screen reader: the screen reader scans the page to understand its structure, builds a parallel model of the UI from that, and then exposes that model to the user.  It does this only once, at page load, and largely treats the content as static and unchanging, offering the user the ability to step through it all linearly, learn about each control, and optionally activate it.

That's great for a web application page, but not so great for a game, where the state is expected to change, in very fundamental ways, with every player input.  There are a few affordances in the WAI-ARIA spec to accommodate changing page content (the "live regions"), but for the most part, I found that changing the game structure dramatically during play – even something as simple as switching between a "main menu" and the core gameplay – sowed confusion and errors, and often put the screen reader in a state that precluded sensible navigation.  I was getting very frustrated, and almost gave up on the effort.  It appeared that the WAI-ARIA spec just wasn't up to the task of exposing deep, complex, interactive games in any way that would be satisfying or understandable, let alone enjoyable.

The Turning Point

It was about this time that a fellow Twitter user pointed me to Quentin's Playroom.  Somehow, in all my searching, I had missed this site.  Quentin's Playroom is a site specifically made for blind players to play multiplayer games like Uno, Poker, Spades, Yahtzee, etc.  And it had been online for over a decade, with millions of gameplays under its belt.  Suddenly, I had something I could look at to learn from!  I immediately registered and played a game of Uno, and it was an eye-opener.

The entire game is essentially played in a single ARIA live region, a DOM element that simply announces changes to its content.  While the player can use screen reader affordances to navigate over to menu items that let them play cards, draw cards, roll dice, and so on, it was clear that the intent was for players not to do that.  Instead, they were expected to internalize a few keyboard commands and use just those to play the game.  The game reacts to each keyboard command by adding commentary to the live region, which in turn is spoken to convey the current game state.
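In other words, the core mechanism boils down to something like this.  This is my own minimal sketch, not the Playroom's actual code:

<!-- A single live region: the screen reader automatically speaks
     whatever text gets added to it, with no navigation required. -->
<div id="announcer" aria-live="polite"></div>
<script>
  // Append commentary to the live region; the screen reader notices
  // the change and reads the new text aloud.
  function announce(text) {
    var line = document.createElement("p");
    line.textContent = text;
    document.getElementById("announcer").appendChild(line);
  }
  announce("It is your turn.  You hold a red 7 and a blue Skip.");
</script>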

Suddenly, the path forward seemed clear.  Instead of trying to dynamically expose all the options of every game state as individual WAI-ARIA items that could be navigated between, I could expose none of the game controls as targetable objects and instead react to keyboard commands directly, announcing the results.  This turns the game into, essentially, a conversation, much like how tabletop roleplaying games work.  The web application serves as the "dungeon master", acting as the guide for the player, and the player responds by making choices assigned to keyboard commands.  The "dungeon master" then describes the results.  The user doesn't have to navigate around to see the results of their activity; the game announces the activity for them.
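As a rough sketch (again my own illustration, with hypothetical key bindings and messages), the whole interaction loop collapses down to something like this:

<!-- No targetable controls at all: just one live region, plus a
     document-level keyboard handler that treats each keypress as the
     player's side of the conversation and answers through the region. -->
<div id="announcer" aria-live="polite"></div>
<script>
  function announce(text) {
    var line = document.createElement("p");
    line.textContent = text;
    document.getElementById("announcer").appendChild(line);
  }
  document.addEventListener("keydown", function (event) {
    switch (event.key.toLowerCase()) {
      case "a":
        announce("You attack the goblin.  It staggers back.");
        break;
      case "d":
        announce("You raise your shield and wait.");
        break;
      default:
        return;  // let unhandled keys behave normally
    }
    event.preventDefault();
  });
  announce("A goblin blocks your path.  Press A to attack or D to defend.");
</script>

There are wrinkles in practice – some screen readers intercept keystrokes unless focus is in the right kind of element, for example – which is exactly the sort of detail Part Two digs into.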

It would require diligence on the part of the programmer.  Every game state would have to make its available key commands clear, and design considerations like consistency and clarity would be imperative, because we'd now be responsible for all discoverability and usability of the page content.  But it would work, and it would be far more pleasant and enjoyable than navigating web forms, making it conducive to gameplay.  One way to keep that discipline is sketched below.
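For instance (purely illustrative – the state names and key bindings are made up), each game state can declare its own command list, so a single help key can always answer the question "what can I do right now?":

// Each state declares its commands, so a help key (say, H) can always
// announce exactly what is available in the current state.
var states = {
  mainMenu: {
    commands: { n: "start a new game", q: "quit" }
  },
  dungeon: {
    commands: { a: "attack", d: "defend", i: "open your inventory" }
  }
};
var currentState = "mainMenu";

function describeCommands() {
  var parts = [];
  var commands = states[currentState].commands;
  for (var key in commands) {
    parts.push("press " + key.toUpperCase() + " to " + commands[key]);
  }
  return "You can " + parts.join(", or ") + ".";
}

// In the main menu, announce(describeCommands()) would speak:
// "You can press N to start a new game, or press Q to quit."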

Before long, I had a reference "hello world" implementation that could be interacted with, and it felt natural, discoverable, and best of all, enjoyable.  The nut had been cracked!  In the following days, I was able to produce an entire roguelike game that was easily and comfortably playable with only a screen reader, and it didn't require an untenable amount of extra work to make it happen.

So this series of articles is going to cover how to actually achieve this, in the hopes of encouraging other people to make their HTML5 games work with screen readers, and also some of the things I learned about designing games that lend themselves well to this approach.

Part Two will get into the nitty-gritty technical details of making this stuff work.

Part Three will talk about the design considerations and other "best practices" I've identified while doing this work.

An Important Caveat

I still consider most of this largely unproven.  It has not had the benefit of being hammered on by thousands of screen reader users playing millions of games.  I'm still a novice at this stuff, so it's entirely possible there are deep, fundamental flaws in the way I'm doing this.  It could be that some screen reader users would prefer a series of non-interactive web pages to an interactive widget that commandeers the page and does its own, nonstandard thing.  And it could be fundamentally incompatible with screen readers it hasn't yet been tested with.  I cannot vouch that this approach will work for everyone, nor can I vouch that it will satisfy legal accessibility requirements (say, for entities that receive federal funding).

I have heard from a regular screen reader user that this approach worked well for them, so hopefully the approach I outline in this series is worthwhile and helpful and does, indeed, meet all these goals.  But even if it turns out to be flawed for some reason, it should at least serve as a jumping-off point that gets us where we need to go, because as it stands, there doesn't seem to be much of a conversation happening about making HTML5 games accessible at all.  At least I can help get that conversation going!
