How to Build an Adventure Voice Game on Alexa: 5 Things Game Developers Should Consider


Voice-first gaming experiences have captured the hearts of customers, letting them enjoy interactive adventures, play along with their favorite game shows, and even play Alexa-enabled board games. These game categories barely scratch the surface of what’s possible with voice, presenting unique opportunities for developers. The gaming experience on Alexa is diverse, giving developers a variety of options to build with Alexa depending on their interests, backgrounds, and existing games.

One game category that I’m especially excited about is adventure games. It’s exciting to think about how game developers can leverage voice to create immersive experiences that transport players to a new world or interactive scenario and fire up their imaginations. And many developers are already embracing the power and simplicity of voice to create a natural interface for adventure games.

Today I share five considerations for any developer looking to start building adventure games on Alexa, including voice game mechanics, Alexa’s role in the gameplay, and how to plan for future voice game expansions.

1. What story will your voice game tell?

An adventure, whether pre-planned or not, is always a story. For an adventure game to be captivating, the story needs to be as well. Ultimately, the goal is to keep the player focused on the game itself. Some games achieve this through flashy graphics, devilishly addictive mechanics, or clever progressions. An Alexa game skill can definitely have some of those components, but the real focus for an engaging adventure game should be the story.

There is no magic formula for a solid story, although there are established best practices when it comes to story and character development. Apart from absorbing as much theoretical knowledge on good structure as possible, I recommend you take inspiration from your favorite games, books, and movies. See what works, see what doesn’t, and don’t be afraid to experiment.

However, the story you tell may be influenced by what role Alexa plays in the game itself, and the game mechanics that you will choose. Let’s look at these in order.

2. What is Alexa’s role in the voice game?

What role will Alexa play in your game? A helpful companion? A competing player? The omniscient Narrator? This is a fundamental consideration as it influences what storyline will take shape, and how the user will interact with Alexa and the game itself. Below are some ideas on how to consider Alexa as part of your narrative.


For example, it is very common in choose-your-own-adventure games such as Star Commander for Alexa to play the role of a ship assistant, thereby contextualizing and justifying the assistant itself! From a gaming perspective, this introduces a co-operative style of play, and most of the story will need to be developed around this mechanic. Of course, you are free to stray from this and have the assistant be a key part of the narrative (e.g. the ship assistant turns out to be the enemy, à la GLaDOS). This style is great for immersion, but it will present challenges further on in the story, as it will take some creative writing to get across rules, progression, and help without breaking character or the story.


In other games, such as Universal Studio’s Jurassic World, the entire interactive story is pre-recorded by voice actors, so Alexa does not need to assume any position in the game. If you have the time, know-how, or resources to work with professional voice actors, it adds another level of immersion to the game. Other examples include Runescape Quests – One Piercing Note and The Unfortunates.

You can achieve the same effect by using an Amazon Polly voice to give your characters some variety from the native Alexa voice. It’s as easy as using an SSML tag such as <voice name="Brian">Hello, my name is Brian</voice> where the name attribute is the name of the Polly voice you wish to use. You can find a list of SSML-compatible Polly voices here. Keep in mind you can also give your characters accents by using a non-English speaking Polly voice and making use of the lang tag. For example, if we wanted to have a character in our skill speak English with an Italian accent, we would do something like this: <voice name="Giorgio"><lang xml:lang="en-US">Hello, my name is Giorgio.</lang></voice>. This gives your game a much wider palette of voices!
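
If your skill has many characters, it can help to generate these tags from code rather than typing them by hand. Below is a minimal Python sketch, assuming you assemble the SSML string yourself before handing it to your response; the helper names speak_as and build_ssml are my own and are not part of any Alexa SDK.

```python
from xml.sax.saxutils import escape


def speak_as(text, voice, lang=None):
    """Wrap text in a <voice> tag, optionally nested around a <lang> tag.

    `voice` is the name of a Polly voice; `lang` is an optional locale
    used to make that voice speak another language with its own accent.
    """
    body = escape(text)  # escape &, < and > so the text cannot break the SSML
    if lang is not None:
        body = f'<lang xml:lang="{lang}">{body}</lang>'
    return f'<voice name="{voice}">{body}</voice>'


def build_ssml(*parts):
    """Join plain or tagged fragments into a single <speak> document."""
    return "<speak>" + " ".join(parts) + "</speak>"


ssml = build_ssml(
    "You step onto the bridge.",  # spoken in the default Alexa voice
    speak_as("Hello, my name is Brian.", "Brian"),
    speak_as("Hello, my name is Giorgio.", "Giorgio", lang="en-US"),
)
print(ssml)
```

Keeping the voice assignment in one helper also means you can recast a character later by changing a single argument.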


In other games like Fangtastico’s Dungeon Adventure, Alexa plays the role of the narrator, describing to the player what is happening around them and how the world works, and acting as an impartial mediator in the story. This setup offers more flexibility, because an omniscient narrator has total freedom to explain things. For immersive adventure games where the user input is minimal, you may want to consider using Alexa (or another voice) as a narrator to depict a richer scene than what could be depicted through character dialogue alone.

3. How will players interact with your voice game?

Another interesting point to consider from the get-go is the type of interaction pattern you want. Will the gameplay be based around a fixed list of instructions? For example: “go north/south/east/west,” “move up/down/left/right,” or even just “yes/no” like in Yes Sire. Alternatively, will it allow for more free-form, contextualized input? For example: “You are now in a dungeon, you see a treasure chest, a door, and a strange-looking lever. What do you want to do?” → “take a look at the lever.”

The advantage of fixed-form interactions is predictability, reliability, and in some cases, lower cognitive load for the player. Yes Sire, mentioned above, takes this to the extreme and only allows yes/no answers as you play the role of a king making decisions for your kingdom as new events come up. If you implement a fixed response structure yourself, the burden of “entertainment” falls even more on the story, and you’ll want to keep repetitive interactions to a minimum so players don’t get bored (e.g. having to say “go north” seventeen times in a row).
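
To make the fixed-form pattern concrete, here is a small Python sketch of a “go north/south/east/west” mechanic. The map, room names, and the move function are all hypothetical; in a real skill the direction would arrive as a slot value on an intent.

```python
# Hypothetical map: each room lists the exits a player can take.
WORLD = {
    "cell": {"north": "corridor"},
    "corridor": {"south": "cell", "east": "armory"},
    "armory": {"west": "corridor"},
}


def move(room, direction):
    """Resolve a fixed-form 'go <direction>' command.

    Returns the room the player ends up in and the line to speak.
    """
    exits = WORLD[room]
    if direction in exits:
        new_room = exits[direction]
        return new_room, f"You walk {direction} and enter the {new_room}."
    # Stay put, and remind the player of the valid commands.
    options = " or ".join(sorted(exits))
    return room, f"You can't go {direction} from here. You can go {options}."


room, speech = move("cell", "north")
print(speech)
```

Because the set of valid commands is small and known, the skill can always tell the player exactly what they are allowed to say next, which is where much of the predictability comes from.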

The main advantage of free-form interactions is the freedom of exploration and the personalized experience, but it will require some more complex handling on the back end as players are more likely to say things you don’t anticipate. Here are a few recently launched features, which we will cover in future gaming posts, that can help your skill handle more complex interactions:

  • Intent chaining: Intent chaining allows your skill code to start dialog management from any intent, including the LaunchRequest. This means you are now much more free to affect how your skill progresses regardless of what incoming utterance was said or what intent was triggered.
  • Auto-delegation: With auto-delegation, Alexa will fill in any missing required slot automatically and only then send your back end an IntentRequest. This abstracts away all the hard work of getting missing slots filled in.
  • Slot validation: Alexa can now automatically re-prompt your players who provide unacceptable slot values in delegated dialogs. You can catch bad slot values at the interaction model level, simplifying back end validation patterns.
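
As a rough illustration of intent chaining, the sketch below builds the raw JSON response a back end could return to hand the conversation over to another intent with a Dialog.Delegate directive. It assumes the target intent has a dialog model defined in your interaction model; the intent name ExploreRoomIntent, the slot name target, and the helper functions are hypothetical. If you use an ASK SDK, its response builder would normally produce this structure for you.

```python
import json


def slot(name, value=None):
    """Describe one slot of the target intent, optionally pre-filled."""
    entry = {"name": name, "confirmationStatus": "NONE"}
    if value is not None:
        entry["value"] = value
    return entry


def chain_to_intent(intent_name, slots=None):
    """Build a response that delegates the next turn to `intent_name`."""
    return {
        "version": "1.0",
        "response": {
            "directives": [
                {
                    "type": "Dialog.Delegate",
                    "updatedIntent": {
                        "name": intent_name,
                        "confirmationStatus": "NONE",
                        "slots": {s["name"]: s for s in (slots or [])},
                    },
                }
            ],
            # The session must stay open for the dialog to continue.
            "shouldEndSession": False,
        },
    }


# Example: after launch, jump straight into exploring with one slot known.
response = chain_to_intent("ExploreRoomIntent", [slot("target", "lever")])
print(json.dumps(response, indent=2))
```

With auto-delegation enabled on the target intent, Alexa would then prompt for any remaining required slots before your back end sees the completed request.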

4. What is your plan for content expansion?

A good game entertains a player. A great game keeps a player coming back for more. As with any other game, you must plan ahead and factor the addition of fresh content into your roadmap. While it’s tempting to say something like, “let’s launch with one level, and we’ll add more later,” it’s much easier to factor expansion in from the beginning by asking yourself tough questions such as:

  • How will adding more levels further on affect my back-end resources and how I manage them? The answer should be: make your code ready now and treat any starting content as an expansion, or module, right from the start.
  • How easy is it to make tweaks to the leveling, scores, and general game parameters? The answer should be: quite easy, you’ve extracted all of your game metadata outside of your skill handling logic, making it easier to tweak without touching live code.
  • What’s my expansion release timeline? The answer should be: you commit to adding fresh content every specified number of days/weeks/months. Make it clear in your skill description and make it clear in your game experience so as to manage user expectations.
  • What is my plan for internationalization? The answer should be: you are following best practices right from the start, making it a lot easier to add content or additional languages without necessarily changing your actual logic. In some games, language and game content are intrinsically intertwined, so plan ahead for how you will manage them.
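
One way to act on these answers is to keep game content, tunable parameters, and localized strings in data rather than in handler code. The Python sketch below is only an illustration: the JSON layout, module IDs, and function names are assumptions of mine, and the data is inlined so the example runs on its own (in practice it might live in a file or a database).

```python
import json

# Game metadata kept apart from the skill's handling logic.
GAME_DATA = json.loads("""
{
  "settings": {"starting_health": 10, "points_per_level": 100},
  "modules": [
    {"id": "base", "levels": ["cell", "corridor"]},
    {"id": "expansion_1", "levels": ["armory"]}
  ],
  "strings": {
    "en-US": {"welcome": "Welcome back, adventurer."},
    "it-IT": {"welcome": "Bentornato, avventuriero."}
  }
}
""")


def available_levels(data, enabled_modules):
    """List levels from every enabled module, in the order modules are defined."""
    return [
        level
        for module in data["modules"]
        if module["id"] in enabled_modules
        for level in module["levels"]
    ]


def localized(data, locale, key, fallback="en-US"):
    """Look up a string for a locale, falling back to a default locale."""
    strings = data["strings"]
    table = strings.get(locale, strings[fallback])
    return table.get(key, strings[fallback][key])


print(available_levels(GAME_DATA, {"base"}))
print(localized(GAME_DATA, "it-IT", "welcome"))
```

Here the launch content is just the first module, so shipping an expansion or a new language becomes a data change rather than a code change.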

5. How will you expand your voice game with premium content?

When developers and content creators build delightful skills with compelling content, customers win. With in-skill purchasing (ISP), you can sell premium content to enrich your Alexa skill experience. Let’s face it, building a good game is hard work. A lot of time goes into making the experience smooth, the story engaging, and the content fresh. While it’s fine to keep the game completely free if this is a hobby side project, you should also consider the possibility of adding premium content to your skill, especially if your dream is to focus on building games full time!

ISP offers three levers to offer premium content. The first one is one-time purchases. These are meant to unlock access to content in your game in perpetuity. For example, you can offer one-time purchases as extra levels, maps, or characters.

The second type of in-skill product is a subscription that you can use to offer players access to premium content (e.g. level packs, special maps, upgrades) for the duration of the customer’s subscription. Note, these two mechanisms can be offered in the same skill. For example, a one-time-purchase can offer access to a specific level pack, whereas a subscription could offer access to every level, as long as the subscription is active.

The third lever is consumables. Consumable in-skill products can be purchased, depleted, and purchased again. For example, if the game has upgrades that expire (e.g. a health boost, a shield upgrade, a teleportation cartridge), they can be used up and purchased again in the future.
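
To show how the three product types can sit side by side in game logic, here is a minimal Python sketch. It assumes your skill has already fetched the customer’s in-skill products as a list of records with referenceName, type, and entitled fields, and that you track consumable inventory in your own persistent storage. The field names reflect my understanding of the product list and should be checked against the ISP documentation; the product names and helper functions are hypothetical.

```python
def is_entitled(products, reference_name):
    """True if the customer currently owns the named product."""
    return any(
        p["referenceName"] == reference_name and p["entitled"] == "ENTITLED"
        for p in products
    )


def can_play_level_pack(products, pack_ref, subscription_ref):
    """A level pack is unlocked by its one-time purchase or by an active subscription."""
    return is_entitled(products, pack_ref) or is_entitled(products, subscription_ref)


def use_consumable(inventory, item):
    """Spend one unit of a consumable from the skill's own inventory record."""
    if inventory.get(item, 0) <= 0:
        return False  # nothing left: a good moment to offer a repurchase
    inventory[item] -= 1
    return True


# Hypothetical product records and inventory for one player.
products = [
    {"referenceName": "cave_pack", "type": "ENTITLEMENT", "entitled": "NOT_ENTITLED"},
    {"referenceName": "all_access", "type": "SUBSCRIPTION", "entitled": "ENTITLED"},
]
inventory = {"health_boost": 1}

print(can_play_level_pack(products, "cave_pack", "all_access"))
print(use_consumable(inventory, "health_boost"))
```

Separating the entitlement check from the consumable counter keeps it clear which state comes from the purchase records and which state your own game has to remember.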

To make it even easier to add ISP, we recently announced the ability to create in-skill products from the Alexa Developer Console with updated documentation that guides you through the process. Check out our sample skill that has the whole flow implemented. Alternatively, if you prefer to write your own code, you can add and maintain ISPs using the Alexa Skills Kit Command-Line Interface (CLI). See here for implementation of the sample skill using the CLI. We recently published a blog post that dives deep into this sample skill, and walks you through how to support the different outcomes of a customer engaging with your premium content. My recommendation is to add at least one in-skill product when building your game, even if it’s a simple placeholder. That way your back end is already structured for handling ISP once you decide to roll it out.


These five considerations are meant to help you think about how to structure and plan ahead when embarking on the journey of building an adventure game. In future posts, we’ll dive into the more technical aspects of building a voice game. Stay tuned! In the meantime, if you want to continue the conversation, feel free to reach out on Twitter @muttonia.


