Tarot Time: turning my Google Assistant into a tarot reader in under a day

Jane Friedhoff
15 min read · Oct 26, 2018


Tips and tricks I learned creating my first Action for the Google Assistant while playing around with Actions on Google, Dialogflow, and tarot cards.

For the last few years, many of my friends and I have become increasingly obsessed with tarot cards. You don’t have to believe in magic to still enjoy the process of drawing a card, pondering its meaning, and observing its themes throughout the day (even if that’s just because you’ve primed yourself to see them).

The problem with daily tarot draws is that I am also perhaps the most absent-minded person in the world. Left to my own devices, the likelihood that I will remember to have this nice tarot card ritual before I leave for work is approximately zero. But what if I could create a little familiar — okay, maybe just an Action — to do it for me?

I had already been interested in learning how to make voice experiments, thanks to the inspiring work of some super-talented colleagues. Seeing what they could do with Actions on Google and Dialogflow got me interested in finding an excuse to learn the tools myself. They seemed like prime tarot bot-making material, so I thought: instead of making a “Hello, world!” app, why not make a “Hello, The World!” Action instead?

This is how I ended up making Tarot Time: an Action for the Google Assistant that lets you pull a tarot card to guide your day, or look up more details regarding cards you’re curious about. Tarot Time shuffles your cards, shows you the imagery of your chosen card, and guides you through its light and shadow attributes. It works on devices with screens (like phones and Smart Displays) and those that are speaker-only. On phones, you can also ask it to send you a tarot card every morning automatically via a notification in case you forget to draw your card on your own — like me!

I strongly suggest this ambiance for your Tarot Time experience, although technically anywhere will do.

All these different features touched on a ton of different moving parts, helping me learn a lot about how the pipeline worked. In this blog post, I’ll go over the process in general, as well as some of the hitches I ran into and learned from.

The basics

There are a few moving parts to Tarot Time’s system:

The Google Assistant: the personal virtual assistant that you talk to. You might’ve used the Assistant on your smartphone or smart speaker, like the Google Home.

Actions on Google: a platform for developers to extend the default functionality of the Google Assistant. There’s a big Actions directory of Actions you can invoke if you want to, say, make music, buy movie tickets, or hear a piece of positive news.

Dialogflow: Google’s natural language understanding developer tool for building conversational experiences for the Google Assistant. It uses machine learning to understand the intent and context of what a user says in order to respond in the most useful way.

Firebase: a handy one-stop shop that can support your Action in a couple ways:

  • Once Dialogflow parses the user’s question, we need to respond to the user with some answer. Most of our responses will require us to execute some code to put that answer together, which means we need somewhere to run that code. One option is to deploy that code to Firebase, pipe our Dialogflow data into Firebase, and use Firebase Functions to run the code and return a response (see the sketch just after this list). We can then debug using Firebase Functions’ logs.
  • Maybe we also want to host rich media (like images of tarot cards). You can put all those static assets in your project’s Firebase Hosting and immediately start referencing them in your responses.
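
For the curious, here’s a minimal sketch of what that Firebase Functions entry point looks like, assuming the actions-on-google v2 client library (the intent handlers themselves come later in this post):

const functions = require('firebase-functions');
const { dialogflow } = require('actions-on-google');

// Create the Dialogflow app; intent handlers get registered on it.
const app = dialogflow({ debug: true });

app.intent('Default Welcome Intent', (conv) => {
  conv.ask('Welcome to Tarot Time! Want to draw a card?');
});

// Expose the app as an HTTPS Cloud Function for Dialogflow to call.
exports.dialogflowFirebaseFulfillment = functions.https.onRequest(app);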

It looks like a lot of parts at first, but once you understand how they all work together, their modularity makes a lot of sense.

There are several official codelabs that can walk you through the nitty-gritty of making an Action (level 1 and level 2), so let’s focus on Tarot Time’s development in particular and common mistakes to watch out for.

Defining what it should do

There were a few things I knew I wanted my bot to do:

  • It should be able to draw a random card when asked.
  • It should be able to draw a random card automatically, every day, and send me a push notification about it.
  • It should be able to return information on a specific card you ask about.
  • The card descriptions should also include images of the cards (where applicable).
  • It should be able to understand different names for the same cards (e.g. “The Hierophant” and “The Pope” are the same) and return the correct information.

First, I needed to gather my tarot data. I was super lucky that Mark McElroy had released a bunch of public domain tarot interpretations, and Allison Parrish had already organized them into JSON. For visuals, I found a beautiful reproduction of the 1760 Tarot de Marseille deck called the CBD Tarot by Dr. Yoav Ben-Dov, who generously released his deck under a Creative Commons license.
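
For context, the JSON is shaped roughly like this (abridged, with illustrative values; the real file has keywords, fortune-telling phrases, and light and shadow meanings for every card):

{
  "tarot_interpretations": [
    {
      "name": "The High Priestess",
      "rank": 2,
      "suit": "major",
      "keywords": ["intuition", "mystery", "wisdom"],
      "meanings": {
        "light": ["Trusting your intuition", "..."],
        "shadow": ["Being aloof", "..."]
      }
    }
  ]
}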

After running through the level 2 codelab, which taught me how to make an Action, set up the Dialogflow agent, and get everything ready for local development, I was set to start building.

Setting intents and fulfillment

Intents

The first thing to do was to figure out exactly what my users were going to ask my Action to do. These mappings of user phrases to software functionality are handled by Dialogflow, and they’re called intents. In my case, I knew I had five intents:

  • A default welcome intent, for the agent to explain what the Action was and what it could do
  • An intent to draw a random card for the user
  • An intent to tell the user information about a specific card
  • An intent that would return credits for the Action
  • A default fallback intent, in case the agent didn’t understand what the user said and needed them to repeat it

Some of these intents would always have the same reply (like the welcome and credit intents). Others would require more code (like searching cards). Further, all of these intents have different ways they can be phrased. One user might say “Pull a tarot card for me, please,” while another, more curt user might just say “Gimme a tarot card” or even just “random card.” All of these phrases have the same intent — to have our code return a random card — but they’re phrased in very different ways. How do we let our Dialogflow agent know that all these different phrases are asking for the same thing?

Well, we do what we would do with any other machine learning system: we provide a bunch of examples to train it on. In Dialogflow, that means adding a broad set of training phrases that cover various ways that a user might think to invoke this functionality. You can see the phrases for our random card intent below.

Note that these phrases have a range of tones, lengths, and terms used.

All of these phrases, and any phrases sufficiently similar to them, will invoke this random card drawing functionality.

Setting up the pipeline

So we’re triggering our random card intent — now, how do we actually get that random card and make our reply?

This is where fulfillment comes in. Dialogflow can reply to intents in simple ways on its own, or — for intents that require some processing — it can kick them over to another service to fulfill the user request. For this, it supplies both an inline editor and webhook support. I prefer to write code in my own editor, so I decided to skip the inline editor and make a project on my desktop.

I followed the instructions on this page to set up my little divination codebase. After deploying the codebase, I had to tell Dialogflow to actually send data to Firebase. I found the Firebase deployment URL, went to the Dialogflow console, hit Fulfillment, enabled Webhook, and saved. This allowed Firebase to communicate with Dialogflow.
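
For reference, deploying from the project directory is a single Firebase CLI command, which prints the function’s URL on success:

firebase deploy --only functions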

Below are some screenshots of this process from the level 2 codelab, which has more detail:

This is where you find the appropriate URL…
…and this is where you paste that URL.

But you’re not done yet! Each intent that uses a webhook has to have webhooks explicitly enabled. If you don’t enable webhook calls, you’ll probably get the following error when you try to trigger your intent:

MalformedResponse
'final_response' must be set.

This MalformedResponse error can be triggered by several things — you can learn the nitty gritty in this excellent post by the official Google Assistant team. In my case, it was almost always forgetting to toggle webhooks for new intents. Several times, I added new intents and cursed at my computer when the MalformedResponse error message came back, only to realize I didn’t actually allow webhook fulfillment for that intent. Whoops! It’s the little toggle down at the bottom:

Fulfillment > Enable webhook call for this intent.

With this enabled, Dialogflow can communicate individual intents to Firebase. In the code, that looks like the following:

app.intent('random-card', (conv) => {
  // draw a card and respond to the user with the details
})

If you wanted to split up the “drawing card” and “card detail” phases (i.e. drawing the card, then allowing the user to choose to get more detail or exit the conversation), you could do so with follow-up intents. In my case, I rarely remember any details on cards, so I don’t need to make follow-up intents — I know I’ll always want more detail. So, I keep everything in our one random-card intent.
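
If you did want that split, a handler for the follow-up intent might look something like this sketch (the intent name and the conv.data stashing are my illustrative assumptions, not actual Tarot Time code):

app.intent('random-card - more detail', (conv) => {
  // conv.data persists across turns of a single conversation, so the
  // random-card handler could stash the drawn card there for us to read back.
  const card = conv.data.currentCard;
  conv.close(`Here's more about ${card.name}: ${card.description}`);
});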

You’ll notice that when the random-card intent is triggered, we receive a conv object from the client library, with which we build our response.

Crafting the response

Okay, so we have our general Fulfillment pipeline rigged up. Next, we’ll design what this thing actually returns.

Simple responses

First, let’s go over what the Assistant actually speaks out loud. In our app.intent function above, we get a conversation object. The most important functions to be aware of are conv.ask and conv.close. conv.ask is for any statement that should be said and leaves the mic open for the user to reply, while conv.close simply says the response and closes the mic.

A quick note about user prompts: If you activate the mic, you must make it clear to the user that you’re waiting for a response from them. Not doing so is a violation of the Actions on Google policy. Use conv.ask mindfully and respectfully.

Here are two concrete examples. Let’s say that our Action is designed for tarot experts who know all the card meanings already. Because they know the card meanings by heart, we’ll just pick a card and tell them the name, then close the conversation. In this case, we’re not waiting for a user response, so we would just use conv.close.

conv.close(`${remark} Your tarot card for today is ${name}.`);

But let’s say we want to broaden our user base to both tarot experts and folks who might not know all the card details already. We’d want to give the user the option to ask for more detail about their card, so we’d use conv.ask to activate the mic and make it clear that we’re waiting on a response.

conv.ask(`${remark} Your tarot card for today is ${name}.`);
conv.ask('Would you like to learn more?');

For the times that we just want Tarot Time to speak aloud and/or make chat bubbles, we can just call the appropriate function (either conv.ask or conv.close, depending on whether we want a response) and pass it a string, as we did in our two examples above.

But sometimes, we’ll want to add more complex stuff, like tarot card images with alt text. How do we do that?

Rich responses

At that point, we’re entering rich response territory. Rich responses let you add suggestion chips, cards, link-outs, carousels, and other rich media to your chat bubbles and TTS. Much like how our simple response is an ordered stack of chat bubbles, our rich response is an ordered stack of chat bubbles and rich media objects. In our case, where we want to display a tarot image as well as a longer description, the BasicCard object is the perfect choice.

conv.close(new BasicCard({
  title: name,
  image: new Image({
    url: PARAMS.cardFolder + cardMappings[name.toLowerCase()],
    alt: 'An image of ' + name,
  }),
  text: description // the longer card description, displayed but not spoken aloud
}));
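
One note: BasicCard and Image are classes from the actions-on-google client library, so they need to be pulled in alongside dialogflow at the top of the file:

const { dialogflow, BasicCard, Image } = require('actions-on-google');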

But wait! Our response can’t begin with this BasicCard, because the first item in a Rich Response must be a simple response. What that means for us is that we need some simple response — i.e., a chat bubble — before our BasicCard. If we skip that and kick off our response with a BasicCard, you’ll get an error like the following:

MalformedResponse
final_response.rich_response: the first element must be a 'simple_response', a 'structured_response' or a 'custom_response'.

There are several ways to fulfill this requirement. In Tarot Time’s case, I just added a simple string-based conv.close — i.e., a chat bubble — before the card that would state the card name. The BasicCard could then follow.

You can see all this culminate in the (simplified) code snippet below:

app.intent('random-card', (conv) => {
  const deck = tarotData.meanings["tarot_interpretations"].slice();
  shuffle(deck);
  currentCardData = Object.assign({}, deck[0]);
  const remark = remarks[Math.floor(Math.random() * remarks.length)];
  const name = getCardName(currentCardData['name']);
  conv.close(`${remark} Your tarot card is ${name}.`);
  const description = createDescription(currentCardData);
  conv.close(new BasicCard({
    title: name,
    image: new Image({
      url: PARAMS.cardFolder + tarotData.cardMappings[name.toLowerCase()],
      alt: 'An image of ' + name,
    }),
    text: description
  }));
})
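
The shuffle helper isn’t shown in the simplified snippet; any in-place shuffle will do. A standard Fisher–Yates implementation, as one assumption of what it might be:

function shuffle(array) {
  // Walk backwards, swapping each element with a randomly chosen earlier one.
  for (let i = array.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [array[i], array[j]] = [array[j], array[i]];
  }
  return array;
}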

And you can see how it looks in this gif:

Now we’re talking! Pun intended.

Entities

So far, we’ve set up our pipeline for our card draw. Next, I wanted to implement searching for specific cards. This is trickier than just asking for a random card: someone might say “Tell me about The Empress,” and Dialogflow has to know that “The Empress” is a kind of tarot card. But of course Dialogflow doesn’t inherently know what “The Tower” or “Death” or “The Five of Pentacles” are. Further, it certainly doesn’t know that some things are synonymous, like that sometimes folks call the ‘suit of wands’ the ‘suit of rods’ instead. I had to tell it all that first.

Dialogflow makes this easy with its entity system. At its heart, an entity is just a category of things that Dialogflow can recognize and extract as parameters from user queries. Consider the question “What is the weather in New York City?”: we have some intent (asking for some place’s weather report) and some entity (that place being a geographical location called New York City).

Dialogflow has a wide range of built-in system entities, like colors and geography, that it comes pre-trained to understand. In our weather example, Dialogflow will immediately understand that New York City is a city, and will tag the phrase “New York City” with the @sys.geo-city entity. That parameter will then be cleanly passed to the appropriate code. Dialogflow also has a system for developers to define their own entities for domain-specific words and phrases. If that’s not enough, you can even combine, say, a system entity and a developer entity into one composite entity.

We don’t need anything nearly that complex for Tarot Time, though — all we need is to make Tarot Time understand that a phrase like “The Wheel of Fortune” is actually a type of tarot card (and that “The Wheel” is a synonym for it).

I put together a CSV file of different cards and any synonyms they had and uploaded it to Dialogflow, leading to the following table.

Note that for a reference value to be matched, it needs to be included as a synonym for itself.
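
Dialogflow’s entity CSV format puts the reference value first, followed by its synonyms (which, per the note above, include the reference value itself). A few illustrative rows:

"The High Priestess","The High Priestess","La Papesse"
"The Hierophant","The Hierophant","The Pope"
"The Wheel","The Wheel","The Wheel of Fortune"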

As with the random card intent, we need to put together a broad list of training phrases for this intent. In the training phrases below, I try to cover my bases by including card names in a variety of positions in the sentence. The way the user mentions the entity has an impact on whether that entity is recognized, so we want to demonstrate a variety of ways the user might refer to it. I then manually tagged the entities where needed.

In my search-cards intent screenshotted below, I can see that Dialogflow should understand that, when someone says “Tell me about the high priestess,” the phrase ‘the high priestess’ is referring to a member of my custom @tarotcard entity. (Note: it’s fine that “The Wheel Of Fortune” isn’t entirely highlighted — we list “The Wheel of Fortune” as a synonym for “The Wheel,” so we’re covered!)

Note that we give example phrases that have the card name in different positions in the sentence.

Dialogflow can then simply pass the parameter to the appropriate function in our webhook code. When a user triggers an intent, we can get any recognized parameters they’ve referenced along with our conversation object, as you can see below:

app.intent('search-cards', (conv, {tarotcard}) => {
  // above: our intent, the conv object, and the entity
  currentCardData = findCardInList(tarotcard);
  … // the rest of our code as usual
});
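
The findCardInList helper just looks the card up in our tarot JSON. Here’s one sketch of what it could look like (an assumption, reusing the getCardName helper from the earlier snippet); note that Dialogflow has already resolved any synonym to its reference value by this point:

function findCardInList(cardName) {
  // Match the resolved reference value against our card names, case-insensitively.
  return tarotData.meanings['tarot_interpretations'].find(
    (card) => getCardName(card['name']).toLowerCase() === cardName.toLowerCase()
  );
}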

So if someone says, “Ask Tarot Time about La Papesse,” the following happens:

  • Dialogflow parses that spoken language and extracts the meaningful information (the user is triggering the search-card intent, and they’re mentioning an item in the @tarotcard entity, specifically La Papesse)
  • Dialogflow looks through our list of entities and synonyms to find a match (ah, La Papesse is a synonym for The High Priestess)
  • Dialogflow kicks that value (The High Priestess) over to our code through the tarotcard parameter of our function
  • Our code finds the corresponding information in our JSON, puts together a response, and kicks it back to Dialogflow
  • Our app says the response

Deep links

As I said in the intro, I am fundamentally lazy. As a coder, I prefer to make tools that do all the work for me. At the moment, if I want to ask for a daily card, I first need to invoke Tarot Time (“Talk to Tarot Time”) and then ask Tarot Time for my random card (“Give me a random card”). That’s two full steps! I want to cut that down by at least half!

You can do that with deep links, and it’s very simple: in your Dialogflow Integrations, simply select whatever intents you want to be able to trigger in one step, and add them to the implicit invocations list. Now I can ask for my daily card, or about a specific card, in just one step (“Ask Tarot Time for my daily card” or “Ask Tarot Time about Temperance”). Much more natural!

From two steps to one!

Daily reminders

But what if I want no steps? Remember, this whole project started because I don’t have the available brainspace to remember to draw a card every day. I want the device to do it for me automatically every morning. On phones, you can do this with daily reminders.

Enabling daily reminders — push notifications that trigger a specific intent at the same time every day — is easy. All you have to do is tell your Action to let folks opt into triggering your intent on a daily basis. To do that, go to your Actions console, and then to Build > Actions. Under the action you want to offer (for me, random-card), expand the User engagement section. Toggle “daily updates.”

Note: we don’t need to toggle “push notifications.” That toggle specifically refers to ad-hoc updates controlled by your app — e.g., breaking news updates that could occur throughout the day — not scheduled daily ones.

Now, if the user asks for a card pull, the Assistant will ask them if they want to have a tarot card sent to them daily. If they say yes, the Assistant will notify them with a new card draw every day at whatever time they choose.

I would love to pretend I’m an early bird, but 10AM is about the earliest my brain really turns on.

Note that in order to implement this — even just to test it on your device — you’ll need to submit your Action for either beta or public release.

An afternoon tarot card, to enjoy over a midday coffee.

Look ma, no steps!

More resources

I had a lot of fun goofing around with Actions on Google and Dialogflow to make my one-day tarot hack. Wanna play with voice and bots? You can check out other voice experiments on Experiments with Google for inspiration, get started with this handy codelab, and sidestep common errors with this post. Once you’re ready to rock, you can register and publish your Action for the world to use by following the instructions here. Whatever you make, share it with the hashtag #voiceexperiments so we can all play with it!

Code in post is copyright 2018 Google LLC, licensed under the Apache License, Version 2.0.

