Cooking with voice: misadventures in the kitchen with Google Home

Two weeks ago I attended an event at Yummly, a recipe app recently bought by Whirlpool that’s trying to make a smarter kitchen. There is a big push to make cooking easier by having appliances do more and tying their behaviour together, whether that means finding ingredients in the fridge, identifying them on a supermarket shelf, or controlling the oven settings. While that future is a bit farther away, there is a future that’s already here: cooking via voice command. Today, only 18% of smart speaker owners use them for cooking requests, but that’s bound to increase as kitchens become smarter, and doing this well will have real value in a smart kitchen.

Voice UI in the kitchen has so much potential to be better than the alternatives. An open recipe book needs to be propped up and constantly referred to, scanning the page each time to find the right spot. A tablet needs to be woken with a password or touch and scrolled through to find the right information. That said, voice cannot just recite a recipe meant to be read. There need to be a few changes.

Intrigued, I wanted to see what the current state was. I decided to make Chai tea. (I’m paraphrasing the dialog I had with Home.)

Me: OK, Google, how do you make Chai tea?

Home: I found a recipe for Homemade Chai on Epicurious, would you like to make it?

Then Home said that there were 9 ingredients, and that it would list them and pause between each one. It then proceeded to tell me ingredients like “2-inch piece fresh ginger, cut into thin rounds.” Is that peeled or not? I decided to peel it anyway. The second item was 2 cinnamon sticks. Conveniently, here I could ask Home if I could replace sticks with powder. Home read me an answer with a ratio (I’ll get back to this later, as it didn’t help me at all), but I ended up finding sticks in my cupboard. At this point I realized that I’d only be making half the recipe amount because I didn’t have enough sticks.

As it continued listing spices, I decided to just dump them all in a saucepan. Still listing ingredients, Home told me I needed water, tea bags, milk, and sugar in addition to the five spices. I just added everything to the pot.

Home: Would you like to hear the instructions?

Me: Yes.

Home: Add all the spices to a saucepan and crush them slightly.

Me: [Too bad, they were all in the water/milk mixture but I tried crushing them a bit.] Next!

Home: Bring to a boil.

Me: [Brought to boil] Next!

Home: Lower heat and simmer for 10 minutes, would you like me to set a timer?

Me: Yes!

Home: Timer set for 10 minutes.

Home: [10 minutes pass.] Alarm!

Me: OK, next!

Home: Remove from heat. [Continues alarm.]

Me: OK Google, stop the timer! Next!

Home: Add tea bags and steep for 5 minutes.

Me: ! [Tea bags were already added before, skipping this step.]

Home: Discard tea bags. Add milk and sugar.

[Image: My Chai tea with artistically placed spices]

Me: [Discard, sure. Milk and sugar were already added.] Next!

Home: Bring tea just to simmer over high heat, whisking until sugar dissolves.

Me: [Nope, skipping this, too, as it was done already and incorrectly.] Next!

Home: Strain chai into teapot and serve hot. This was the last step.

Me: [Strain and sip.]

Even with all my mistakes, it actually turned out OK. Then again, Chai is a forgiving recipe.

Beyond the actual recipe, I do want to share a few thoughts on where cooking with voice is better than with a written recipe and where it still fails.

First, what I liked:

  • Home did a great job finding a good recipe. It found one fast, from a reputable site, and, above all, one that turned out tasty. Yes, Home is powered by Google, and, yes, I’ve heard that they do search quite well. Still, this was well done and it’s important to get this right.
  • When a step included doing something timed, Home asked me if I wanted to set a timer. This was a great help. That said, the interaction could be smoother: asking for the next step (or giving any other voice command) should have turned the alarm off, instead of requiring me to ask it to stop specifically. Also, it seems that this only works above a certain minimum duration, because when it said to steep the tea bags for five minutes, it didn’t offer a timer.
  • Home did a good job at parsing the recipe, pausing between each ingredient and instruction step, no matter how long it took me to do that step. It never lost the thread of communication. That said, it wasn’t always waiting to hear my next question and often required “OK, Google” to continue.

Second, what I’d like to improve:

The bigger challenge is that the entire paradigm needs to change. Today a recipe is recited as it is written: first ingredients, then instructions. The big difference between reading a written recipe and following spoken instructions is that with the former, the cook can glance back while reading the instructions to see the amount and preparation for each ingredient. With the latter, instructions that refer back to the ingredients are problematic. Take, for example, an instruction to put the first five ingredients in the saucepan. But what were those five? With a written recipe, a quick glance upwards gives the answer. Spoken instructions need to list those ingredients in that step, even though the written recipe doesn’t include them.
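To make that concrete, here is a minimal Python sketch of how a voice assistant might expand a phrase like “the first five ingredients” before reading a step aloud. This is only an illustration, not how Home works: the function name `expand_references`, the `NUMBER_WORDS` table, and the spice list in the usage note are all made up for the example.

```python
import re

# Hypothetical lookup for spelled-out counts; a real system would need more.
NUMBER_WORDS = {"two": 2, "three": 3, "four": 4, "five": 5, "six": 6}


def expand_references(step, ingredients):
    """Replace 'the first N ingredients' in a step with the ingredient names."""

    def substitute(match):
        token = match.group(1).lower()
        count = int(token) if token.isdigit() else NUMBER_WORDS[token]
        names = ingredients[:count]
        if not names:
            # Nothing to expand; leave the original wording untouched.
            return match.group(0)
        if len(names) == 1:
            return names[0]
        return ", ".join(names[:-1]) + " and " + names[-1]

    pattern = r"the first (\d+|" + "|".join(NUMBER_WORDS) + r") ingredients"
    return re.sub(pattern, substitute, step, flags=re.IGNORECASE)
```

With an illustrative list such as `["ginger", "cinnamon sticks", "peppercorns", "cloves", "cardamom pods"]`, the step “Put the first five ingredients in the saucepan.” would be read as “Put ginger, cinnamon sticks, peppercorns, cloves and cardamom pods in the saucepan.”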

Another example is the ingredients list. It’s read aloud, one ingredient at a time, with prep and amounts. Some people cook like this and measure and prepare all ingredients before cooking. Others, like me, just like to make sure they have enough of each ingredient, then measure and prepare it at the step where it’s added to the recipe. When reading a recipe, cooks can work according to their preferred mode. With voice instructions, that’s currently impossible. What’s needed is an initial read-aloud of the ingredients with quantities so that cooks can verify they have all of them, and then a repeat of the ingredients for actual assembly and prep.
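Here is a rough Python sketch of that two-pass idea, assuming the recipe is stored as structured data rather than plain text. The `Ingredient` class and both function names are hypothetical, invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Ingredient:
    name: str
    amount: str
    prep: str = ""  # e.g. "cut into thin rounds"; empty when there is none


def availability_pass(ingredients):
    """First pass: amounts only, so the cook can check the cupboard."""
    return [f"{item.amount} {item.name}" for item in ingredients]


def assembly_line(instruction, names, ingredients):
    """Second pass: one step, repeating amount and prep for what it uses."""
    by_name = {item.name: item for item in ingredients}
    details = []
    for name in names:
        item = by_name[name]
        text = f"{item.amount} {item.name}"
        if item.prep:
            text += f", {item.prep}"
        details.append(text)
    return f"{instruction} You'll need: {'; '.join(details)}."
```

The first pass would just say “2-inch piece fresh ginger” and “2 cinnamon sticks”; the prep detail (“cut into thin rounds”) only comes up again at the step where the ginger actually goes in.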

Beyond the paradigm change, there are a few smaller things that can be improved:

  • Can I replace X with Y? Here’s where voice can shine. My recipe called for cinnamon sticks and I wanted to use powder instead. I asked Home if I could do that and it read me the relevant passage with a usable ratio of powder per stick. But the recipe called for two sticks, so couldn’t Home just have given me the correct amount instead of the ratio? Yes, in this case I only had to multiply by two, but this is something computers, in general, do better than humans. Do the math for us! Tell me “you’ll need 1 teaspoon ground cinnamon instead of 2 sticks.”
  • How much are we making? Home skipped the part about how many servings a recipe yields, which would have been very helpful to me. I only realized how much it made when Home read the amount of water required and it was too much. It would be helpful to mention this bit of info ahead of time.
  • Can I make half? Another place where voice can shine is quantities. My recipe said it serves 6 and I only wanted to make it for 2. After stating the recipe’s yield, Home could ask how much I’d like to make. Then it could automatically adjust the ingredient amounts and just read what is necessary for the quantity I want. If there’s an undividable ingredient in the recipe, say an egg, Home could calculate the corresponding ingredient amount and say something like “There’s an egg in this recipe. You can make either 2 servings or 4 instead of the recipe’s 8. Which would you like?”
  • How much do I need again? Home should be able to repeat an ingredient’s amount on request at any step, so cooks can get the information when they need it.
  • How many calories/protein/carbs/vitamin D does this recipe have? This is helpful for those on specific diets, and it is just another calculation based on an understanding of the ingredients. It could also be compared to the RDA for each.
  • What size pot/pan should I use? I switched to a larger saucepan as I was adding water. Home can do that calculation for me and recommend a 4-quart saucepan from the beginning.
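The arithmetic behind the substitution and “Can I make half?” ideas is small enough to sketch in Python. Everything here is hypothetical: the function names are mine, and the half-teaspoon-per-stick ratio is only what the “1 teaspoon instead of 2 sticks” example above implies, not a vetted cooking conversion.

```python
from fractions import Fraction


def substitute_amount(count, ratio_per_unit):
    """Total substitute needed, e.g. teaspoons of powder for `count` sticks."""
    return Fraction(count) * Fraction(ratio_per_unit)


def scale_amount(amount, recipe_servings, wanted_servings):
    """Scale one ingredient amount from the recipe's yield to the wanted yield."""
    return Fraction(amount) * Fraction(wanted_servings, recipe_servings)


def whole_unit_servings(unit_count, recipe_servings):
    """Serving counts (up to the full recipe) that keep an undividable
    ingredient, such as eggs, at a whole number."""
    return [
        servings
        for servings in range(1, recipe_servings + 1)
        if Fraction(unit_count * servings, recipe_servings).denominator == 1
    ]
```

For instance, `substitute_amount(2, Fraction(1, 2))` gives 1, so Home could say “1 teaspoon” outright, and `scale_amount(6, 6, 2)` turns 6 cups for 6 servings into 2 cups for 2. For a made-up recipe with 4 eggs that serves 8, `whole_unit_servings(4, 8)` gives 2, 4, 6 and 8 as the servings Home could offer.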

These are just a few things that can help make cooking easier with voice. If there’s a linked screen, Home could also show what the finished dish should look like, videos of prep techniques, or what an ingredient looks like. Voice could really make cooking more accessible to many more people, and that may be the biggest win of all.
