Running To Stand Still

The challenge of interpreting silence


Bradley Metrock is the founder, publisher, and producer of a podcast; a newsletter, This Week in Voice; and one of the top Voice and Artificial Intelligence events in the US, Project Voice 2021.

Starting this week, he will be contributing content from his newsletters to The UN Brief, covering the latest in Voice and AI.

By Bradley Metrock

In 1992, the US National Football League (NFL) and CBS (which aired the Super Bowl) lost a massive 20-25 million viewers (and 10 ratings points, a sizable chunk) to then-upstart FOX, which dared to counterprogram In Living Color during the typically boring Super Bowl halftime show.

Incensed, the NFL decided the following year to make sure that didn’t happen again, hiring the biggest and baddest entertainer in the world to perform during halftime: Michael Jackson.

The rest is history: that booking turned the Super Bowl halftime show into one of the biggest and most prestigious gigs in all of entertainment, almost entirely on the strength of Michael Jackson’s otherworldly and memorable performance.

What often gets forgotten, though, is what Michael Jackson did at the start of this performance, when he first showed up on stage.

Nothing.

Michael Jackson stood there, almost motionless, for nearly two entire minutes.

A 30-second Super Bowl ad that year cost approximately $850,000, so this silent introduction cost nearly $3.5 million, in 1993 money.

That’s one loud statement.

I had a chance to hear a talk Cathy Pearl of Google gave back in 2019, and during her presentation, she mentioned that part of the complexity of human communication is how humans utilize silence.

One human being not speaking to another can translate to a wide variety of words or expressions, and the challenge for voice assistants and AI is to be able to effectively use context to figure out what silence might mean.

Imagine a voice assistant being thrust into this situation from the 2001 remake of Ocean’s Eleven, in which Danny Ocean (George Clooney) has a one-sided conversation with his friend Rusty (Brad Pitt) about whether they need one more person for their upcoming heist.

Humans have been talking – and sometimes, not talking – to each other for a long time.

We are born with 43 muscles in our face, capable of producing approximately 10,000 unique expressions, all without saying a word.

Adding to the complexity of non-verbal communication is how one look, one expression, one glance might mean one thing to one culture, and something completely different to another.

It’s a problem so large that it will take a vast multitude of companies to solve it.

The more we start talking about it, the better.

You’ve got to cry, without weeping.
Talk, without speaking.
Scream, without raising your voice.


Subscribe to This Week in Voice

Project Voice 2021 is the #1 event for voice tech and AI in America. In person, in April 2021, in Chattanooga, Tennessee. Use the promo code to save when you register here.