The tech world, along with pretty much every vehicle in existence, seems to be leaning heavily towards voice-activated devices: Siri, Amazon Echo, Facebook M, “OK Google.” It
should make sense that we would want to speak to our digital assistants. After all, that’s how we communicate with each other. So why, then, do I feel like such a dork when I say “Siri,
find me an Indian restaurant”?
I almost never use Siri as my interface to my iPhone. On the very rare occasions when I do, it’s when I’m driving — by myself, with
no one to judge me. And even then, I feel unusually self-conscious.
I don’t think I’m alone. No one I know uses Siri, except on the same occasions and in the same way I do.
This should be the most natural thing in the world. We’ve been talking to each other for several millennia. It’s so much more elegant than hammering away on a keyboard. But I keep
seeing the same scenario play out over and over again. We give voice navigation a try. It sometimes works. When it does, it seems very cool. We try it again. And then we don’t do it any more.
I base this on admittedly anecdotal evidence. I’m sure there are those who continually chat merrily away to the nearest device. But not me — and not anyone I know, either. So, given
that voice activation seems to be the way devices are going, I have to ask why we’re dragging our heels on adopting it.
In trying to judge the adoption of voice-activated interfaces, we have to
account for mismatches in our expected utility. Every time we ask for something (say, “Play Bruno Mars”) and get the response, “I’m sorry, I
can’t find Brutal Cars,” it’s natural we’d be frustrated. This is certainly part of it. But that’s an adoption threshold that will eventually yield to sheer brute-force
processing power.
I suspect our reluctance to talk to an object stems from the simple fact that we’re talking to an object. It doesn’t feel right. It makes us look addle-minded. We make fun
of people who speak when there’s no one else in the room.
Our relationship with language is an intimately nuanced one. Speech is a relatively newly acquired skill, in evolutionary terms,
so it takes up a fair amount of cognitive processing. Granted, no matter what the interface, we currently have to translate desire into language, and speaking is certainly more efficient than typing,
so it should be a natural step forward in our relationship with machines.
But we also have to remember that verbal communication is the most social of things. In our minds, we have created a
well-worn slot for speaking, and it’s something to be done when sitting across from another human.
Mental associations are critical for how we make sense of things. We are natural
categorizers. And if we haven’t found an appropriate category when we encounter something new, we adapt an existing one.
I think vocal activation may be creating cognitive dissonance in
our mental categorization schema. Interaction with devices is a generally solitary endeavor. Talking is a group activity. Something here just doesn’t seem to fit. We’re finding it hard to
reconcile our usage of language and our interaction with machines.
I have no idea if I’m right about this. Perhaps I’m just being a Luddite. But given that my entire family, and
most of my friends, have had phones capable of voice activation for several years now and none of them use that feature except on very rare occasions, I thought it was worth mentioning.
By the way, let’s just keep this between you and me. Don’t tell Siri.