The audio input to a voice-enabled car is a behind-the-scenes affair. The end user does not usually notice the audio chain unless something goes wrong. It is a lot like working as a stagehand on Broadway: a difficult and often thankless job, full of unexpected obstacles, and unnoticed until the curtain falls in the middle of Kristin Chenoweth's big solo.
Audio has a long, sometimes difficult route in the connected car - traveling from your mouth all the way to the speech recognizer "hearing" what you said. In the short version of this trip, there are two halves:
Part 1: Inside the Car Cabin
The first half of the journey runs through the interior of the vehicle - from your mouth to the car's microphone. Unfortunately, cars can be very noisy environments. If you could hear the banging, the grunting and the dragging of crates backstage, would you really enjoy the play? Think of everything you can hear in the car: engine revs, bumps, tractor trailers going by, children playing in the back seat, wipers, climate control noise ... and finally your voice.
Take potholes, a common condition on the Michigan roads I frequent. You engage the voice recognition (VR) system and say "Call Al". You merge onto an exit ramp at just that moment - and the VR system may hear "Call <BUMP> <BUMP> <BUMP>" instead. Competing speech is another common pitfall in the voice-enabled car. While driving with your children, try changing the radio station by voice. The VR system now has to interpret what is meant by "DADD ..." - "Tuning to DAD 100.3 FM" - "... DYYY". Noise and interference such as these can cause significant recognition errors and other unwanted VR system behavior.
Part 2: Inside the Voice Recognition (VR) System
The second half of the audio journey can be equally difficult. Having a properly structured audio setup in an infotainment system is critical to a successful user experience. During a speech recognition dialog, the system must know when to start and stop listening to the user (the "listening window"). Like the stagehands who open and close the curtain between scenes, this timing has a significant impact on the user experience. If the curtain opens early, the audience sees what it should not. If it closes too soon, the audience misses key elements of the plot. In the car's VR system, this is equivalent to the system hearing "<BEEP> Dial 911" or "Dial 1-800-5 // cutoff", respectively. Both situations can give the user an unexpected result.
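To make the listening window concrete, here is a minimal sketch in Python. It is not how any particular VR system is implemented: the `AudioEvent` type, the `heard_by_recognizer` function and all of the timings are hypothetical, chosen only to reproduce the two failure cases described above (a window that opens too early and one that closes too soon).

```python
from dataclasses import dataclass


@dataclass
class AudioEvent:
    """One thing that happens on the microphone (hypothetical model)."""
    start: float  # seconds from the start of the dialog turn
    end: float
    label: str


def heard_by_recognizer(events, window_open, window_close):
    """Return the labels that fall inside the listening window.

    Events that only partly overlap the window are marked "(cut off)".
    """
    heard = []
    for ev in events:
        if ev.end <= window_open or ev.start >= window_close:
            continue  # entirely outside the window: never reaches the recognizer
        partial = ev.start < window_open or ev.end > window_close
        heard.append(ev.label + (" (cut off)" if partial else ""))
    return heard


# Hypothetical timeline: the system beeps, then the user says "Dial 911".
timeline = [
    AudioEvent(0.0, 0.3, "<BEEP>"),
    AudioEvent(0.5, 0.9, "Dial"),
    AudioEvent(1.0, 1.6, "911"),
]

print(heard_by_recognizer(timeline, 0.4, 2.0))  # window opens after the beep
print(heard_by_recognizer(timeline, 0.0, 2.0))  # window opens too early
print(heard_by_recognizer(timeline, 0.4, 1.2))  # window closes too soon
```

With these made-up timings, the first call returns only the spoken command, the second also picks up the beep, and the third loses the tail of the number.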
Other areas of audio configuration can also hurt the end-user experience. A common reaction to a VR system that is not working is to speak louder with each failure (as humans, we sometimes do this in conversation to make sure we are heard clearly). But what if the audio level going into the voice recognizer is already set too high? Yelling will only make the problem worse, frustrating the user with multiple failed recognitions. This is why proper audio tuning is so important - a scenario you can see play out in a Dragon Drive demo.
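Why does yelling make things worse? One likely mechanism is clipping: if the input gain is already high, louder speech runs into the limits of the sample format and the waveform is flattened. The sketch below assumes 16-bit signed samples and uses made-up sample values and a made-up gain of 2.0; `apply_gain` and `clipped_fraction` are illustrative helpers, not part of any real VR product.

```python
INT16_MAX = 32767
INT16_MIN = -32768


def apply_gain(samples, gain):
    """Scale samples and hard-limit them to the 16-bit range."""
    return [max(INT16_MIN, min(INT16_MAX, round(s * gain))) for s in samples]


def clipped_fraction(samples):
    """Fraction of samples sitting at the 16-bit limits."""
    if not samples:
        return 0.0
    hits = sum(1 for s in samples if s >= INT16_MAX or s <= INT16_MIN)
    return hits / len(samples)


# Hypothetical microphone samples for the same word, spoken normally and shouted.
normal_voice = [1000, -8000, 12000, -15000]
shouted_voice = [2 * s for s in normal_voice]

input_gain = 2.0  # assumed setting, already on the high side

print(clipped_fraction(apply_gain(normal_voice, input_gain)))   # 0.0
print(clipped_fraction(apply_gain(shouted_voice, input_gain)))  # 0.5
```

In this toy example the normal voice passes through untouched, while half of the shouted samples are pinned at the limits, which distorts the very audio the recognizer needs.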
The voice-enabled car of the future will selectively ignore driving and passenger noise, allowing a smooth, seamless user experience. Luckily for us, the future is fast approaching. Today, there are exciting new technologies aimed at addressing some of these common audio challenges. New developments in digital signal processing allow both stationary noises (such as road and fan noise) and non-stationary noises (such as road bumps) to be suppressed effectively. Other new technologies allow the system to ignore interfering talkers (one variant is called "off-axis suppression"). With this enabled, passengers can hold side conversations while you speak voice recognition commands without worry.
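For a rough feel of what handling stationary noise involves, here is a deliberately simple energy-based noise gate in Python. It is a much cruder relative of the noise suppression described above, not the technique itself: it only turns down frames that look like the steady background, and it does nothing for bumps or competing talkers. The frame values, the `noise_gate` function and its parameters are all assumptions made for illustration.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(frames, noise_frames=2, threshold=2.0, attenuation=0.1):
    """Attenuate frames whose level is close to the estimated noise floor.

    The floor is estimated from the first `noise_frames` frames, which are
    assumed to contain background noise only (no speech).
    """
    lead = frames[:noise_frames]
    if not lead:
        raise ValueError("need at least one frame to estimate the noise floor")
    floor = sum(rms(f) for f in lead) / len(lead)

    gated = []
    for frame in frames:
        if rms(frame) < threshold * floor:
            gated.append([s * attenuation for s in frame])  # looks like noise
        else:
            gated.append(list(frame))  # looks like speech: pass through
    return gated


# Hypothetical frames: two of fan noise, one of speech, one more of fan noise.
frames = [
    [10, -10, 10, -10],
    [12, -12, 12, -12],
    [500, -400, 450, -480],
    [11, -9, 10, -12],
]

cleaned = noise_gate(frames)
print([round(rms(f), 1) for f in cleaned])
```

The speech frame passes through unchanged while the fan-noise frames are turned down. Production systems work on the frequency content of the signal rather than on whole frames, which is what lets them reduce noise that overlaps with speech.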
So, the future looks bright for voice-enabled cars, but until these exciting technologies are in place, what can you do to improve your experience with speech in the car? Here are some suggestions: if you are using a system that prompts you with a beep - wait for it! Do not talk over it. Speak in your normal voice, at a comfortable volume (maybe a little louder if you are in a noisy environment). Remember, just like the silent stagehands who work diligently behind the scenes, the audio system in your voice-enabled car is working tirelessly to provide the best user experience.
Stay tuned for the next article in this series, which will delve into an audio technology called voice barge-in. With barge-in enabled, the user can speak voice commands during prompt playback and even talk over the beep, allowing for a more conversational experience.