How the Shift to Remote Work is Accelerating Speech Recognition


The New York Times recently published an article demonstrating how far computer vision has come in just a few years. When the technology initially emerged back in 2014, AI-generated images of human faces were crude at best. Now, they’re nearly indistinguishable from the real thing. If one facet of AI can progress this rapidly, it raises the question: What’s next?

Speech recognition is the next area of AI poised for hypergrowth, and the near-universal shift to remote work is accelerating it. The days of sitting in a conference room with a whiteboard and taking notes are over, possibly for good. As a result, video conferencing platforms like Zoom have seen a massive surge in adoption, and speech recognition startups like Krisp and Deepgram have secured substantial funding despite the economic downturn.

The speech recognition market was projected to reach just over $29 billion by 2026, but that figure will likely end up much higher due to the move to remote work driven by the pandemic. With nearly all of today’s meetings taking place virtually, more voice data is being generated than ever before. Speech is a vital component of human communication and learning, making this new data exceptionally valuable within the context of a remote work culture. 

Let’s take a look at the technology’s applications and what the future of speech recognition may hold. 

The death of trivial use cases

Until recently, use cases for speech recognition have provided only marginal real-life value. Take Google Duplex, for example, which places voice calls to book restaurant, car, and hotel reservations. The technology has been met with resistance from service workers, who are reluctant to answer calls that don’t display a real person’s name, so its benefits have been minimal.

Similarly, personal assistants that leverage speech recognition, like Amazon’s Alexa, offer limited value (not to mention being rife with privacy concerns). Although over 100 million devices with Alexa have been sold, once their novelty wears off, the devices often go unused or are limited to simple tasks like playing music and reciting the weather forecast.

So, where does the real utility of this technology lie? 

Uncovering true value in the era of remote work

As speech recognition gains traction, we can expect to see its trivial use cases fall by the wayside as high-value applications—including those that address the challenges of working remotely—take precedence.  

In the age of remote work, there is a critical need to collect, store, process, and make sense of the voice data gathered from video conferencing software. By analyzing this data, speech recognition technology can identify the individual speakers in a meeting, gauge the sentiment of each speaker, measure how long each person spoke, capture the content of the meeting, surface the important topics discussed, generate action items, and much more. The ability to quickly process and gain insights from voice data will be imperative for companies to stay competitive moving forward.
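
As a rough illustration of one of these analyses, the sketch below totals how long each person spoke. It assumes a speech recognition service has already returned a transcript split into speaker-labeled segments with start and end times; the `Segment` type, the function names, and the sample meeting are all hypothetical and are not taken from any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """One utterance from a transcript (hypothetical format, assumed here)."""
    speaker: str
    start: float  # seconds from the start of the meeting
    end: float    # seconds from the start of the meeting
    text: str


def talk_time_by_speaker(segments):
    """Return the total speaking time in seconds for each speaker."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.speaker] += seg.end - seg.start
    return dict(totals)


def talk_share(segments):
    """Return each speaker's fraction of the total speaking time."""
    totals = talk_time_by_speaker(segments)
    overall = sum(totals.values())
    if overall == 0:
        # No speech at all: avoid dividing by zero.
        return {speaker: 0.0 for speaker in totals}
    return {speaker: seconds / overall for speaker, seconds in totals.items()}


# Made-up sample data, for illustration only.
meeting = [
    Segment("Ana", 0.0, 12.0, "Let's review the launch plan."),
    Segment("Ben", 12.0, 18.0, "I will send the budget by Friday."),
    Segment("Ana", 18.0, 30.0, "Great, then we can finalize the schedule."),
]

if __name__ == "__main__":
    for speaker, seconds in talk_time_by_speaker(meeting).items():
        print(f"{speaker}: {seconds:.0f} seconds")
```

The harder parts of the pipeline (transcription, telling speakers apart, sentiment, and topic extraction) are what the speech recognition system itself would supply; this sketch only shows how simple the downstream bookkeeping can be once that labeled data exists.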

What the future holds

As speech recognition technology continues to evolve, it’s reasonable to presume that it will one day become as advanced as computer vision. Just as we can no longer distinguish between an actual photograph and a face created by AI, one day we won’t be able to differentiate between a human talking and AI-generated speech.

One example of an application for this is voice advertising on platforms such as podcasts. Currently, podcast hosts record voiceovers for advertisements, but once technology can replicate a human voice exactly, there will no longer be a need for this. The same will extend to other types of media, including movies, TV advertisements, audiobooks, and more.

Language translation and learning applications, like Duolingo, will also benefit greatly from advancements in speech recognition. Imagine applications that can teach people languages by holding full conversations in real time, or that can connect people by putting accurate, lightning-fast translation at the user’s fingertips.

AI and its respective branches—including speech—are undeniably powerful, and ethics will come into play as the technology develops further and garners other use cases. However, it’s clear that speech recognition is currently providing remarkable value within the context of remote work, and will be an integral tool for companies’ success now and in the future.  

About the Author

Ryan Scolnik is the VP of Data Science at FortressIQ where he leads the company’s data science, computer vision and AI strategy. Prior to FortressIQ, Ryan was a lead data scientist at 6sense, a predictive analytics company, and senior marketing analyst at JP Morgan Chase. Ryan holds a PhD in Statistics from Florida State University and enjoys Kaggle competitions and long bike rides in his spare time.


Comments

  1. One of the technologies trending right now is voice technology. I agree that speech recognition might be the next best thing. There are many advantages to using this technology: it can help increase productivity in many businesses, such as those in the healthcare industry; it can capture speech much faster than you can type; it supports real-time text-to-speech; it helps those who have problems with speech or sight; and many more. My kid already uses the search-by-voice feature in YouTube, and YouTube easily recognizes my kid’s voice and understands what he wants to watch, giving him the videos he wants to see. Thank you for this informative article.