Call centers are leaning into automation. Whether that’s a good thing is still up for debate, but it’s happening – and quite possibly accelerating.
According to research firm TechSci Research, the global artificial intelligence market for contact centers could grow to nearly $3 billion by 2028, up from $2.4 billion in 2022. Meanwhile, a recent survey found that about half of contact centers plan to adopt some form of AI in the next 12 months.
The motivation is relatively obvious: call centers want to reduce costs while increasing the scale of their operations.
“Call center-intensive companies looking to scale quickly without the constraints of a contact center workforce are very open to implementing effective AI-powered voice agent solutions,” entrepreneur Evie Wang told TechCrunch. “This approach not only reduces overall costs, but also reduces wait times.”
Wang is one of the co-founders of Retell AI, which provides a platform through which companies can create AI-powered “voice agents” that answer customer calls and perform basic tasks such as scheduling appointments. Retell’s agents use a combination of large language models (LLMs) tailored for customer support applications and a speech synthesis model that voices the text the LLM generates.
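For the curious, here is a rough sketch of what that LLM-plus-speech-synthesis pipeline can look like in Python. The class and helper names (VoiceAgent, call_llm, synthesize_speech) are illustrative placeholders, not Retell’s actual API; you would wire them up to whichever LLM and text-to-speech providers you use.

```python
# Minimal sketch of an LLM -> text-to-speech voice agent loop.
# The helpers below are placeholders, not Retell's API.

from dataclasses import dataclass, field


@dataclass
class VoiceAgent:
    # System prompt that scopes the agent to a customer-support task.
    system_prompt: str
    history: list = field(default_factory=list)

    def handle_turn(self, caller_text: str) -> bytes:
        """Take transcribed caller speech, get an LLM reply, return audio."""
        self.history.append({"role": "user", "content": caller_text})
        reply = call_llm(self.system_prompt, self.history)  # placeholder
        self.history.append({"role": "assistant", "content": reply})
        return synthesize_speech(reply)  # placeholder


def call_llm(system_prompt: str, history: list) -> str:
    # Placeholder: swap in your LLM provider's chat-completion call.
    raise NotImplementedError


def synthesize_speech(text: str) -> bytes:
    # Placeholder: swap in your TTS provider (e.g., a custom voice).
    raise NotImplementedError
```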
Retell’s customers include some contact center operators, but also small and medium-sized businesses that handle high call volumes, such as telehealth company Ro. They can build voice agents using the platform’s low-code tools, or upload a custom LLM (e.g., an open model like Meta’s Llama 3) to further customize the experience.
“We are investing heavily in voice conversations because we consider it the most important aspect of the AI voice agent experience,” Wang said. “We don’t see AI voice agents as mere toys that can be created with a few lines of prompts, but rather as tools that can offer significant value to businesses and replace complex workflows.”
Retell performed well enough in my brief tests, at least on the responsiveness front.
I scheduled a chat with the Retell bot using the demo form on the Retell website. The bot walked me through the process of scheduling a hypothetical dentist appointment, asking for details such as my preferred date and time, phone number, and so on.
I can’t say the bot’s synthetic voice was the most realistic I’ve heard – definitely not on par with ElevenLabs or OpenAI’s text-to-speech API. (Update: Wang told me that Retell uses a custom ElevenLabs voice, which may explain the lower quality.) Wang, in Retell’s defense, said the team’s focus so far has been on reducing latency and handling edge cases, such as the pauses that can occur in a conversation.
The latency was indeed low: in my test, the bot responded to my answers and follow-up questions with almost no hesitation. And it stuck to the script. Try as I might, I couldn’t confuse it or get it to behave in a way it shouldn’t. (When I asked the bot about my dental records, it insisted that I speak to the office manager.)
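For illustration, that kind of on-script behavior typically comes from a narrowly scoped system prompt with an explicit escalation rule. The example below is hypothetical, not Retell’s actual configuration, and would plug into a setup like the VoiceAgent sketch above.

```python
# Hypothetical system prompt scoping an agent to appointment scheduling.
# Out-of-scope requests are deflected to a human, as in the dental-records test.

SCHEDULING_PROMPT = """
You are a dental office scheduling assistant.
You may only: collect a preferred date and time, collect a callback phone
number, and confirm the appointment.
If the caller asks about anything else (billing, dental records, medical
advice), do not answer; politely refer them to the office manager.
"""

# Example usage with the earlier sketch:
# agent = VoiceAgent(system_prompt=SCHEDULING_PROMPT)
```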
So, are platforms like Retell the way forward for call centers?
Maybe. For basic tasks like scheduling appointments, automation makes a lot of sense, which is probably why both startups and large tech companies offer products that compete directly with Retell’s. (See Parloa, PolyAI, Google Cloud’s Contact Center AI and so on.)
This is low-hanging — and seemingly revenue-generating — fruit. Retell claims to have tons of customers, all of whom pay per minute to talk to a voice agent. Retell has raised a total of $4.53 million in capital to date from backers including Y Combinator (where the company was incubated).
But the jury is still out on more complicated tasks, especially given LLMs’ tendency to make up facts and go off course even with safeguards in place.
As Retell’s ambitions grow, I’m curious to see how the company addresses many of the industry’s entrenched technical challenges. At least Wang seems confident in Retell’s approach.
“With the advent of LLMs and recent breakthroughs in speech synthesis, conversational AI is becoming good enough to create really exciting use cases,” Wang said. “For example, with latency of less than one second and the ability to interrupt the AI, we have seen users speak in fuller sentences and converse as if they were talking to another person. We strive to make it easier for developers to create, test, deploy and monitor AI voice agents to ultimately help them achieve production readiness.”
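The interruption capability Wang describes, often called barge-in, roughly amounts to cutting off the agent’s audio the moment the caller starts speaking. Here is a rough sketch of the idea; the playback handle and voice-activity check are placeholders, not Retell’s implementation.

```python
# Rough sketch of "barge-in": stop the agent's audio as soon as the caller
# starts talking over it, so the conversation feels interruptible.

import time


def speak_with_barge_in(audio: bytes, caller_is_speaking) -> None:
    """Play agent audio, but cancel playback if the caller interrupts."""
    playback = start_playback(audio)  # placeholder: non-blocking playback
    while playback.is_active():
        if caller_is_speaking():      # placeholder: voice activity detection
            playback.stop()           # cut the agent off mid-sentence
            break
        time.sleep(0.02)              # poll roughly every 20 ms


def start_playback(audio: bytes):
    # Placeholder: return a handle exposing is_active() and stop().
    raise NotImplementedError
```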