New AI Voice Is So Realistic People Can’t Tell It Apart From A Real Human

Artificial intelligence startup Sesame recently released an AI voice demo so realistic that people can’t tell it apart from a real human. In fact, it is reportedly leaving people both amazed and very uncomfortable.

According to Senior AI Reporter Benj Edwards at Ars Technica, who had a conversation with Sesame’s new Conversational Speech Model (CSM) for about 28 minutes, “The synthesized voice was expressive and dynamic, imitating breath sounds, chuckles, interruptions, and even sometimes stumbling over words and correcting itself. These imperfections are intentional.”

“At Sesame, our goal is to achieve ‘voice presence’ — the magical quality that makes spoken interactions feel real, understood, and valued,” Sesame explained on its website. “We are creating conversational partners that do not just process requests; they engage in genuine dialogue that builds confidence and trust over time. In doing so, we hope to realize the untapped potential of voice as the ultimate interface for instruction and understanding.”

Another person who tested Sesame’s AI voice demo posted their reaction on the Hacker News forum: “It was genuinely startling how human it felt. Apparently they are planning on open-sourcing some of their work as well as selling glasses (presumably with the voice assistant). I’m very excited to have a voice assistant like this and am almost a bit worried I will start feeling emotionally attached to a voice assistant with this level of human-like sound.”

Mark Hachman, a senior editor at PCWorld who also spoke with Sesame’s AI voice demo, wrote, “Fifteen minutes after ‘hanging up’ with Sesame’s new ‘lifelike’ AI, and I’m still freaked out.”

One of the big reasons people should feel “freaked out” by this new human-sounding AI voice is obvious: scams. As Edwards writes, “As synthetic voices become increasingly indistinguishable from human speech, you may never know who you’re talking to on the other end of the line.”

Douglas Charles is a Senior Editor for BroBible with two decades of expertise writing about sports, science, and pop culture, with a particular focus on the weird news and events that capture the internet's attention. He is a graduate of the University of Iowa.