This AI-Generated Joe Rogan Voice is a Warning From the Future

Published May 18, 2019

In 2017, a startup called “Lyrebird” made headlines with extremely convincing AI-generated replications of celebrity voices.

Tracks posted to SoundCloud featured the voices of Donald Trump, Barack Obama, and Hillary Clinton making a pitch for Lyrebird’s new technology. In one recording, a fake President Trump voice says, “They can make us say anything now.”

While the story drew some attention at first, it quickly dropped out of the news cycle almost everywhere except The Joe Rogan Experience podcast. Rogan was fascinated by the technology and spoke about it at length on his show in the weeks after the news broke.

In the two years since, Rogan has regularly told his guests about the technology, warning that it is only a matter of time before real, recognizable voices are mimicked and manipulated to say specific text for specific, and potentially nefarious, purposes. There may be little limit to what it can do as it advances, getting better results with less data.

Oddly enough, Rogan was the first celebrity target for AI developers wanting to show off how far the technology has come in just two years. A video released this week features Rogan talking about training a hockey team made up of intelligent chimps, among other equally ridiculous and amusing rants.

“I just listened to an AI generated audio recording of me talking about chimp hockey teams and it’s terrifyingly accurate. At this point, I’ve long ago left enough content out there that they could basically have me saying anything they want, so my position is to shrug my shoulders and shake my head in awe, and just accept it. The future is gonna be really f***ing weird, kids,” Rogan said on Facebook this week.

Dessa, the AI startup responsible for the video, explained in a blog post that it will get easier and easier for the average person to make these types of replicas.

“Right now, technical expertise, ingenuity, computing power and data are required to make models like RealTalk perform well. So not just anyone can go out and do it. But in the next few years (or even sooner), we’ll see the technology advance to the point where only a few seconds of audio are needed to create a life-like replica of anyone’s voice on the planet,” the post read.

The replica of Rogan’s voice was produced using a text-to-speech deep learning system called RealTalk, which generates life-like speech using only text inputs, according to the developers.
