OpenAI says Voice Engine might be too risky to release

April 1, 2024

  • OpenAI revealed Voice Engine, which clones a human voice from just 15 seconds of speech
  • Voice Engine was tested by a small group of partners but OpenAI is reluctant to release it publicly
  • OpenAI adds an audio watermark to cloned Voice Engine audio but says more safety measures are needed

OpenAI says it ran a small-scale test of its new voice cloning product Voice Engine with a few select partners. The results show promising applications for the tech, but safety concerns may keep it from being released.

OpenAI says that Voice Engine can clone a human’s voice based on a single 15-second recording of their voice. The tool can then generate “natural-sounding speech that closely resembles the original speaker.”

Once cloned, Voice Engine can turn text inputs into audible speech using “emotive and realistic voices.” The tool’s capability makes exciting applications possible but raises serious safety issues too.

Promising use cases

OpenAI started testing Voice Engine in late 2023 to see how a small group of select participants could use the tech.

Test partners used Voice Engine in the following ways:

  • Adaptive teaching – Age of Learning used Voice Engine to provide reading assistance to children, create voice-over content for learning material, and provide personalized verbal responses to interact with students.
  • Translating content – HeyGen used Voice Engine for video translation so product marketing and sales demos could reach a wider market. The translated audio retains the person’s native accent. So, when a native French speaker’s audio is translated into English you’d still hear their French accent.
  • Providing wider social services – Dimagi trains health workers in remote settings. It used Voice Engine to give training and interactive feedback to health workers in underserved languages.
  • Supporting non-verbal people – Livox enables non-verbal people to communicate using alternative communication devices. Voice Engine allows these people to choose a voice that best represents them rather than something that sounds more robotic.
  • Helping patients recover their voice – Lifespan piloted a program offering Voice Engine to people with speech impairments due to cancer or neurologic conditions.

Voice Engine isn’t the first AI voice cloning tool, but the samples in OpenAI’s blog post suggest it represents the state of the art and may even be better than ElevenLabs’ offering.

The samples in the blog post show the natural inflection and emotive characteristics it can generate.

Safety concerns

OpenAI said it was impressed with the use cases test participants came up with, but that more safety measures would need to be in place before the company decided on “whether and how to deploy this technology at scale.”

OpenAI says technology that can accurately reproduce someone’s voice “has serious risks, which are especially top of mind in an election year.” Fake Biden robocalls and the fake video of Senate candidate Kari Lake are cases in point.

In addition to the clear restrictions in its general usage policies, the participants in the trial had to have “explicit and informed consent from the original speaker” and were not allowed to build a product that enabled people to create their own voices.

OpenAI says it implemented other safety measures, including an audio watermark. It didn’t explain exactly how, but said it could perform “proactive monitoring” of Voice Engine’s use.

Some other big players in the AI industry are also worried about this kind of tech getting out into the wild.

What’s next?

Will the rest of us get to play around with Voice Engine? It’s unlikely, and maybe that’s a good thing. The potential for malicious use is huge.

OpenAI is already recommending that institutions like banks phase out voice authentication as a security measure.

Voice Engine has an embedded audio watermark, but OpenAI says more work is needed to identify when audiovisual content is AI-generated.

Even if OpenAI decides not to release Voice Engine, others will. The days of being able to trust your eyes and ears are history.


Eugene van der Watt

Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.
