A Text Reading and Human Voice Cloning Audio Feature is Unveiled by OpenAI

About ten developers have so far received early samples and use cases from the company’s text-to-speech model, Voice Engine, according to a spokeswoman.

OpenAI is sharing early results from a test of a feature that can read words aloud in a convincing human voice, highlighting a new frontier in artificial intelligence and raising concerns about deepfakes.

After briefing reporters earlier this month, OpenAI decided against rolling out the tool to a wider audience.

OpenAI chose to scale back the release after hearing from stakeholders including lawmakers, industry experts, educators, and creatives, according to a spokeswoman. The company had originally intended to offer the tool to up to one hundred developers through an application process, as it stated in the earlier press briefing.

In a blog post published on Friday, the company acknowledged the risks of generating speech that resembles people’s voices, risks that the post said are especially pressing in an election year. “We are engaging with US and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build,” the company wrote.

Other AI technology has already been used to fake voices. In January, a fake but convincing phone call claiming to be from President Joe Biden urged New Hampshire residents not to cast ballots in the primary, heightening fears about artificial intelligence in the lead-up to crucial elections worldwide.

Unlike OpenAI’s earlier attempts at audio content generation, Voice Engine can generate speech that sounds like a specific person, down to that person’s rhythm and intonation. The program needs just fifteen seconds of recorded audio to imitate a person’s voice accurately.

During a demonstration of the tool, Bloomberg listened to a clip of OpenAI CEO Sam Altman briefly discussing the technology in a voice that sounded identical to his own but was AI-generated.

“If you have the right audio setup, it’s basically a human-caliber voice,” said Jeff Harris, a product lead at OpenAI. “It’s a pretty impressive technical quality.” Harris added, “There’s obviously a lot of safety delicacy around the ability to really accurately mimic human speech.”

One of the current developer partners using OpenAI’s service, the Norman Prince Neurosciences Institute at the not-for-profit health system Lifespan, is using the technology to help patients regain their voice.

The company’s blog post cited a case in which the technology was used to restore the voice of a young patient who had lost it due to a brain tumor. The girl had previously recorded her speech for a school assignment, and the program was able to replicate her voice from that recording.

OpenAI’s proprietary speech model can also translate the audio it produces into several languages, which makes it useful for audio companies such as Spotify Technology SA.

Spotify has already used the technology in its own trial program to translate podcasts from well-known hosts such as Lex Fridman. OpenAI also highlighted other positive uses of the technology, such as offering a wider range of voices in children’s educational media.

As part of the testing program, OpenAI requires its partners to sign usage agreements, obtain the original speaker’s consent before using their voice, and tell listeners that the voices they are hearing are AI-generated. The company is also adding an inaudible watermark so that it can tell whether an audio file was generated by its program.

OpenAI said it is seeking feedback from experts before deciding whether to offer the feature to a wider audience. “It’s important that people around the world understand where this technology is headed, whether we ultimately deploy it widely ourselves or not,” the company wrote on its website.

OpenAI also said it hopes the preview of its software “motivates the need to bolster societal resilience” against the threats posed by increasingly sophisticated AI systems. For example, the company urged financial institutions to stop using voice authentication as a way of protecting customers’ financial data and account access.

It also aims to raise public awareness of deceptive AI content and to encourage the development of methods for distinguishing real audio from AI-generated audio.
