OpenAI’s ChatGPT Unveils Major Upgrade, Adds Voice Conversation and Image Chat

by Cindy Tan

Published: September 25, 2023 at 9:19 am Updated: September 25, 2023 at 9:22 am

Edited and fact-checked by Victor Dey

In Brief

OpenAI will be rolling out new voice and image capabilities in ChatGPT over the next two weeks.

These features will only be available to Plus and Enterprise users.

OpenAI today announced that it is starting to roll out new voice and image capabilities in ChatGPT. These new features allow users to have a voice conversation with ChatGPT or show the chatbot images.

This announcement follows claims by Reddit users who asserted that they had gained access to OpenAI’s models and subsequently shared this information on the platform. Redditor FeltSteam described an AI model with the working title of Arrakis, which reportedly allows users to “enter any combination of text, audio and video.”

I found some weird unconfirmed speculations about powerful internal models on Reddit.

– Please take all with a grain of salt. –

Apparently, two different users claim they got access to OpenAI’s internal models and are sharing information on Reddit.

FeltSteam… pic.twitter.com/JRJH4xADZX

— Yam Peleg (@Yampeleg) September 25, 2023

With the new features, users can engage in a back-and-forth conversation with ChatGPT using their voices. They can also discuss images with the chatbot. These features will be rolled out over the next two weeks to Plus and Enterprise users. The voice capability is coming to iOS and Android as an opt-in, while images will be available on all platforms.

To start using the voice function, users can head to Settings → New Features on the mobile app and opt into voice conversations. Next, the user should tap the headphone button in the top-right corner of the home screen and choose their preferred voice from a selection of five different voices.

“The new voice capability is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech,” OpenAI wrote in a blog post. “We collaborated with professional voice actors to create each of the voices. We also use Whisper, our open-source speech recognition system, to transcribe your spoken words into text.”

To show ChatGPT images, the user should tap the photo button to either capture an image or choose one. If they are using iOS or Android, they should tap the plus button before proceeding. Additionally, they can discuss multiple images or use the drawing tool to guide the chatbot.

OpenAI says that image understanding is powered by multimodal GPT-3.5 and GPT-4. These models apply their language reasoning abilities to analyze a diverse array of visual content, including photographs, screenshots, and documents containing a mix of text and images.

OpenAI’s partnership with Spotify

Spotify also today announced its AI-powered voice translation feature. The new feature can translate podcasts into different languages, using the podcaster’s original voice.

According to The Verge, this translation feature relies on OpenAI’s voice transcription tool, Whisper, which is able to transcribe English speech and translate various languages into English.

As part of the pilot, the company has teamed up with podcasters Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett to create AI-driven voice translations in languages like Spanish, French, and German for specific catalog episodes and upcoming releases.

“We believe that a thoughtful approach to AI can help build deeper connections between listeners and creators, a key component of Spotify’s mission to unlock the potential of human creativity,” Ziad Sultan, VP of Personalization at Spotify, said in a statement.

Voice-translated episodes from pilot creators will be available worldwide to Premium and Free users.

Cindy is a journalist at Metaverse Post, covering topics related to web3, NFT, metaverse and AI, with a focus on interviews with Web3 industry players. She has spoken to over 30 C-level execs and counting, bringing their valuable insights to readers. Originally from Singapore, Cindy is now based in Tbilisi, Georgia. She holds a Bachelor’s degree in Communications & Media Studies from the University of South Australia and has a decade of experience in journalism and writing. Get in touch with her via [email protected] with press pitches, announcements and interview opportunities.
