Date: 2024-07-26

Key Takeaways

GPT-4o Voice Mode will make talking to ChatGPT feel more natural.

The new features include reduced response time and different tones of voice.

The initial rollout is limited to a select group of ChatGPT Plus subscribers, with a wider release expected in the fall.

After a longer-than-expected wait, Sam Altman of OpenAI has indicated in a reply on X that GPT-4o's new voice features will finally start rolling out next week. However, this alpha release will initially be limited to a small set of ChatGPT Plus subscribers, with the features likely to see a wider release sometime in the fall.

Back in May, OpenAI showcased GPT-4o, its new model. The demonstration included some impressive new capabilities, such as the ability to respond to information from a real-time video feed, and new voice features that would make talking to GPT-4o seem more like speaking to a human. When GPT-4o was released, however, the voice capabilities were missing, with messages in the app indicating that the new Voice Mode features would be rolling out soon. It now seems that the rollout is finally going to start.

GPT-4o Voice will make talking to ChatGPT feel much more natural

Voice will be more capable and will have some additional abilities

Even before the launch of GPT-4o, you could already talk to GPT-4 in Voice Mode, but one of the big drawbacks is that it's hard to have what feels like a natural conversation when there is an average delay of 5.4 seconds. You speak aloud, then have to watch the think bubble animation for a few seconds before you get any response.

The new GPT-4o Voice Mode will cut the average response time down to just 320 milliseconds and can go as low as 232 milliseconds. This allows you to have what feels like an instant back-and-forth conversation with GPT-4o. In the demonstrations during the announcement, the responses were impressively fast. It's also possible to interrupt the response just by speaking again; the voice response will stop and GPT-4o will start listening again.

Speed isn't the only change, however. It's also possible to get GPT-4o to vary its tone of voice and style of delivery. Demonstration videos show GPT-4o speaking in a sarcastic tone of voice, speaking like a sportscaster, counting to ten at different speeds, and even singing Happy Birthday. If the capabilities in the wild are as impressive as they are in the demonstrations, then it really will make talking to GPT-4o feel like talking to another person.

Voice Mode in GPT-4o is also capable of real-time translation. For example, one person can speak to GPT-4o in one language while a second person speaks to it in a different language. GPT-4o will then repeat each phrase in the other language, allowing two people who don't speak the same language to hold a conversation.

You'll probably have to wait a little longer for GPT-4o Voice Mode

The new features are only being released to a small group of ChatGPT Plus users

The initial release of the new features has been a long time coming. OpenAI stated in May that the features would be rolled out "within the coming weeks," but the number of weeks since the announcement has already hit double figures. However, the wait is almost over, for a small handful of people at least. As well as the confirmation from Sam Altman on X, the message within the ChatGPT app also states that OpenAI will "begin the alpha with a small group of Plus users in late July."

This small initial rollout means that even if you're a ChatGPT Plus user, it's highly unlikely that you're going to get access to the new Voice Mode features next week. However, the message also states that "the plan is for all Plus users to have access in the fall," so hopefully the rest of us won't have too much longer to wait. One thing is certain: when the new Voice Mode does drop, it's not going to sound anything like Scarlett Johansson.