OpenAI recently revealed GPT-4o ("o" for "omni"), an upgraded version of the flagship model behind ChatGPT, during a major announcement event. As the name suggests, this new iteration brings substantial enhancements over GPT-4, particularly in text, vision, and audio capabilities.
During a livestream event on Monday, OpenAI Chief Technology Officer Mira Murati highlighted that GPT-4o delivers significant speed improvements along with advances in text, vision, and audio processing. Notably, GPT-4o is accessible to all users for free, with paid users receiving usage limits up to five times higher than those of free users.
GPT-4o has initially been rolled out with text and image capabilities, with additional features to be introduced gradually, according to a recent blog post from OpenAI.
Unlike its predecessor GPT-3.5, GPT-4o is fully multimodal, enabling it to handle input in various forms such as text, voice, and images, as explained by OpenAI CEO Sam Altman. Altman also mentioned that developers keen to experiment with GPT-4o can access an API that is twice as fast and offered at half the price of GPT-4 Turbo.
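For developers curious what that looks like in practice, here is a minimal sketch of calling GPT-4o through OpenAI's chat-completions API using the official Python client. The model identifier "gpt-4o" and the mixed text-plus-image message format follow OpenAI's published interface conventions rather than anything stated in the announcement itself, so treat this as illustrative and verify against the current documentation:

```python
# Minimal sketch: a multimodal GPT-4o request via the OpenAI Python client.
# Assumes the model identifier "gpt-4o" and the standard chat-completions
# message format; the image URL is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            # Text and image parts can be combined in one message,
            # reflecting the model's multimodal input support.
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```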
ChatGPT’s future updates are expected to equip it with voice assistant functionality akin to the AI in the movie “Her” (2013), allowing real-time responses and environmental awareness via camera input. However, the current voice mode has limitations: it responds to only one prompt at a time and relies solely on audio input.
Overall, GPT-4o marks a significant leap forward in AI capabilities, promising a more immersive and versatile experience for users across text, voice, and visual interactions.