OpenAI logo with spiraling pastel colors (Image Credits: Bryce Durbin / TechCrunch)

ChatGPT Gains Real-Time Video Comprehension Following OpenAI’s Initial Demonstration Seven Months Ago

OpenAI has finally released the real-time video capabilities for ChatGPT that it demonstrated seven months ago. During a livestream on Thursday, the company announced that Advanced Voice Mode, ChatGPT’s human-like conversational feature, is getting vision.

ChatGPT Plus, Team, and Pro subscribers can now point their smartphones at objects and have ChatGPT respond in near real time. Advanced Voice Mode with vision can also interpret what is on a device’s screen through screen sharing. It can, for example, walk a user through a complicated settings menu or offer help with a difficult math problem.

To use the feature, tap the voice icon next to the ChatGPT chat bar, then tap the video icon at the bottom left to start video mode. To share your screen, tap the three-dot menu and select “Share Screen.” The rollout of Advanced Voice Mode with vision starts today, and OpenAI is aiming for full availability within a week. Not everyone will get access right away, however: ChatGPT Enterprise and Edu subscribers will have to wait until January, and there is no release timeline yet for users in the EU, Switzerland, Iceland, Norway, or Liechtenstein.

In a demo on CBS’s “60 Minutes,” OpenAI president Greg Brockman had Advanced Voice Mode with vision quiz Anderson Cooper on anatomy. As Cooper drew body parts on a blackboard, ChatGPT recognized what he was drawing. “The location is spot on,” ChatGPT said of his drawing of the brain, though it added that the shape needed some refinement.

The feature is impressive but not flawless. During this same demo, it made a mistake on a geometry problem, which suggests it can misinterpret what it sees. The road to launch was not smooth, either. Advanced Voice Mode with vision was delayed several times, partly because OpenAI announced it before it was ready. The company initially promised a spring release, then said it needed more development time. When Advanced Voice Mode arrived in early fall, it lacked the visual component, which is only shipping now.

Alongside Advanced Voice Mode with vision, OpenAI also launched a festive “Santa Mode,” which lets users hear ChatGPT speak in Santa’s voice. Users can find it by tapping the snowflake icon next to the prompt bar in the ChatGPT app.

Taken together, the updates make ChatGPT more interactive and pair a long-awaited technical capability with a playful seasonal feature.