OpenAI has added vision capabilities to ChatGPT’s Advanced Voice Mode, a significant enhancement. ChatGPT can now see and respond to live video input and view a user’s display through screen sharing during conversations. Although the capability was announced alongside GPT-4o, only the audio feature had been available until now. Users can now chat with ChatGPT using their smartphone camera, which means the chatbot can see what they see.
This upgrade allows ChatGPT to interpret and discuss visual context in real time. The feature is accessed through a new video icon in the mobile app, with screen sharing available via a separate menu option. The improvements are already available to ChatGPT Plus, Pro, and Team subscribers, with Enterprise and Edu users receiving access in January.
The AI powerhouse has also added a limited-time holiday voice option that lets users talk with Santa through early January.
OpenAI CPO Kevin Weil and other members of the AI startup demonstrated how ChatGPT can assist with making pour-over coffee. During the demo, the team pointed the camera at the process, and the bot showed that it understood the coffee maker by talking them through the brewing steps. The team also demonstrated the new screen-sharing capability by having ChatGPT read an open letter on a smartphone, and the bot recognized Weil as wearing a Santa beard.