ChatGPT’s Live Video Feature Spotted, Might Be Released Soon

ChatGPT might soon gain the ability to answer queries after looking through your smartphone’s camera. As per a report, evidence for the Live Video feature, which is part of OpenAI’s Advanced Voice Mode, was spotted in the latest ChatGPT for Android beta app. This capability was first demonstrated in May during the AI firm’s Spring Updates event. It allows the chatbot to access the smartphone’s camera and answer queries about the user’s surroundings in real time. While the emotive voice capability was released a couple of months ago, the company has so far not announced a possible release date for the Live Video feature.

ChatGPT Live Video Feature Discovered on Latest Beta Release

An Android Authority report detailed the evidence of the Live Video feature, which was found during an Android package kit (APK) teardown process of the app. Several strings of code relating to the capability were seen in the ChatGPT for Android beta version 1.2024.317.

Notably, the Live Video feature is part of ChatGPT’s Advanced Voice Mode, and it lets the AI chatbot process video data in real time to answer queries and interact with the user. With this, ChatGPT can look into a user’s fridge, scan the ingredients, and suggest a recipe. It can also analyse the user’s expressions and try to gauge their mood. In the demo, this was coupled with the emotive voice capability, which lets the AI speak in a more natural and expressive manner.

As per the report, one such string states, “Tap the camera icon to let ChatGPT view and chat about your surroundings,” which is the same description OpenAI gave for the feature during the demo.

Other strings reportedly include phrases such as “Live camera” and “Beta”, which suggest that the feature works in real time and that, while still under development, it will likely be released to beta users first.

Another string of code also includes an advisory for users to not use the Live Video feature for live navigation or decisions that can impact users’ health or safety.

The existence of these strings does not confirm that a release is imminent. However, after a delay of eight months, this is the first conclusive evidence that the company is working on the feature. Earlier, OpenAI claimed that the feature was being delayed in order to protect users.

Notably, Google DeepMind also demonstrated a similar AI vision feature at the Google I/O event in May. Part of Project Astra, the feature lets Gemini see the user’s surroundings using the device’s camera.

In the demo, Google’s AI tool could correctly identify objects, deduce current weather conditions, and even remember objects it saw earlier in the live video session. So far, the Mountain View-based tech giant has not given a timeline for when this feature might be introduced either.