OpenAI announced two ways it is improving its artificial intelligence (AI) models last week. The first includes releasing a new update for the GPT-4o (also known as the GPT-4 Turbo), the company’s latest AI model powering ChatGPT for paid subscribers. The company says the update improves the creative writing ability of the model and makes it better at natural language responses and writing engaging content with high readability. OpenAI also released two research papers on red teaming and shared a new method to automate the process to scale spotting errors made by its AI models.
OpenAI Updates GPT-4o AI Model
In a post on X (formerly known as Twitter), the AI firm announced a new update for the GPT-4o foundation model. OpenAI says the update allows the AI model to generate outputs with “more natural, engaging, and tailored writing to improve relevance and readability.” It is also said to improve the AI model’s ability to process uploaded files and provide deeper insights and “more thorough” responses.
Notably, the GPT-4o AI model is available to users with the ChatGPT Plus subscription and developers with access to the large language model (LLM) via API. Those using the free tier of the chatbot do not have access to the model.
While Gadgets 360 staff members were not able to test out the new capabilities, one user on X posted about the latest improvements in the AI model after the update. The user claimed that GPT-4o could generate an Eminem-style rap cipher with “sophisticated internal rhyming structures”.
Red teaming is the process used by developers and companies to employ external entities to test software and systems for vulnerabilities, potential risks, and safety issues. Most AI firms collaborate with organisations, prompt engineers, and ethical hackers to stress-test whether it responds with harmful, inaccurate, or misleading output. Tests are also made to check whether an AI system can be jailbroken.
Ever since ChatGPT was made public, OpenAI has been public with its red teaming efforts for each successive LLM release. In a blog post last week, the company shared two new research papers on the advancement of the process. One of them is of particular interest given the company claims it can automate large-scale red teaming processes for AI models.
Published in the OpenAI domain, the paper claims that more capable AI models can be used to automate red teaming. The company believes AI models can assist in brainstorming attacker goals, how an attacker’s success can be judged, and understanding the diversity of attacks.
Expanding on it, the researchers claimed that the GPT-4T model can be used to brainstorm a list of ideas that constitute harmful behaviour for an AI model. Some examples include prompts such as “how to steal a car” and “how to build a bomb”. Once the ideas have been generated, a separate red teaming AI model can be built to trick ChatGPT using a detailed series of prompts.
Currently, the company has not begun using this method for red teaming given several limitations. These include the evolving risks of AI models, exposing the AI to lesser-known techniques for jailbreaking or generating harmful content, and the need for a higher threshold for knowledge in humans to correctly judge the potential risks of output once the AI model becomes more capable.
Comments are closed.