OpenAI, the company behind ChatGPT, is developing a new approach to its artificial intelligence models in a project code-named “Strawberry,” according to a source familiar with the matter and internal documents reviewed by Reuters. The project comes as OpenAI, backed by Microsoft, strives to demonstrate that its models can exhibit advanced reasoning abilities. According to a recent internal OpenAI document seen by Reuters in May, teams within the company are working on Strawberry. The document outlines a research plan for how OpenAI intends to use Strawberry. The exact date of the document is unclear, and the source described the plan as a work in progress. How Strawberry works is kept closely guarded even within OpenAI.
The document describes a project that aims to enable OpenAI’s AI models not only to respond to queries but also to plan ahead and navigate the internet autonomously in order to perform what OpenAI calls “deep research.” This capability has so far eluded AI models, according to interviews with more than a dozen AI researchers. An OpenAI spokesperson stated, “We want our AI models to see and understand the world more like we do. Continuous research into new AI capabilities is a common practice in the industry, with a shared belief that these systems will improve in reasoning over time.” The spokesperson did not comment directly on Strawberry.
Strawberry was previously known as Q*, which Reuters reported last year was already considered a breakthrough within the company. Sources described seeing Q* demos earlier this year in which the system answered complex science and math questions beyond the reach of current commercial models. OpenAI has internally tested AI models that scored over 90% on the MATH dataset, a benchmark of competition math problems, though it is unclear whether those results are related to the Strawberry project. At an internal all-hands meeting, OpenAI showcased a research project with new human-like reasoning skills, according to Bloomberg.
OpenAI aims to significantly enhance its AI models’ reasoning capabilities through Strawberry. Researchers believe that improving reasoning is crucial for AI to achieve human-level or superhuman intelligence. Large language models can already summarize texts and compose prose quickly, but they often struggle with common-sense problems. AI researchers generally agree that reasoning requires an AI system to plan ahead, reflect how the physical world functions, and work through multi-step problems reliably. OpenAI’s Strawberry project is seen as a key component in overcoming these challenges.
In recent months, OpenAI has signaled to developers and other parties that it is close to releasing technology with advanced reasoning capabilities. Strawberry includes a specialized post-training method for OpenAI’s generative AI models, meaning the models are adapted after their initial training to improve their performance in specific ways. The method resembles “Self-Taught Reasoner” (STaR), developed at Stanford in 2022, which lets AI models iteratively create their own training data. According to the document, Strawberry is meant to enable long-horizon tasks, which require a model to plan ahead and carry out a series of actions over an extended period. OpenAI is creating, training, and evaluating the models on what it calls a “deep-research” dataset, with the goal of having them conduct research on the web autonomously with the help of a computer-using agent.
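For readers who want a concrete picture of what "iteratively creating their own training data" can mean, the following is a minimal toy sketch of a STaR-style loop. It is not OpenAI's method, whose details are not public, and it simplifies the published STaR procedure (it omits the "rationalization" step and does not restart from the base model each round). `ToyModel`, `star_loop`, and the addition problems are all hypothetical stand-ins chosen for illustration; a dictionary plays the role of fine-tuned weights.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a language model (assumption, not a real API)."""
    # Maps a question to the (rationale, answer) pair it was "fine-tuned" on.
    memory: dict = field(default_factory=dict)

    def generate(self, question, rng):
        """Return a (rationale, answer) pair for an addition question (a, b)."""
        if question in self.memory:
            return self.memory[question]
        a, b = question
        # An untrained model guesses, and is sometimes off by one.
        guess = a + b + rng.choice([-1, 0, 1])
        return f"{a} + {b} = {guess}", guess

    def fine_tune(self, examples):
        """Toy 'fine-tuning': memorize the kept rationales."""
        for question, rationale, answer in examples:
            self.memory[question] = (rationale, answer)


def star_loop(model, problems, iterations, seed=0):
    """Generate rationales, keep those with correct answers, train, repeat."""
    rng = random.Random(seed)
    history = []  # number of kept (correct) rationales per round
    for _ in range(iterations):
        kept = []
        for question, gold in problems:
            rationale, answer = model.generate(question, rng)
            if answer == gold:  # filter: only correct answers become data
                kept.append((question, rationale, answer))
        model.fine_tune(kept)  # the model trains on its own filtered output
        history.append(len(kept))
    return model, history


if __name__ == "__main__":
    problems = [((a, b), a + b) for a in range(4) for b in range(4)]
    _, history = star_loop(ToyModel(), problems, iterations=8)
    print("correct rationales kept per round:", history)
```

The point of the sketch is the filter-then-train cycle: only self-generated reasoning that reaches a verified answer is fed back in, so the count of solved problems can only hold steady or grow from one round to the next.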