Recent events at OpenAI surrounding the reinstatement of CEO Sam Altman have sparked discussion of a confidential project within the company that reportedly promises a new and powerful approach to problem-solving.
According to two separate reports, this undisclosed venture, referred to as Q*, has drawn attention for its potential to tackle complex problems in a new and powerful way. Reuters, citing an anonymous source, reported that, given extensive computational resources, the project managed to solve certain mathematical problems, albeit only at roughly the level of elementary school students. Despite the basic nature of those tasks, the achievement left researchers optimistic about Q*'s future potential.
Another report, by The Information, described Q* as a groundbreaking initiative that could pave the way for significantly more powerful artificial intelligence models. However, some researchers focused on AI safety found the project's rapid pace of development worrisome.
Speculation around Q* intensified over the Thanksgiving weekend, fueled in part by its cryptic name, which lent an aura of mystery to a project already shrouded in secrecy. While some reports suggested that researchers had raised concerns about Q*'s potential power with the board that removed Altman, conflicting sources claim otherwise.
Altman indirectly acknowledged the project's existence when asked about Q* in an interview with The Verge, but declined to provide specifics, stating, “No particular comment on that unfortunate leak.”
The nature of Q* remains elusive, but analysis of the initial reports, together with the current challenges facing AI, hints at a possible link to a project OpenAI announced in May. That earlier work, involving Ilya Sutskever, OpenAI's chief scientist and co-founder, focused on improving the logical accuracy of large language models through a technique called "process supervision."
Process supervision involves training AI models to work through problems systematically, with the aim of correcting the errors large language models commonly make, particularly on basic math queries. The approach showed potential to significantly enhance these models' problem-solving capabilities.
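The contrast between process supervision and conventional outcome supervision can be sketched in a toy example (the verifier and function names below are illustrative assumptions, not OpenAI's implementation): instead of rewarding only a correct final answer, each intermediate step of a solution is graded, so the feedback pinpoints where a chain of reasoning went wrong.

```python
def check_step(expr, claimed):
    """A stand-in verifier (assumed, for illustration): re-compute each
    intermediate expression and compare it with the model's claimed value."""
    return eval(expr) == claimed

def outcome_score(steps, expected):
    """Outcome supervision: only the final answer is graded."""
    return 1.0 if steps[-1][1] == expected else 0.0

def process_score(steps):
    """Process supervision (toy sketch): every intermediate step is graded,
    giving localized feedback about where the reasoning slipped."""
    marks = [1.0 if check_step(expr, claimed) else 0.0 for expr, claimed in steps]
    return sum(marks) / len(marks)

# A three-step "solution" to (2 + 3) * 4 - 1 whose last step is wrong:
steps = [("2 + 3", 5), ("5 * 4", 20), ("20 - 1", 18)]  # 20 - 1 is 19, not 18
```

Under outcome supervision this solution scores zero, while process supervision credits the two correct steps, a richer training signal for step-by-step reasoning.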
Experts such as Andrew Ng see improving large language models' math as a natural next step toward making them more useful. While acknowledging the models' current deficiencies in arithmetic, Ng draws a parallel to humans, who also compute better with pen and paper, and suggests that fine-tuning models equipped with memory could boost their arithmetic abilities.
The enigmatic name Q* may draw inspiration from Q-learning, a reinforcement learning method that teaches algorithms to solve problems via reward feedback. It may also nod to the A* search algorithm, known for finding optimal paths, which could offer further clues to Q*'s nature.
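For readers unfamiliar with the technique, Q-learning can be illustrated with a minimal tabular example (the corridor environment and hyperparameters below are arbitrary choices for demonstration): an agent learns, purely from reward feedback, which action is best in each state.

```python
import random

def train_q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a tiny 1-D corridor: states 0..n_states-1,
    actions 0 (left) and 1 (right); reward 1 only for reaching the last state."""
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # Q-table: q[state][action]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:
            # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
            if random.random() < epsilon:
                action = random.choice([0, 1])
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # Q-learning update: immediate reward plus discounted best future value.
            q[state][action] += alpha * (reward + gamma * max(q[next_state])
                                         - q[state][action])
            state = next_state
    return q
```

After training, the greedy policy (pick the higher-valued action in each state) walks straight to the rewarding end of the corridor; the "feedback" in the article's sense is nothing more than this reward signal propagating backward through the table.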
According to The Information's report, the new models were trained on computer-generated data rather than real-world data such as text or images pulled from the internet, pointing to an approach built on synthetic training data. This method has emerged as a means of training more robust AI models.
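As a minimal sketch of what synthetic training data can look like (the generator below is a hypothetical illustration, not The Information's description of OpenAI's pipeline), question-answer pairs for elementary arithmetic can be manufactured by a program, so the labels are exact by construction and the supply is effectively unlimited.

```python
import random

def make_synthetic_arithmetic(n_examples, seed=42):
    """Generate computer-made (question, answer) pairs for elementary
    arithmetic -- no internet text required, and every label is exact."""
    random.seed(seed)
    data = []
    for _ in range(n_examples):
        a, b = random.randint(0, 99), random.randint(0, 99)
        op = random.choice(["+", "-", "*"])
        question = f"{a} {op} {b} ="
        answer = str(eval(f"{a} {op} {b}"))  # ground truth is computed, not scraped
        data.append((question, answer))
    return data
```

Because the generator controls both the question and its answer, such data can be produced at whatever scale the training run demands, which is one reason synthetic data is attractive for teaching models narrow skills.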
Experts such as Subbarao Kambhampati speculate that Q* could combine extensive synthetic data with reinforcement learning to train large language models on specific tasks, such as elementary arithmetic. Whether the approach would generalize to problem-solving across all of mathematics, however, remains uncertain.
Further conjecture holds that Q* might use reinforcement learning, among other methods, to improve large language models' ability to reason through tasks step-by-step, potentially sharpening their performance on mathematical challenges. Whether any of this amounts to AI systems slipping beyond human control, however, remains far from clear.