OpenAI has recently disclosed that its board has the authority to override decisions made by the CEO regarding the release of new AI models, even if these have been deemed safe by the company’s leadership.
This information was detailed in a set of guidelines unveiled on Monday, outlining the company’s strategy to address potential extreme risks posed by its most advanced AI systems.
While the leadership team can initially decide on the release of a new AI system, the board retains the right to reverse such decisions.
Sam Altman, OpenAI’s CEO, was recently dismissed and then dramatically reinstated, highlighting a curious power dynamic between the company’s directors and executives.
In the aftermath, many speculated that Altman hadn’t been paying sufficient attention to model safety, that he’d been ‘swept up in his work,’ so to speak. Later reports cast doubt on that explanation, including comments from Microsoft President Brad Smith, who said he thought it unlikely that safety was the main motive.
There was also the no small matter that OpenAI had received a huge valuation, and its employees wanted to cash in their stock options.
Business Insider alleged that OpenAI employees used their open letter, which stated they’d leave the company if Altman wasn’t reinstated, as a bargaining chip. One employee even reportedly called him a bad CEO, though these reports are unconfirmed.
OpenAI has now acted to quell fears that it isn’t taking AI safety seriously, first by publishing results from its new “superalignment” experiment and also by increasing the powers of its “preparedness” team.
OpenAI’s preparedness team, led by Aleksander Madry, who is currently on leave from MIT, continuously assesses AI systems across various risk categories, including cybersecurity and chemical, nuclear, and biological threats.
This team aims to identify and mitigate any significant dangers associated with the technology. According to the guidelines, risks classified as “catastrophic” could lead to substantial economic damage or severe harm to many individuals.
Madry explained the process, stating, “AI is not something that just happens to us that might be good or bad. It’s something we’re shaping.” He further expressed his hope that other companies would adopt OpenAI’s guidelines for risk assessment.
His team, which was formed in October as one of three separate groups overseeing AI safety at OpenAI, will evaluate unreleased AI models and classify the perceived risks as “low,” “medium,” “high,” or “critical.” Only models rated “medium” or “low” will be considered for release.
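To illustrate the release gating described above, here is a minimal sketch of that tier logic. This is not OpenAI’s implementation: the category names mirror those mentioned in the framework, but the function, the numeric ordering of tiers, and the example scores are all hypothetical.

```python
from enum import IntEnum


# Risk tiers as named in the Preparedness Framework; the numeric ordering
# here is an assumption made for comparison purposes.
class RiskLevel(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# Hypothetical helper: a model is eligible for release consideration only if
# every tracked risk category scores "medium" or lower.
def eligible_for_release(category_scores: dict[str, RiskLevel]) -> bool:
    return all(score <= RiskLevel.MEDIUM for score in category_scores.values())


if __name__ == "__main__":
    scores = {"cybersecurity": RiskLevel.MEDIUM, "cbrn": RiskLevel.LOW}
    print(eligible_for_release(scores))  # True: no category exceeds "medium"
```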
Announcing the new policies on X, OpenAI said, “We are systemizing our safety thinking with our Preparedness Framework, a living document (currently in beta) which details the technical and operational investments we are adopting to guide the safety of our frontier model development.”
In addition to these measures, OpenAI has an internal safety advisory group that reviews Madry’s team’s findings and provides recommendations to Altman and the board.