OpenAI’s board announced the formation of a Safety and Security Committee, which is tasked with making recommendations on critical safety and security decisions for all OpenAI projects.
The committee is led by directors Bret Taylor (Chair), Adam D’Angelo, Nicole Seligman, and OpenAI’s CEO Sam Altman.
Aleksander Madry (Head of Preparedness), Lilian Weng (Head of Safety Systems), John Schulman (Head of Alignment Science), Matt Knight (Head of Security), and Jakub Pachocki (Chief Scientist) will also be on the committee.
OpenAI’s approach to AI safety has faced both external and internal criticism. Last year’s firing of Altman was supported by then-board member Ilya Sutskever and others, ostensibly over safety concerns.
Last week, Sutskever and Jan Leike of OpenAI’s “superalignment” team left the company. Leike specifically cited safety concerns as his reason for leaving, saying the company was letting safety “take a backseat to shiny products”.
Yesterday, Leike announced that he was joining Anthropic to work on oversight and alignment research.
I’m excited to join @AnthropicAI to continue the superalignment mission!
My new team will work on scalable oversight, weak-to-strong generalization, and automated alignment research.
If you’re interested in joining, my dms are open.
— Jan Leike (@janleike) May 28, 2024
Now Altman is not only back as CEO but also sits on the committee responsible for flagging safety issues. Former board member Helen Toner’s insights into why Altman was fired make you wonder how transparent he’ll be about any safety issues the committee uncovers.
Apparently the OpenAI board found out about the release of ChatGPT via Twitter.
❗EXCLUSIVE: “We learned about ChatGPT on Twitter.”
What REALLY happened at OpenAI? Former board member Helen Toner breaks her silence with shocking new details about Sam Altman’s firing. Hear the exclusive, untold story on The TED AI Show.
Here’s just a sneak peek: pic.twitter.com/7hXHcZTP9e
— Bilawal Sidhu (@bilawalsidhu) May 28, 2024
The Safety and Security Committee will use the next 90 days to evaluate and further develop OpenAI’s processes and safeguards.
The committee’s recommendations will be put to OpenAI’s full board for approval, and the company has committed to publishing the recommendations it adopts.
This push for additional guardrails comes as OpenAI says it has started training its next frontier model, which it expects will “bring us to the next level of capabilities on our path to AGI.”
No expected release date was offered for the new model, but training alone will likely take weeks, if not months.
In an update on its approach to safety published after the AI Seoul Summit, OpenAI said, “We won’t release a new model if it crosses a ‘Medium’ risk threshold from our Preparedness Framework, until we implement sufficient safety interventions to bring the post-mitigation score back to ‘Medium’.”
It said that more than 70 external experts were involved in red teaming GPT-4o before its release.
With 90 days before the committee presents its findings to the board, a new model whose training has only just begun, and a commitment to extensive red teaming, it looks like we’re in for a long wait before we finally get GPT-5.
Or do they mean they’ve just started training GPT-6?