The Open Source Initiative (OSI) has released an updated draft definition of what constitutes open-source AI and says Meta’s models don’t qualify despite the company’s claims.
Mark Zuckerberg has been vocal about Meta’s commitment to what he says is open-source AI. However, while models like Llama 3.1 are less opaque than the proprietary models from OpenAI or Google, discussions in the OSI community suggest Meta is using the term loosely.
At an online public town hall event on Friday, the OSI discussed the criteria it believes a truly open-source AI model should conform to. The OSI refers to these criteria as “4 Freedoms” and says an open-source AI “is an AI system made available under terms and in a way that grant the freedoms to:
- Use the system for any purpose and without having to ask for permission.
- Study how the system works and inspect its components.
- Modify the system for any purpose, including to change its output.
- Share the system for others to use with or without modifications, for any purpose.”
To make a model modifiable, the OSI’s open-source AI definition says its weights and source code must be open, and its training data set must be available.
Meta’s license imposes restrictions on how its models can be used, and the company has declined to release the data used to train them. If you accept the OSI as the custodian of what “open-source” means, then the implication is that Meta distorts the truth when it calls its models “open”.
The OSI is a California public benefit corporation that relies on community input to develop open-source standards. Some in that community have accused Mark Zuckerberg of “open washing” Meta’s models and of bullying the industry into accepting his definition rather than the OSI’s.
Shuji Sado, Chairman of Open Source Group Japan, said, “It’s possible that Zuckerberg has a different definition of Open Source than we do,” and suggested that the unclear legal landscape around AI training data and copyright could be the reason for this.
Words matter
This might all sound like an argument over semantics, but depending on which definition the AI industry adopts, there could be serious legal consequences.
Meta has had a tough time navigating the EU’s GDPR rules over its insatiable hunger for users’ social media data. Some claim that Meta’s loose definition of “open-source AI” is an attempt to skirt new laws like the EU AI Act.
The Act provides a limited exception for general-purpose AI models (GPAIMs) released under open-source licenses. These models are exempt from certain transparency obligations, although they must still publish a summary of the content used to train them.
On the other hand, the proposed SB 1047 California AI safety bill disincentivizes companies like Meta from aligning their models with the OSI definition. The bill mandates complex safety protocols for “open” models and holds developers liable for harmful modifications and misuse by bad actors.
SB 1047 defines open-source AI tools as “artificial intelligence model[s] that [are] made freely available and that may be freely modified and redistributed.” Does that mean an AI model that a user can fine-tune is “open”, or would the definition apply only if the model ticks all the OSI boxes?
For now, the vagueness gives Meta the marketing benefits of the “open-source” label and room to maneuver around incoming legislation. At some point, the industry will need to commit to a definition. Will it be set by a big tech company like Meta or by a community-driven organization like the OSI?