A study conducted by the Stanford Internet Observatory identified more than 3,200 images of suspected child sexual abuse in the LAION database, a large-scale index of online images and captions used to train AI image generators like Stable Diffusion.
In collaboration with the Canadian Centre for Child Protection and other anti-abuse organizations, the Stanford team reviewed the database and reported their findings to law enforcement. LAION contains billions of images obtained through unguided web-scraping.
Over 1,000 of these images were subsequently confirmed as child sexual abuse material. The information was published in a paper, “Identifying and Eliminating CSAM in Generative ML Training Data and Models.”
The researchers stated, “We find that having possession of a LAION‐5B dataset populated even in late 2023 implies the possession of thousands of illegal images,” underscoring the largely unverified and unchecked nature of internet-scraped datasets.
AI image generators have been implicated in a number of child sex abuse and pornography cases. A North Carolina man was recently sentenced to 40 years in prison after being found in possession of AI-generated child abuse imagery, in what may be the first conviction of its kind anywhere in the world.
LAION, an abbreviation for Large-scale Artificial Intelligence Open Network, promptly removed its datasets from public access.
LAION then issued a statement emphasizing its zero-tolerance policy for illegal content and its commitment to ensuring the safety of its datasets before republishing them.
Because this data was used to train popular models, those models can draw on it to generate entirely new abusive content, which is already happening. An investigation found that people are creating these types of images and selling them on sites like Patreon.
Researchers noted that AI tools are also likely synthesizing criminal content by merging two separate categories of online imagery: adult pornography and benign photos of children.
David Thiel, the chief technologist at the Stanford Internet Observatory and the report’s author, stressed how these issues arise, pointing to the rushed deployment of many AI projects in the competitive tech landscape.
He stated in an interview, “Taking an entire internet-wide scrape and making that dataset to train models is something that should have been confined to a research operation, if anything, and is not something that should have been open-sourced without a lot more rigorous attention.”
The Stanford Internet Observatory has urged those building training sets based on LAION‐5B to either delete them or collaborate with intermediaries to cleanse the material. It also recommends making older versions of Stable Diffusion, particularly those known for generating explicit imagery, less accessible online.
Stability AI stated that they only host filtered versions of Stable Diffusion and have taken proactive steps to mitigate risks of misuse.
Lloyd Richardson, the IT director at the Canadian Centre for Child Protection, commented on the irreversible nature of the problem, saying, “We can’t take that back. That model is in the hands of many people on their local machines.”
Past research into LAION
Stanford’s study is not the first investigation into databases like LAION.
In 2021, computer science researchers Abeba Birhane, Vinay Uday Prabhu, and Emmanuel Kahembwe published “Multimodal datasets: misogyny, pornography, and malignant stereotypes,” which analyzed the LAION-400M image dataset.
Their paper states, “We found that the dataset contains troublesome and explicit images and text pairs of rape, pornography, malign stereotypes, racist and ethnic slurs, and other extremely problematic content.”
This study also found that the labels used for images often mirrored or represented conscious and unconscious bias, which, in turn, inflicts bias onto the AI models that data is used to train.
A substantial body of past research has examined the link between biased datasets and biased model outputs, with impacts including sexist or gender-biased models rating women’s skills as lower value than men’s, discriminatory and inaccurate facial recognition systems, and even failures in medical AI systems designed to examine potentially cancerous skin lesions in people with darker skin.
So, beyond abusive child-related material facilitating illicit uses of AI models, dataset issues propagate throughout the machine learning lifecycle and can ultimately threaten people’s freedom, social standing, and health.
Reacting to the Stanford study on X, Abeba Birhane, a co-author of the paper above and of other studies examining LAION and the impact of underlying data on model outputs, pointed out that Stanford hadn’t sufficiently acknowledged past research on the topic.
me & my collaborators have done the most extensive research of the LAION datasets (3 academic papers & the first to investigate dataset in 2021 showing misogyny, pornography, & malignant stereotypes)
yet, the Stanford study has not cited us once. this is academic misconduct https://t.co/pzhL8b3wBt
— Abeba Birhane (@Abebab) December 21, 2023
Birhane stresses that this is a systemic issue, with academic strongholds like Stanford tending to depict their research as pioneering when this often isn’t the case.
For Birhane, this reflects the broader problem of ‘erasure’ in academia, where research conducted by those with diverse backgrounds and outside of the US techscape is seldom given fair credit.
In October, we published an article on AI colonialism, demonstrating how AI knowledge, assets, and datasets are hyperlocalized in a select few regions and academic institutions.
Taken together, linguistic, cultural, and ethnic diversity are becoming progressively and systematically underrepresented in the industry, in research, in data, and, in turn, in model outputs.
For some in the industry, this is a ticking time bomb. When training extremely powerful ‘superintelligent’ models or artificial general intelligence (AGI), the consequences of such content in training datasets could be far-reaching.
As Birhane and co-researchers pointed out in their study: “There is a growing community of AI researchers that believe that a path to Artificial General Intelligence (AGI) exists via the training of large AI models with ‘all available data.’”

“The phrase ‘all available data’ often encompasses a large trove of data collected from the WWW (i.e. images, videos, and text)…[as seen] this data includes images and text that grossly misrepresent groups such as women, embodies harmful stereotypes, overwhelmingly sexualize Black women, and fetishize Asian women. Additionally, large scale internet collected datasets also capture illegal content, such as images of sexual abuse, rape and non-consensual explicit images.”
AI companies react to the Stanford study
OpenAI clarified that it did not use the LAION database and has fine-tuned its models to refuse requests for sexual content involving minors.
Google, which used a LAION dataset to develop its text-to-image Imagen model, decided against making it public after an audit revealed a range of inappropriate content.
The legal risks AI developers expose themselves to when using datasets indiscriminately and without proper due diligence are potentially enormous.
As Stanford suggests, developers need to be more careful about their responsibilities when creating AI models and products.
Beyond that, there is a critical need for AI companies to better engage with research communities and model developers to stress the risk of exposing models to such data.
As previous research has shown, ‘jailbreaking’ models, coaxing them into bypassing their guardrails, is straightforward.
For example, what might happen if someone were to jailbreak an extremely intelligent AGI system trained on child abuse, discriminatory material, torture, and so on?
It’s a question the industry finds awkward to answer. Constantly referring to guardrails that are repeatedly exploited and manipulated is a stance that might wear thin.