OpenAI has announced that GPT-4 could modernize content moderation, reducing the need for human oversight.
The internet is constantly moderated to remove harmful, hateful, or otherwise unacceptable content, and while algorithms already work behind the scenes to automate the process, human insight remains invaluable.
Content moderators are charged with this responsibility and must sort through at-times traumatic content depicting suicide, torture, and murder.
OpenAI envisions a future where AI streamlines online content moderation per platform-specific guidelines, significantly reducing the pressure on human moderators.
In a blog post, they state, “We believe this offers a more positive vision of the future of digital platforms, where AI can help moderate online traffic according to platform-specific policy and relieve the mental burden of human moderators, of which there are likely hundreds of thousands worldwide.”
This is a salient topic, as OpenAI was recently embroiled in a scandal involving content moderators working for data services company Sama in Nairobi, Kenya.
Workers had to sort through graphic text content to improve ChatGPT’s ‘alignment’ – the practice of steering AI outputs toward ‘desirable’ ethical, moral, and political boundaries – a highly subjective exercise.
The content moderation team reported traumatic and unfair working conditions and petitioned the Kenyan government, culminating in a lawsuit.
OpenAI says GPT-4 could help craft personalized content policies and apply them to content at scale.
Contrasting GPT-4 with manual moderation, OpenAI highlighted the AI’s proficiency in offering consistent labeling and rapid feedback, explaining, “People may interpret policies differently or some moderators may take longer to digest new policy changes, leading to inconsistent labels. In comparison, LLMs are sensitive to granular differences in wording and can instantly adapt to policy updates to offer a consistent content experience for users.”
However, despite the potential of GPT-4 to alleviate the burden placed on content moderators, OpenAI admitted that fully automating the process is probably not possible, stating, “As with any AI application, results and output will need to be carefully monitored, validated, and refined by maintaining humans in the loop.”
How OpenAI intends to leverage GPT-4 for content moderation
Digital platforms face an ongoing challenge: moderating vast amounts of content quickly and accurately.
Historically, the heavy lifting fell to human moderators, with potentially disastrous psychological consequences, often coupled with low pay.
OpenAI seeks to leverage GPT-4 to automatically implement policies to limit harmful content. The company highlighted the following benefits:
- Speed: Using GPT-4, content policy changes that used to take months are now done in hours.
- Consistency: Human interpretation of content policies can vary, leading to inconsistencies. GPT-4 offers a standardized approach by adapting to policy adjustments.
- Mental well-being: Automating much of the content moderation process with GPT-4 can reduce the emotional strain on human moderators, who often encounter harmful or offensive content.
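The workflow OpenAI describes – embedding a platform-specific policy in a prompt, having the model label content, and keeping humans in the loop for anything uncertain – can be sketched as follows. This is a minimal illustration, not OpenAI’s actual pipeline: `call_llm` is a hypothetical placeholder standing in for a real GPT-4 API call, and the policy text and label names are invented for the example.

```python
# Sketch of LLM-based content moderation: a platform-specific policy is
# embedded in a prompt, the model returns a label, and unexpected output
# is routed to human review (keeping "humans in the loop").

POLICY = """\
Label the content with exactly one category:
- ALLOW: content that complies with platform rules
- FLAG: content that may violate rules and needs human review
- BLOCK: content that clearly violates rules
Respond with the category name only."""


def build_prompt(policy: str, content: str) -> str:
    """Combine the platform policy and the user content into one prompt.

    Updating POLICY updates moderation behavior immediately, which is the
    'policy changes in hours, not months' benefit described above.
    """
    return f"{policy}\n\nContent:\n{content}\n\nLabel:"


def parse_label(response: str) -> str:
    """Extract the model's label; route anything unexpected to human review."""
    label = response.strip().upper()
    return label if label in {"ALLOW", "FLAG", "BLOCK"} else "FLAG"


def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real GPT-4 call; always flags here."""
    return "FLAG"


if __name__ == "__main__":
    prompt = build_prompt(POLICY, "example post text")
    print(parse_label(call_llm(prompt)))  # prints "FLAG"
```

Note the design choice in `parse_label`: any response that is not a recognized label defaults to FLAG rather than ALLOW, so ambiguous model output falls back to human review instead of silently passing content through.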
However, OpenAI admitted that the content moderation landscape moves quickly, as people constantly invent new ways of circumventing filters, e.g., using new slang terms to evade detection.
Moreover, bias remains a concern, as GPT-4’s decisions might reflect the biases of its training data. This could lead to blind spots or unpredictable treatment of some content.
It’s worth noting that the Kenyan content moderators were performing that task to help align ChatGPT.
So, even using AI to moderate content ultimately requires some level of human exposure.