How the Oversight Board uses Zentropi to study policy impact at scale
The Oversight Board used Zentropi to analyze a large dataset of content that potentially violated Meta’s policies on human exploitation. The tool helped the Board reduce the project timeline from weeks to just days and surface borderline content that strengthened its analysis and conclusions.
The Oversight Board’s Policy Research
The Oversight Board is an independent body whose mission is to improve how Meta treats people and communities around the world. It provides an independent check on Meta’s content moderation, making binding decisions on the most challenging content issues, and delivers policy recommendations that push Meta to improve its rules, act more transparently and treat all users fairly.
As part of this work, the Oversight Board conducts ongoing research to understand how its policy recommendations impact Meta’s information environment and user behavior. Zentropi has both made this research more efficient and surfaced new research directions.
Nuance at scale
In 2024, the Oversight Board selected a case involving a video in which a make-up artist from Iran prepares a 14-year-old girl for her wedding. In its decision, the Board agreed with Meta that the content should have been taken down under the Human Exploitation policy. As part of the decision, the Board recommended that Meta update the policy to specify clearly that child marriage is a form of forced marriage, and therefore forbidden on Meta’s platforms under the policy’s prohibition on content which recruits, facilitates or exploits people through forced marriage.
In August 2025, Meta amended its policies in response. Following that change, the Oversight Board wanted to understand whether the recommendation had a meaningful impact on the amount of violating content still discoverable on Meta’s platforms. To help answer that question, the Board used Zentropi labelers to analyze data obtained from the Meta Content Library.
Because the Board did not know in advance exactly where violating content would appear or what form it would take, the team intentionally used broad child-marriage-related search terms to cast a wide net. That approach improved coverage, but it also produced a large dataset with a substantial amount of non-violating content.
The challenge was not just identifying clear violations. It was also understanding how this content actually manifested in practice: across languages, across contexts, and across borderline cases that raised difficult policy questions. Zentropi helped the team in three ways: it saved weeks on the research timeline, shaped the scope of the analysis and conclusions, and provided the multilingual capability that was essential for the project.
Scaled review on a tight timeline
The scope of the project was large in both dataset size and thematic range. That breadth was intentional. The team wanted to maximize the likelihood of capturing relevant content, even if that meant collecting a significant amount of noise alongside it.
Manually reviewing the entire dataset of over 100,000 posts would have taken weeks. Using Zentropi shrank that timeline to a few days. A dataset that might otherwise have been too large to analyze became the basis of a workable research process.
Exploration, not just final labeling
The Board expected Zentropi to help distinguish violating from non-violating content. What the team did not fully anticipate was how useful it would be for surfacing borderline cases.
Through manual review of posts labeled as violating, along with posts below a 95% confidence threshold, they identified trends in content that aligned with the spirit of the policy but did not clearly violate Meta’s written rules.
Those cases raised important research questions: How should platforms treat content where a person’s age is ambiguous, but the bride is described as “young” or a “girl”? Should fictional stories glorifying underage marriage be allowed? Should there be any cultural or religious exceptions for posts expressing explicit support for marriage involving children?
These questions helped shape the direction of their analysis and sharpened their conclusions. In practice, Zentropi was not only a labeling tool for final reporting. It also became an exploratory tool that helped them understand how difficult policy issues actually manifest on the platform.
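The review routing described above, where posts labeled as violating and posts below a 95% confidence threshold both go to manual review, could be sketched roughly as follows. This is only an illustration under stated assumptions: the `LabeledPost` record, its field names, the label strings, and the helper functions are hypothetical and do not represent Zentropi's output format or API, or the Board's actual tooling.

```python
from dataclasses import dataclass

# Assumption: 0.95 mirrors the 95% confidence threshold mentioned in the text.
REVIEW_THRESHOLD = 0.95


@dataclass(frozen=True)
class LabeledPost:
    # Hypothetical record shape, not Zentropi's actual output format.
    post_id: str
    label: str         # assumed values: "violating" or "non_violating"
    confidence: float  # labeler confidence in [0.0, 1.0]


def needs_manual_review(post, threshold=REVIEW_THRESHOLD):
    """True if the post was labeled violating or its confidence is below the threshold."""
    return post.label == "violating" or post.confidence < threshold


def split_for_review(posts, threshold=REVIEW_THRESHOLD):
    """Return (manual_review_queue, auto_accepted), preserving input order."""
    queue, accepted = [], []
    for post in posts:
        (queue if needs_manual_review(post, threshold) else accepted).append(post)
    return queue, accepted


# Made-up sample records, for illustration only.
sample = [
    LabeledPost("a", "violating", 0.99),      # labeled violating: reviewed
    LabeledPost("b", "non_violating", 0.80),  # low confidence: reviewed
    LabeledPost("c", "non_violating", 0.97),  # confident non-violating: accepted
]
queue, accepted = split_for_review(sample)
print([p.post_id for p in queue])     # ['a', 'b']
print([p.post_id for p in accepted])  # ['c']
```

Routing low-confidence posts to reviewers regardless of their label is what lets borderline cases surface, since those are the posts where the written policy and the content fit least cleanly.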
Multilingual review across scripts and regions
Most of the posts in the dataset were not in English. The content spanned multiple languages and scripts, including English, Persian, Arabic, Urdu, Hausa, Amharic, Tigrinya, Oromo and Indonesian.
Because the team’s internal language coverage was limited, multilingual capability was essential. Zentropi enabled them to apply a policy written in English across a broad and diverse set of languages, writing systems, and regional contexts, with high accuracy. That significantly expanded the scope of analysis they were able to conduct with confidence.
Applying these learnings to future research
As the Oversight Board continues to use Zentropi across projects, they are exploring additional ways to structure labelers depending on the research goal.
For this project, they used one broad policy framework across the full dataset in order to explore the boundaries of how the policy applied. In upcoming projects where the goal is less exploratory and more tied to a tightly scoped analysis, they will use multiple smaller, stricter labelers that can quickly and accurately classify different thematic segments of a dataset.
The ability to easily loosen or tighten policy application is a major advantage for the Board. It helps them surface edge cases, incorporate them into their research and identify policy blind spots much earlier in the process.