Moderator handbook — policy, protocol, psychological safety
Content moderation is high-stakes work. OraData offers this service to AI companies and social networks. These are the non-negotiable rules every moderator must know before their first batch.
1. Policy — you apply the client's, not your own
OraData doesn't invent the policy. Each client (social network, AI company) ships their own rubric: what's allowed, what's banned, the gray zones. Your job is to apply it consistently, not to debate it.
- Read the batch policy BEFORE touching the first item.
- Unsure on a case → mark "unsure" and pass to an arbiter — never guess.
- Your personal opinions (political, religious, moral) stay out of the batch. If a batch directly clashes with your values, you can refuse — your right, no penalty.
2. The 5 standard categories
- Hate speech: targets a protected group (ethnicity, religion, orientation, disability). Satire vs harassment = judgment.
- Violence: real vs artistic vs news documentation. Default: keep public-interest documentation, flag gratuitous violence.
- NSFW: nudity. Historical art → keep. Explicit → per client policy. Minors → CSAM protocol (§4).
- Misinformation: objectively falsifiable. Verified false → label. Contestable opinion → out of scope.
- Self-harm / suicide: always priority — resource banner + urgent moderation, never ignore.
3. Double-blind decision protocol
Any irreversible decision (permanent delete, uploader ban) requires double-blind validation: two qualified moderators, neither sees the other's call. Disagreement → senior arbiter. Minimum inter-moderator agreement: Cohen's Kappa ≥ 0.75.
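To make the threshold concrete, here is a minimal sketch of how the agreement check could be computed, assuming each moderator's calls arrive as a simple list of labels (the label set and the batch below are hypothetical):

```python
from collections import Counter

def cohens_kappa(calls_a: list[str], calls_b: list[str]) -> float:
    """Chance-corrected inter-moderator agreement: kappa = (po - pe) / (1 - pe)."""
    assert len(calls_a) == len(calls_b) and calls_a
    n = len(calls_a)
    # Observed agreement: fraction of items where both moderators made the same call.
    po = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Expected chance agreement, from each moderator's marginal label frequencies.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    pe = sum(freq_a[lbl] * freq_b[lbl] for lbl in freq_a.keys() & freq_b.keys()) / n**2
    return 1.0 if pe == 1.0 else (po - pe) / (1 - pe)

# Hypothetical batch of 8 calls with a single disagreement (the last item).
a = ["keep"] * 4 + ["block"] * 3 + ["unsure"]
b = ["keep"] * 4 + ["block"] * 4
print(round(cohens_kappa(a, b), 2))  # 0.78 -> meets the 0.75 threshold
```

Raw percent agreement (7/8 here) always overstates reliability; kappa discounts the agreement two moderators would reach by labelling at random.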
4. CSAM — immediate escalation protocol
- Suspected content involving a minor: stop moderating the item immediately and escalate via the OraData hotline (country-specific reporting channels in §10).
- Never delete alone: deletion destroys evidence that investigators need. Escalation is the only valid action.
- Strict confidentiality applies (§6): don't describe the content to anyone outside the hotline chain.
5. Psychological safety — non-negotiable
- Daily exposure cap: enforced by the platform. Once hit, you cannot accept more moderation tasks today (see the sketch after this list).
- Mandatory break every N decisions/hour (dynamic per batch difficulty).
- Mandatory rotation after a graphic batch — 24h minimum before another sensitive batch.
- Persistent stress: Settings → Moderator wellbeing → free consultation with partner psychologist (video, your choice of language).
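As a rough illustration of how the platform-side gate might work (the limit values below are hypothetical; the real ones are set per batch):

```python
from dataclasses import dataclass

@dataclass
class ModeratorSession:
    decisions_this_hour: int = 0
    exposure_today_min: int = 0
    break_every: int = 30       # hypothetical: forced break after N decisions/hour
    daily_cap_min: int = 240    # hypothetical: daily exposure cap in minutes

    def can_accept_task(self) -> bool:
        # Daily cap: once hit, no more moderation tasks today. Not bypassable.
        return self.exposure_today_min < self.daily_cap_min

    def needs_break(self) -> bool:
        # Mandatory pause once the per-hour decision count is reached.
        return self.decisions_this_hour >= self.break_every
```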
6. Confidentiality
- NDA signed before your first batch. Content seen during moderation stays confidential — can't discuss outside a session, not even with loved ones.
- Client request to reveal moderator identity: denied unless legally documented.
- Conflict of interest (you recognize someone): notify support immediately, don't moderate.
7. Pay
Moderation pays per validated batch; the rate scales with volume and inter-moderator agreement. Extreme content carries a 1.5× multiplier. Quality bonus if your consistency score on a batch exceeds 95%. You can refuse any batch with no explanation or penalty; forced exposure does not exist on OraData.
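A worked example of the pay arithmetic, with the base rate and the size of the quality bonus as stated assumptions (the actual weighting is OraData's and isn't published here):

```python
def batch_pay(base_rate: float, items: int, agreement: float,
              extreme: bool, consistency: float) -> float:
    """Illustrative only: scales with volume and agreement, per section 7."""
    pay = base_rate * items * agreement      # volume x inter-moderator agreement
    if extreme:
        pay *= 1.5                           # extreme-content multiplier
    if consistency > 0.95:
        pay *= 1.10                          # hypothetical 10% quality bonus
    return round(pay, 2)

# 400 items at a hypothetical 0.08 base, 0.82 agreement, extreme batch, 97% consistency:
print(batch_pay(0.08, 400, 0.82, extreme=True, consistency=0.97))  # 43.3
```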
8. Chain of responsibility
- Client: defines policy + endorses final decisions.
- OraData: provides platform, trained moderators, traceability.
- Moderator: applies policy + flags gray zones.
- If a decision is legally challenged, all 3 signatures are on it. No single link is the sole bearer.
9. Continuous training
Mandatory refresher every 3 months: policy re-read + new training module + psychologist check-in. You can refuse a campaign whose policy you don't endorse — no penalty, no tier drop.
10. Legal frameworks you should know
Content you moderate may fall under different legal regimes depending on the uploader's and client's country. You are not a lawyer — but you must know when to escalate.
- European Union: DSA (Digital Services Act) + GDPR. Platforms must act expeditiously on illegality notices; the EU code of conduct on illegal hate speech targets review within 24h.
- United States: Section 230 protects the platform; reporting apparent CSAM to NCMEC is mandatory, and report data must be preserved for 90 days (18 USC § 2258A).
- France / Belgium / Switzerland: CSAM escalation via pharos.gouv.fr + Child Focus + OraData hotline contact.
- CEMAC: country-specific content-illegality legislation is still being formalized. In doubt, escalate to an OraData arbiter.
- Every country: distributing sexual images of minors is a criminal offense. No cultural exception.
11. Decision tree — graphic violence
- Is it public-interest documentation (armed conflict, disaster, public accident)? → Keep + add trigger warning + blur non-consented identities.
- Is it artistic representation (film, theater, fiction)? → Apply age-rating per client policy. Keep if rating is consistent.
- Is it gratuitous violence targeting an identifiable real person (filmed beating for humiliation, torture)? → Block immediately + flag uploader.
- Is it glorifying (celebrating a violent act, inciting imitation)? → Block + immediate uploader ban.
- Ambiguous? → Mark unsure, pass to arbiter. NEVER GUESS. (The full tree is sketched in code below.)
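The same tree, condensed into a sketch; the attribute names on `item` are illustrative placeholders, not real platform fields:

```python
def review_graphic_violence(item) -> str:
    """Sketch of the section 11 tree; attribute names are hypothetical."""
    if item.public_interest_documentation:   # armed conflict, disaster, public accident
        return "keep + trigger warning + blur non-consented identities"
    if item.artistic_fiction:                # film, theater, fiction
        return "keep" if item.age_rating_consistent else "re-rate per client policy"
    if item.targets_identifiable_person:     # filmed humiliation, torture
        return "block + flag uploader"
    if item.glorifies_violence:              # celebrates the act or incites imitation
        return "block + ban uploader"
    return "unsure -> arbiter"               # never guess
```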
12. Decision tree — hate speech
- Does it target a group protected by client policy (ethnicity, religion, orientation, disability, gender, nationality)?
- If no → out of hate scope, may fit another category.
- If yes → is it factual (statistic, neutral observation)? → Keep.
- Is it satirical with a target clearly framed by context (recognized comedian, editorial context)? → Keep + flag for client review if borderline.
- Does it directly incite violence or discrimination ("they should be", "death to")? → Block immediately + flag uploader.
- Is it defamation / targeted harassment of an identified individual? → Block + refer to legal support via arbiter.
- Language you don't fluently understand? → Don't guess, hand off to a native-speaker moderator. (Tree sketched in code below.)
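Another hedged way to encode this tree is an ordered rule table, first match wins, which makes the evaluation order explicit; the predicate names are again hypothetical:

```python
from typing import Any, Callable

# Section 12 tree as an ordered rule table: the first matching predicate wins.
HATE_RULES: list[tuple[Callable[[Any], bool], str]] = [
    (lambda i: not i.targets_protected_group, "out of hate scope, may fit another category"),
    (lambda i: i.is_factual,                  "keep"),
    (lambda i: i.is_contextual_satire,        "keep + flag for client review if borderline"),
    (lambda i: i.incites_violence,            "block + flag uploader"),
    (lambda i: i.defames_individual,          "block + refer to legal via arbiter"),
    (lambda i: i.language_unfamiliar,         "hand off to native-speaker moderator"),
]

def review_hate_speech(item: Any) -> str:
    for predicate, action in HATE_RULES:
        if predicate(item):
            return action
    return "unsure -> arbiter"   # never guess
```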
13. Decision tree — misinformation
Misinformation = factually verifiable claim that turns out false. Contestable opinion ≠ misinformation, stays out of scope.
- Is the claim objectively verifiable (public statistic, historical fact, consensus science)?
- If not (opinion, prediction, interpretation) → out of scope.
- If yes → is it verified false by a reliable source (WHO, peer-reviewed journals, recognized fact-checker like AFP/Reuters/Africa Check)?
- If verified false → add "disputed content" label + link to reliable source. NEVER delete a misinformation post — labelling is enough.
- If not yet verified → mark unsure, escalate to arbiter who has partner fact-check tool access.
- Public health / imminent danger (e.g. false info that can kill) → urgent-priority signal — speed matters. (Tree sketched in code below.)
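Sketched the same way, with the label-not-delete rule and the urgency flag made explicit (attribute names hypothetical):

```python
def review_misinformation(claim) -> str:
    """Sketch of the section 13 flow; attribute names are hypothetical."""
    if not claim.objectively_verifiable:     # opinion, prediction, interpretation
        return "out of scope"
    if claim.verified_false:                 # WHO, peer review, AFP/Reuters/Africa Check
        action = "label 'disputed content' + link reliable source"  # never delete
        if claim.endangers_public_health:
            action += " + urgent-priority signal"                   # speed matters
        return action
    return "unsure -> arbiter (fact-check tool access)"
```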
14. Resilience — individual + collective
Moderating difficult content leaves traces, even with every precaution. OraData takes it seriously because a burnt-out moderator makes bad decisions and ends up harming the client as much as themselves.
Recommended personal practices
- Bounded hours: no night moderation, no more than 4h of continuous moderation per day.
- Closing ritual at end of session: close tabs, screen off for 10 min, walk outside.
- Do not share content you've seen with your circle — even to "decompress". It transfers trauma without relieving it.
- Keep a private journal of cases that stuck — not the details, your feelings. It feeds your psychologist check-ins.
Warning signs to watch in yourself
- Trouble sleeping, recurring nightmares.
- Involuntary flashbacks during the day.
- Irritability, emotional distance from loved ones.
- Avoidance of activities or places tied to what you saw.
- Increased alcohol or other substance use.
- Intrusive thoughts hard to dismiss.
Peer-to-peer support
- Moderator WhatsApp channel: peer-support community moderated by a partner psychologist. Request access via support after your first batch.
- Voluntary buddy: request a buddy moderator for mirrored sessions — you don't share decisions, just the presence of someone who understands.
- Monthly meetings (optional): anonymized experience sharing in video call, hosted by a psychologist.
- If you spot signs in a peer: tell them you're worried, and quietly notify support. You can save a teammate without betraying them.
15. Your rights as a moderator
- Right to refuse a batch: any time, no explanation, no penalty. No tier drop.
- Right to confidentiality: your identity is never shared with the client without your explicit written consent + legally documented request.
- Right to pause: put your moderator account on indefinite pause from Settings. Resume after a refresher.
- Right to free psychologist consultation: covered by OraData, unlimited in time, independent of any support ticket.
- Right to partial-batch pay if released: if opening a batch reveals content beyond your tolerance, release it — you are paid for the time actually spent before release.
- Right to unionize: OraData doesn't oppose any collective moderator structure. Legally and culturally, you are an independent worker with the protections your country affords.
Burn these in before your first batch
- Apply client policy, not yours.
- Mark unsure when hesitating — never guess.
- CSAM → hotline, never delete alone.
- Self-harm → resource banner + urgent.
- Double-blind required for irreversible decisions.
- Daily cap enforced — never bypass.
- NDA signed + strict confidentiality.
- Conflict of interest → signal, don't moderate.
- Justification on every decision.
- Persistent stress → talk to support. Not weakness.
OraData · public guide · revised 2026
Cover photo: Markus Winkler · Unsplash