Moderator handbook — policy, protocol, psychological safety
Content moderation is high-stakes work. OraData offers this service to AI companies and social networks. These are the non-negotiable rules every moderator must know before their first batch.
1. Policy — you apply the client's, not your own
OraData doesn't invent the policy. Each client (social network, AI company) ships their own rubric: what's allowed, what's banned, the gray zones. Your job is to apply it consistently, not to debate it.
- Read the batch policy BEFORE touching the first item.
- Unsure on a case → mark "unsure" and pass to an arbiter — never guess.
- Your personal opinions (political, religious, moral) stay out of the batch. If a batch directly clashes with your values, you can refuse — your right, no penalty.
2. The 5 standard categories
- Hate speech: targets a protected group (ethnicity, religion, orientation, disability). Satire vs harassment = judgment.
- Violence: real vs artistic vs news documentation. Default: keep public-interest documentation, flag gratuitous violence.
- NSFW: nudity. Historical art → keep. Explicit → per client policy. Minors → CSAM protocol (§4).
- Misinformation: objectively falsifiable. Verified false → label. Contestable opinion → out of scope.
- Self-harm / suicide: always priority — resource banner + urgent moderation, never ignore.
3. Double-blind decision protocol
Any irreversible decision (permanent delete, uploader ban) requires double-blind validation: two qualified moderators, neither sees the other's call. Disagreement → senior arbiter. Minimum inter-moderator agreement: Cohen's Kappa ≥ 0.75.
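To make the threshold concrete, here is a minimal sketch of how the agreement check could be computed, assuming each moderator's calls arrive as a simple list of labels (the label set and the batch below are hypothetical):

```python
from collections import Counter

def cohens_kappa(calls_a: list[str], calls_b: list[str]) -> float:
    """Chance-corrected inter-moderator agreement: kappa = (po - pe) / (1 - pe)."""
    assert len(calls_a) == len(calls_b) and calls_a
    n = len(calls_a)
    # Observed agreement: fraction of items where both moderators made the same call.
    po = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Expected chance agreement, from each moderator's marginal label frequencies.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    pe = sum(freq_a[lbl] * freq_b[lbl] for lbl in freq_a.keys() & freq_b.keys()) / n**2
    return 1.0 if pe == 1.0 else (po - pe) / (1 - pe)

# Hypothetical batch of 8 calls with a single disagreement (the last item).
a = ["keep"] * 4 + ["block"] * 3 + ["unsure"]
b = ["keep"] * 4 + ["block"] * 4
print(round(cohens_kappa(a, b), 2))  # 0.78 -> meets the 0.75 threshold
```

Raw percent agreement (7/8 here) always overstates reliability; kappa discounts the agreement two moderators would reach by labelling at random.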
4. CSAM — immediate escalation protocol
- Suspected content involving a minor: stop moderating the item immediately and escalate via the OraData hotline (country-specific reporting channels in §10).
- Never delete alone: deletion destroys evidence that investigators need. Escalation is the only valid action.
- Strict confidentiality applies (§6): don't describe the content to anyone outside the hotline chain.
5. Psychological safety — non-negotiable
- Daily exposure cap: enforced by the platform. Once hit, you cannot accept more moderation tasks today (see the sketch after this list).
- Mandatory break every N decisions/hour (dynamic per batch difficulty).
- Mandatory rotation after a graphic batch — 24h minimum before another sensitive batch.
- Persistent stress: Settings → Moderator wellbeing → free consultation with partner psychologist (video, your choice of language).
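As a rough illustration of how the platform-side gate might work (the limit values below are hypothetical; the real ones are set per batch):

```python
from dataclasses import dataclass

@dataclass
class ModeratorSession:
    decisions_this_hour: int = 0
    exposure_today_min: int = 0
    break_every: int = 30       # hypothetical: forced break after N decisions/hour
    daily_cap_min: int = 240    # hypothetical: daily exposure cap in minutes

    def can_accept_task(self) -> bool:
        # Daily cap: once hit, no more moderation tasks today. Not bypassable.
        return self.exposure_today_min < self.daily_cap_min

    def needs_break(self) -> bool:
        # Mandatory pause once the per-hour decision count is reached.
        return self.decisions_this_hour >= self.break_every
```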
6. Confidentiality
- NDA signed before your first batch. Content seen during moderation stays confidential — can't discuss outside a session, not even with loved ones.
- Client request to reveal moderator identity: denied unless legally documented.
- Conflict of interest (you recognize someone): notify support immediately, don't moderate.
7. Pay
Moderation pays per validated batch; the rate scales with volume and inter-moderator agreement. Extreme content carries a 1.5× multiplier. Quality bonus if your consistency score on a batch exceeds 95%. You can refuse any batch with no explanation or penalty; forced exposure does not exist on OraData.
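A worked example of the pay arithmetic, with the base rate and the size of the quality bonus as stated assumptions (the actual weighting is OraData's and isn't published here):

```python
def batch_pay(base_rate: float, items: int, agreement: float,
              extreme: bool, consistency: float) -> float:
    """Illustrative only: scales with volume and agreement, per section 7."""
    pay = base_rate * items * agreement      # volume x inter-moderator agreement
    if extreme:
        pay *= 1.5                           # extreme-content multiplier
    if consistency > 0.95:
        pay *= 1.10                          # hypothetical 10% quality bonus
    return round(pay, 2)

# 400 items at a hypothetical 0.08 base, 0.82 agreement, extreme batch, 97% consistency:
print(batch_pay(0.08, 400, 0.82, extreme=True, consistency=0.97))  # 43.3
```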
8. Chain of responsibility
- Client: defines policy + endorses final decisions.
- OraData: provides platform, trained moderators, traceability.
- Moderator: applies policy + flags gray zones.
- If a decision is legally challenged, all 3 signatures are on it. No single link is the sole bearer.
9. Continuous training
Mandatory refresher every 3 months: policy re-read + new training module + psychologist check-in. You can refuse a campaign whose policy you don't endorse — no penalty, no tier drop.
10. Legal frameworks you should know
Content you moderate may fall under different legal regimes depending on the uploader's and client's country. You are not a lawyer — but you must know when to escalate.
- European Union: DSA (Digital Services Act) + GDPR. Platforms must act expeditiously on illegality notices; the EU code of conduct on illegal hate speech targets review within 24h.
- United States: Section 230 protects the platform; reporting apparent CSAM to NCMEC is mandatory, and report data must be preserved for 90 days (18 USC § 2258A).
- France / Belgium / Switzerland: CSAM escalation via pharos.gouv.fr + Child Focus + OraData hotline contact.
- CEMAC: country-specific content-illegality legislation is still being formalized. In doubt, escalate to an OraData arbiter.
- Every country: distributing sexual images of minors is a criminal offense. No cultural exception.
11. Decision tree — graphic violence
- Is it public-interest documentation (armed conflict, disaster, public accident)? → Keep + add trigger warning + blur non-consented identities.
- Is it artistic representation (film, theater, fiction)? → Apply age-rating per client policy. Keep if rating is consistent.
- Is it gratuitous violence targeting an identifiable real person (filmed beating for humiliation, torture)? → Block immediately + flag uploader.
- Is it glorifying (celebrating a violent act, inciting imitation)? → Block + immediate uploader ban.
- Ambiguous? → Mark unsure, pass to arbiter. NEVER GUESS. (The full tree is sketched in code below.)
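The same tree, condensed into a sketch; the attribute names on `item` are illustrative placeholders, not real platform fields:

```python
def review_graphic_violence(item) -> str:
    """Sketch of the section 11 tree; attribute names are hypothetical."""
    if item.public_interest_documentation:   # armed conflict, disaster, public accident
        return "keep + trigger warning + blur non-consented identities"
    if item.artistic_fiction:                # film, theater, fiction
        return "keep" if item.age_rating_consistent else "re-rate per client policy"
    if item.targets_identifiable_person:     # filmed humiliation, torture
        return "block + flag uploader"
    if item.glorifies_violence:              # celebrates the act or incites imitation
        return "block + ban uploader"
    return "unsure -> arbiter"               # never guess
```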
12. Decision tree — hate speech
- Does it target a group protected by client policy (ethnicity, religion, orientation, disability, gender, nationality)?
- If no → out of hate scope, may fit another category.
- If yes → is it factual (statistic, neutral observation)? → Keep.
- Is it satirical with a target clearly framed by context (recognized comedian, editorial context)? → Keep + flag for client review if borderline.
- Does it directly incite violence or discrimination ("they should be", "death to")? → Block immediately + flag uploader.
- Is it defamation / targeted harassment of an identified individual? → Block + refer to legal support via arbiter.
- Language you don't fluently understand? → Don't guess, hand off to a native-speaker moderator. (Tree sketched in code below.)
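Another hedged way to encode this tree is an ordered rule table, first match wins, which makes the evaluation order explicit; the predicate names are again hypothetical:

```python
from typing import Any, Callable

# Section 12 tree as an ordered rule table: the first matching predicate wins.
HATE_RULES: list[tuple[Callable[[Any], bool], str]] = [
    (lambda i: not i.targets_protected_group, "out of hate scope, may fit another category"),
    (lambda i: i.is_factual,                  "keep"),
    (lambda i: i.is_contextual_satire,        "keep + flag for client review if borderline"),
    (lambda i: i.incites_violence,            "block + flag uploader"),
    (lambda i: i.defames_individual,          "block + refer to legal via arbiter"),
    (lambda i: i.language_unfamiliar,         "hand off to native-speaker moderator"),
]

def review_hate_speech(item: Any) -> str:
    for predicate, action in HATE_RULES:
        if predicate(item):
            return action
    return "unsure -> arbiter"   # never guess
```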
13. Decision tree — misinformation
Misinformation = factually verifiable claim that turns out false. Contestable opinion ≠ misinformation, stays out of scope.
- Is the claim objectively verifiable (public statistic, historical fact, consensus science)?
- If not (opinion, prediction, interpretation) → out of scope.
- If yes → is it verified false by a reliable source (WHO, peer-reviewed journals, recognized fact-checker like AFP/Reuters/Africa Check)?
- If verified false → add "disputed content" label + link to reliable source. NEVER delete a misinformation post — labelling is enough.
- If not yet verified → mark unsure, escalate to arbiter who has partner fact-check tool access.
- Public health / imminent danger (e.g. false info that can kill) → urgent-priority signal — speed matters. (Tree sketched in code below.)
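Sketched the same way, with the label-not-delete rule and the urgency flag made explicit (attribute names hypothetical):

```python
def review_misinformation(claim) -> str:
    """Sketch of the section 13 flow; attribute names are hypothetical."""
    if not claim.objectively_verifiable:     # opinion, prediction, interpretation
        return "out of scope"
    if claim.verified_false:                 # WHO, peer review, AFP/Reuters/Africa Check
        action = "label 'disputed content' + link reliable source"  # never delete
        if claim.endangers_public_health:
            action += " + urgent-priority signal"                   # speed matters
        return action
    return "unsure -> arbiter (fact-check tool access)"
```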
14. Resilience — individual + collective
Moderating difficult content leaves traces, even with every precaution. OraData takes it seriously because a burnt-out moderator makes bad decisions and ends up harming the client as much as themselves.
Recommended personal practices
- Bounded hours: no night moderation, no more than 4h of continuous moderation per day.
- Closing ritual at end of session: close tabs, screen off for 10 min, walk outside.
- Do not share content you've seen with your circle — even to "decompress". It transfers trauma without relieving it.
- Keep a private journal of cases that stuck — not the details, your feelings. It feeds your psychologist check-ins.
Warning signs to watch in yourself
- Trouble sleeping, recurring nightmares.
- Involuntary flashbacks during the day.
- Irritability, emotional distance from loved ones.
- Avoidance of activities or places tied to what you saw.
- Increased alcohol or other substance use.
- Intrusive thoughts hard to dismiss.
Peer-to-peer support
- Moderator WhatsApp channel: peer-support community moderated by a partner psychologist. Request access via support after your first batch.
- Voluntary buddy: request a buddy moderator for mirrored sessions — you don't share decisions, just the presence of someone who understands.
- Monthly meetings (optional): anonymized experience sharing in video call, hosted by a psychologist.
- If you spot signs in a peer: tell them you're worried, and quietly notify support. You can save a teammate without betraying them.
15. Your rights as a moderator
- Right to refuse a batch: any time, no explanation, no penalty. No tier drop.
- Right to confidentiality: your identity is never shared with the client without your explicit written consent + legally documented request.
- Right to pause: put your moderator account on indefinite pause from Settings. Resume after a refresher.
- Right to free psychologist consultation: covered by OraData, unlimited in time, independent of any support ticket.
- Right to partial-batch pay if released: if opening a batch reveals content beyond your tolerance, release it — you are paid for the time actually spent before release.
- Right to unionize: OraData doesn't oppose any collective moderator structure. Legally and culturally, you are an independent worker with the protections your country affords.
Burn these in before your first batch
- Apply client policy, not yours.
- Mark unsure when hesitating — never guess.
- CSAM → hotline, never delete alone.
- Self-harm → resource banner + urgent.
- Double-blind required for irreversible decisions.
- Daily cap enforced — never bypass.
- NDA signed + strict confidentiality.
- Conflict of interest → signal, don't moderate.
- Justification on every decision.
- Persistent stress → talk to support. Not weakness.
OraData · public guide · revised 2026
Cover photo: Markus Winkler · Unsplash