Fact-Checking Prompt
AI Safety & Governance
The Prompt
The Logic
1. Claim Extraction Prevents “Vibes Checking” and Forces Measurable Verification
WHY IT WORKS: Many so-called fact checks evaluate an article’s overall tone or political alignment rather than verifying specific claims. Breaking content into discrete, checkable statements converts an amorphous task (“is this true?”) into a structured verification workflow. This also prevents the “truthy narrative” problem: a story can feel plausible while containing several wrong specifics. Claim extraction ensures each number, date, entity, and causal statement is tested. It also improves transparency: readers can see exactly what was checked and what wasn’t. In professional workflows, claim lists are the backbone of verification because they create auditability and reproducibility.
EXAMPLE: A post says “Country X cut emissions 40% since 2010, saving $2B in healthcare costs, according to a UN report.” That’s at least three claims: (1) emissions change magnitude and baseline, (2) healthcare savings amount, (3) source attribution (“UN report”). A claim-based fact check will verify each separately, often finding that one is accurate while another is overstated. When teams adopt claim extraction, they reduce error rates in published summaries because they stop repeating unverified numbers. It also helps prioritize: a wrong “$2B” claim might matter more than a minor date error.
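The emissions example above can be sketched as a data structure. This is a minimal, hypothetical illustration of the target shape of claim extraction; a real extractor would parse arbitrary text, and the `Claim` fields shown here are assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """One discrete, checkable statement pulled from the content."""
    text: str      # the claim as stated
    kind: str      # e.g. "statistic", "attribution", "causal"
    entities: list = field(default_factory=list)  # names, orgs, sources cited

def extract_claims(post: str) -> list:
    # A real extractor would parse `post`; here the emissions example
    # is split by hand to show the structure a claim list should have.
    return [
        Claim("Country X cut emissions 40% since 2010", "statistic", ["Country X"]),
        Claim("The cut saved $2B in healthcare costs", "statistic", ["Country X"]),
        Claim("The figures come from a UN report", "attribution", ["UN"]),
    ]

claims = extract_claims("Country X cut emissions 40% ... per a UN report.")
# Each record is now verified separately, so one overstated number
# cannot ride along with the accurate ones.
```

The point of the structure is auditability: the published fact check can show exactly which of the three records was tested and with what result.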
2. Evidence Hierarchies Reduce Misinformation Amplification by Favoring Primary Sources
WHY IT WORKS: The internet is full of circular reporting: one article cites another, which cites a tweet. An evidence hierarchy prioritizes primary sources (official data, filings, peer-reviewed papers, direct transcripts) over secondary commentary. This reduces the risk of amplifying unverified claims. Reliability scoring forces explicit judgment about source quality: primary vs. secondary status, conflicts of interest, recency, and methodology transparency. This makes the fact check defensible and reduces “appeal to authority” mistakes, where a reputable outlet is treated as proof without confirming its underlying sources.
EXAMPLE: Financial claim: “Company Y revenue grew 50% last quarter.” The correct primary source is the company’s earnings report (10-Q), not a blog summary. By forcing a primary-source-first plan, you avoid errors where a blog confuses revenue with bookings. In health claims, peer-reviewed studies and government datasets (CDC, WHO) outrank influencer posts. When the evidence hierarchy is explicit, you can also mark uncertainty: “Only secondary sources exist; claim remains unproven.” This prevents overconfident conclusions.
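A primary-source-first plan can be expressed as a simple scoring table. The tier names and numeric weights below are illustrative assumptions; adapt the ranking to your organization's rubric.

```python
# Hypothetical reliability tiers, highest weight = most reliable.
EVIDENCE_TIERS = {
    "primary_document": 5,     # filings (e.g. a 10-Q), official transcripts
    "peer_reviewed": 4,
    "official_dataset": 4,     # e.g. CDC or WHO tables
    "reputable_journalism": 3,
    "commentary": 2,
    "social_media": 1,
}

def best_evidence(sources):
    """Pick the highest-tier source; flag when only secondary evidence exists."""
    ranked = sorted(sources, key=lambda s: EVIDENCE_TIERS[s["type"]], reverse=True)
    top = ranked[0]
    if EVIDENCE_TIERS[top["type"]] >= 4:
        top["note"] = ""
    else:
        top["note"] = "Only secondary sources exist; claim remains unproven."
    return top

sources = [
    {"type": "commentary", "title": "Blog summary of earnings"},
    {"type": "primary_document", "title": "Company Y 10-Q filing"},
]
top = best_evidence(sources)  # prefers the 10-Q over the blog summary
```

Because the uncertainty note is attached mechanically whenever no tier-4+ source is found, the “only secondary sources exist” caveat cannot be silently dropped from the report.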
3. Verdict Labels Separate Unknown From False, Preventing Overconfident Claims
WHY IT WORKS: Many fact checks wrongly equate “we couldn’t verify it” with “it’s false,” or they assume “sounds false” without evidence. A disciplined system uses verdict labels: True, False, Misleading (partly true but missing context), Unproven (insufficient evidence), and Unclear (ambiguous claim). This reduces unjustified certainty. It also helps users: knowing something is unproven suggests caution, not dismissal. In governance contexts, this precision is essential because decisions can depend on evidence quality.
EXAMPLE: A claim: “A new law bans encryption.” If evidence is unclear because the bill is proposed, not passed, the right label is Misleading or Unproven depending on context. Another claim: “Study proves coffee causes cancer.” The evidence might show correlation in one cohort but not causation; verdict: Misleading. By using clear labels, you avoid sensationalism and maintain credibility. This also supports automation: systems can route “Unproven” items to further research and treat “False” items differently (e.g., correction required).
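The five verdict labels, and the routing behavior mentioned above, can be sketched as an enum plus a dispatch function. The route names are hypothetical placeholders for whatever your pipeline actually does downstream.

```python
from enum import Enum

class Verdict(Enum):
    TRUE = "True"
    FALSE = "False"
    MISLEADING = "Misleading"   # partly true but missing context
    UNPROVEN = "Unproven"       # insufficient evidence; NOT the same as False
    UNCLEAR = "Unclear"         # the claim itself is ambiguous

def route(verdict: Verdict) -> str:
    """Illustrative routing: Unproven goes to research, False to correction."""
    if verdict is Verdict.UNPROVEN:
        return "queue_for_further_research"
    if verdict is Verdict.FALSE:
        return "issue_correction"
    return "publish_with_label"
```

Keeping `UNPROVEN` as its own branch is the whole point: it makes “we couldn't verify it” structurally impossible to conflate with “it's false.”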
4. Inconsistency Detection Catches Framing Tricks That Avoid Direct Lies
WHY IT WORKS: Much misinformation is not outright falsehood but misleading framing: cherry-picked time windows, base rate neglect, omitted denominators, correlation-causation confusion, and quote mining. An inconsistency module looks for internal contradictions (numbers that don’t add up), missing context (per-capita vs total), and rhetorical sleights (anonymous sourcing presented as fact). This is crucial because you can “fact check” individual sentences as true while the overall implication is misleading. Inconsistency detection reveals the gap between literal truth and honest interpretation.
EXAMPLE: “Crime increased 20%” may be true for one month vs previous month, but misleading if year-over-year crime decreased. Or “vaccine adverse events doubled” may be true due to reporting changes, not actual harm. In financial reporting, “profits fell 30%” might omit that profits were unusually high last year (base effect). Inconsistency detection forces the report to include denominators, time frames, and alternative baselines. This reduces manipulation and improves reader understanding.
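The crime example can be made concrete: the same figure is checked against an alternative baseline so that a literally true month-over-month claim is caught when the year-over-year picture reverses it. The incident counts below are invented for illustration.

```python
def pct_change(old, new):
    """Percent change between two values, rounded to one decimal."""
    return round((new - old) / old * 100, 1)

# Hypothetical monthly incident counts for the "crime increased 20%" example.
crime = {"2023-05": 1000, "2024-04": 500, "2024-05": 600}

month_over_month = pct_change(crime["2024-04"], crime["2024-05"])  # +20.0
year_over_year = pct_change(crime["2023-05"], crime["2024-05"])    # -40.0

# The literal claim checks out (+20% vs last month), but the alternative
# baseline reverses it (-40% vs last year), so both windows must appear
# in the report rather than the cherry-picked one.
flag_missing_context = (month_over_month > 0) != (year_over_year > 0)
```

Forcing the check to compute more than one window is a mechanical version of “include denominators, time frames, and alternative baselines.”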
5. Corrected Summaries Prevent the “Debunk Without Replacement” Problem
WHY IT WORKS: Simply labeling a claim false leaves a vacuum; readers often retain the original misinformation due to repetition and lack of an alternative coherent narrative. A corrected summary provides a replacement story using verified facts only, with uncertainty notes. This supports better memory encoding: people remember the corrected explanation rather than the debunk label. It also enables downstream reuse: journalists, analysts, and teams can publish the corrected version as an accurate update.
EXAMPLE: If a claim “Policy X caused unemployment to rise” is unproven, the corrected summary might say: “Unemployment rose from A to B between dates; economists cite multiple contributing factors (inflation, interest rates); no direct causal evidence links the change to Policy X.” This avoids leaving readers with “everything is wrong” confusion. In corporate settings, corrected summaries reduce rework because teams can copy verified text into reports without repeatedly re-checking the same claims.
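Assembling the corrected summary from verified claims only can also be mechanized, so unproven material can appear only under an explicit uncertainty note. A minimal sketch, assuming claims carry the verdict labels from the prompt:

```python
def corrected_summary(claims):
    """Build a replacement narrative from verified facts only,
    appending an uncertainty note for anything unproven."""
    verified = [c["text"] for c in claims if c["verdict"] == "True"]
    open_items = [c["text"] for c in claims if c["verdict"] == "Unproven"]
    summary = " ".join(verified)
    if open_items:
        summary += " Unresolved: " + "; ".join(open_items)
    return summary

claims = [
    {"text": "Unemployment rose from A to B between the dates cited.",
     "verdict": "True"},
    {"text": "Policy X caused the rise.", "verdict": "Unproven"},
]
summary = corrected_summary(claims)
# The causal claim survives only as an explicitly unresolved item.
```

Teams can then copy `summary` into reports knowing every declarative sentence in it passed verification.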
6. Open Questions Lists Create Responsible Uncertainty and Next-Step Research
WHY IT WORKS: Some claims are not currently verifiable (ongoing investigations, unpublished data, conflicting sources). Listing open questions clarifies what is unknown and what evidence would resolve it. This prevents speculation and guides further research. It also improves transparency: stakeholders can see limitations rather than assuming completeness. In high-stakes environments, this protects credibility because you don’t overpromise certainty.
EXAMPLE: In a breaking news event, casualty numbers vary. The correct approach: verify what is confirmed, list discrepancies, and specify what would resolve uncertainty (official updates, hospital reports). In product claims, list what needs lab testing or third-party certification. Teams that explicitly track open questions avoid reintroducing rumors later and create a clear research backlog. This also helps governance: unresolved high-impact claims can trigger “do not publish” or “label as unverified” policies until resolved.
Example Output Preview
Sample: Fact-Checking a Viral Post About a “New Tax Law”
Claim List (Excerpt): (1) “A new federal law imposes a 5% tax on all digital transactions starting July 1.” (2) “Congress passed it unanimously.” (3) “It applies to Venmo, PayPal, and credit cards.” (4) “It will raise $200B per year.”
Verification Results (Excerpt): Claim (1): UNPROVEN—no bill number or official text provided; no primary source identified. Claim (2): FALSE—unanimous passage not supported by congressional records. Claim (3): MISLEADING—transaction reporting rules exist, but not a flat “5% tax.” Claim (4): UNPROVEN—no budget analysis provided; figure inconsistent with baseline digital payment volume.
Corrected Summary: No verified evidence supports a new 5% federal tax on all digital transactions starting July 1. Existing regulations may require reporting for certain transaction types, but that is different from a tax. Readers should verify using official congressional records and bill text before sharing.
Open Questions: If the claim references a specific bill, what is the bill number? What official budget scoring exists? Which agencies are involved?
Prompt Chain Strategy
Step 1: Structured Fact-Check Report
Prompt: Use the main Fact-Checking Prompt with the content to verify.
Expected Output: A claim-based verification report with verdicts, evidence quotes, corrected summary, and open questions.
Step 2: Expand Evidence Collection for Priority Claims
Prompt: “For the top 5 priority claims, propose a deeper evidence plan: primary sources to retrieve, search queries, who to contact, and how to validate. Then produce a revised verification table with stronger evidence.”
Expected Output: A deeper research plan and improved confidence levels for high-impact claims.
Step 3: Build a Publishable “Correction + Context” Article
Prompt: “Turn the corrected summary into a publishable article: explain what’s true, what’s false, why confusion happened, and how readers can verify. Include a checklist for spotting misinformation.”
Expected Output: A reader-friendly correction piece that can be shared publicly to counter misinformation.
Human-in-the-Loop Refinements
Maintain a “Source Reliability” Rubric for Your Organization
Create a rubric: primary documents (highest), peer-reviewed research, official data, reputable journalism, commentary, social media (lowest). Technique: require at least one primary source for high-stakes claims.
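The “at least one primary source for high-stakes claims” rule is easy to enforce as a publishing gate. A minimal sketch, using the rubric tiers above as string labels (the function name and signature are assumptions):

```python
# Rubric order, best to worst, mirroring the tiers listed above.
RUBRIC = ["primary_document", "peer_reviewed", "official_data",
          "reputable_journalism", "commentary", "social_media"]

def publishable(claim_sources, high_stakes):
    """Gate: high-stakes claims require at least one primary source."""
    if high_stakes:
        return "primary_document" in claim_sources
    return bool(claim_sources)
```

A claim backed only by reputable journalism clears the gate for routine content but is blocked for high-stakes content until a primary document is retrieved.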
Build a Claims Database to Avoid Re-Fact-Checking the Same Meme
Store verified claims and citations so teams can respond quickly when the same misinformation returns. Technique: tag by topic and verdict for fast retrieval.
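A claims database does not need to be elaborate to pay off. A minimal in-memory sketch with SQLite; the schema and field names are illustrative, not a recommended design:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE claims (
    text TEXT, verdict TEXT, topic TEXT, citation TEXT, checked_on TEXT)""")

def record(text, verdict, topic, citation, checked_on):
    """Store a verified claim tagged by topic and verdict."""
    db.execute("INSERT INTO claims VALUES (?, ?, ?, ?, ?)",
               (text, verdict, topic, citation, checked_on))

def lookup(topic):
    """Fast retrieval when the same misinformation resurfaces."""
    return db.execute(
        "SELECT text, verdict, citation FROM claims WHERE topic = ?",
        (topic,)).fetchall()

record("5% tax on all digital transactions", "Unproven", "tax-law",
       "No bill text identified", "2024-07-01")
rows = lookup("tax-law")  # prior verdict and citation, ready to reuse
```

When the meme returns, the team answers from `lookup` in seconds instead of re-running the full verification.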
Use “Prebunking” for Predictable Misinfo Cycles
For recurring topics, publish verification guides ahead of time. Technique: anticipate claims and prepare primary sources and explainer templates.
Require Reviewer Sign-Off for High-Risk Topics
Health, finance, and legal content should require a subject matter reviewer. Technique: define SLA (e.g., 24h) and escalation if reviewer unavailable.
Track Corrections and Measure “Time to Correction”
Measure how quickly misinformation is corrected after detection. Technique: set a target (e.g., <48h) and run retrospectives on misses.
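The time-to-correction metric is a one-line calculation once detection and correction timestamps are logged. A minimal sketch with hypothetical timestamps:

```python
from datetime import datetime

def hours_to_correction(detected, corrected):
    """Hours from detection to published correction, for target tracking."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(corrected, fmt) - datetime.strptime(detected, fmt)
    return delta.total_seconds() / 3600

t = hours_to_correction("2024-07-01 09:00", "2024-07-02 15:00")
within_target = t < 48  # compare against the <48h target
```

Misses (`within_target == False`) are what feed the retrospectives.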
Teach Writers to Cite as They Write
Most verification pain comes from missing citations. Technique: require citations per paragraph during drafting, not after.