Methodology

How these launch pages are audited.

Honest Launches does not try to prove universal truth. It audits how well launch language is supported by a dated evidence packet and rewrites claims that overrun the available evidence.

What the archive is

Each published page is an audited, modified version of an original launch document. The goal is to preserve the structure and rhetorical flow of the source while tightening claims that are unsupported, stretched, or missing critical context.

  • Primary surface: Audited Launch Page
  • Secondary surface: Audit Ledger
  • Publish flow: Manual

Evidence packet

The intended source-of-truth hierarchy is:

  • Launch post, model card, system card, eval appendix, and pricing or availability docs when relevant.
  • Benchmark-native references and public datasets when a launch makes benchmark or ranking claims.
  • Secondary structured sources like leaderboards or aggregators as challenge sources, not sole authority.

Current archive standard: Published tooltip citations are anchored to explicit reference URLs whenever possible. This keeps the receipts legible and avoids relying on vague web retrieval alone.
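
One way to picture the hierarchy above is as an ordered tier attached to each reference. The sketch below is illustrative only: the names SourceTier, Reference, rank_references, and can_stand_alone are hypothetical and are not taken from the archive's actual tooling. It assumes that a claim resting only on leaderboards or aggregators should be flagged, which is one reading of "challenge sources, not sole authority".

```python
from dataclasses import dataclass
from enum import IntEnum


class SourceTier(IntEnum):
    """Hypothetical tiers; a lower value means higher authority."""
    PRIMARY = 1           # launch post, model card, system card, eval appendix, pricing/availability docs
    BENCHMARK_NATIVE = 2  # benchmark-native references and public datasets
    SECONDARY = 3         # leaderboards and aggregators (challenge sources)


@dataclass(frozen=True)
class Reference:
    url: str
    tier: SourceTier


def rank_references(refs):
    """Return references ordered from most to least authoritative (stable sort)."""
    return sorted(refs, key=lambda r: r.tier)


def can_stand_alone(refs):
    """Assumption: at least one non-secondary reference is needed for a claim to stand."""
    return any(r.tier < SourceTier.SECONDARY for r in refs)
```

For example, a citation list containing only a leaderboard URL would fail can_stand_alone, while adding the model card would pass it.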

Verdicts

Each audited claim receives one mutually exclusive verdict.

Supported

The available evidence supports the claim as written or with only minor caveats.

Overstated

The evidence points in the same direction, but the wording is stronger, broader, or more certain than supported.

Missing context

The claim may be directionally true, but key scope, baseline, methodology, denominator, or source context is absent.

Contradicted

The available evidence conflicts with the claim as written.

Not checkable

The available evidence does not provide enough support to verify or falsify the claim.
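
Because the verdicts are mutually exclusive, they map naturally onto an enumeration with a single verdict field per claim. This is a minimal sketch, not the archive's real schema: Verdict, AuditedClaim, and parse_verdict are hypothetical names, and only the five labels come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    SUPPORTED = "Supported"
    OVERSTATED = "Overstated"
    MISSING_CONTEXT = "Missing context"
    CONTRADICTED = "Contradicted"
    NOT_CHECKABLE = "Not checkable"


@dataclass(frozen=True)
class AuditedClaim:
    original: str
    verdict: Verdict               # one field, so a claim carries exactly one verdict
    rewrite: Optional[str] = None  # replacement wording, if any


def parse_verdict(label: str) -> Verdict:
    """Map a free-text label to a Verdict; raises ValueError for unknown labels."""
    normalized = " ".join(label.split()).capitalize()
    return Verdict(normalized)
```

Rejecting unknown labels, rather than defaulting to a verdict, keeps a mistyped label from silently turning into a judgment.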

How a page is built

  • The document is ingested and converted into a source snapshot that preserves the main content structure.
  • Risky factual claims are extracted from the source.
  • Specialist audit passes look for support, caveats, contradictory evidence, and numeric inconsistencies.
  • Audited claims are mapped back onto the preserved page structure and rewritten inline.
  • Hovering over or tapping an underlined replacement opens the original wording, verdict, and explicit-reference citations.
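
The last two steps (mapping audited claims back onto the preserved structure and exposing the original wording in a tooltip) can be sketched roughly as follows. This is an assumption-laden illustration: it treats the source snapshot as a list of text blocks and anchors each rewrite by exact substring match, neither of which is confirmed by the description above. Rewrite and apply_rewrites are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Rewrite:
    block_index: int   # which preserved block the claim lives in
    original: str      # original wording, used as the anchor
    replacement: str   # tightened wording shown inline
    verdict: str
    citations: list = field(default_factory=list)  # explicit reference URLs


def apply_rewrites(blocks, rewrites):
    """Return (new_blocks, tooltips) without mutating the input blocks."""
    out = list(blocks)
    tooltips = []
    for rw in rewrites:
        text = out[rw.block_index]
        if rw.original not in text:
            continue  # unanchored rewrite: skip rather than guess a location
        out[rw.block_index] = text.replace(rw.original, rw.replacement, 1)
        tooltips.append({
            "block_index": rw.block_index,
            "original": rw.original,
            "verdict": rw.verdict,
            "citations": list(rw.citations),
        })
    return out, tooltips
```

Skipping rewrites that cannot be anchored keeps the block count and order identical to the snapshot, so the preserved structure is never altered by a failed match.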

What the archive is not

  • It is not a universal benchmark of which lab or model is best overall.
  • It is not a pure fact-checking authority that proves all claims true or false.
  • It is not yet a legal or compliance certification layer.
  • It is not an open live-generation product for arbitrary URLs in the current version.

Current limits

  • Best results come from article-like launch pages and public PDFs.
  • Some pages are rendered from local proxies or PDFs when the original page is fetch-blocked.
  • The current launch-page renderer is high-fidelity, but not a full browser-grade clone of arbitrary sites.
  • Publication decisions are still manual. The archive is generated, inspected, and then published intentionally.

Evaluation and gate

The archive is also judged against an internal evaluation rubric. Each featured page should pass a publication gate, expose an evidence packet, and contribute to a reviewed gold set used to score verdict and rewrite quality over time.

  • Publication gate: original-source link, modified-version notice, evidence packet, anchored rewrites, cited rewrites, preserved source blocks.
  • Gold set: reviewed claim-level checks for verdict agreement and rewrite adequacy.
  • Archive quality score: aggregate view of legibility, trust, judgment quality, fidelity, and archive coherence.
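
As a rough illustration, the publication gate can be read as a checklist that must pass in full, and verdict agreement on the gold set as a simple match rate. The sketch below assumes both readings. The check names mirror the list above, but the page dictionary layout and the function names are hypothetical, and the real rubric may weight or score these differently.

```python
GATE_CHECKS = (
    "original_source_link",
    "modified_version_notice",
    "evidence_packet",
    "anchored_rewrites",
    "cited_rewrites",
    "preserved_source_blocks",
)


def failed_gate_checks(page):
    """List the gate checks a page does not satisfy (a missing key counts as a failure)."""
    return [name for name in GATE_CHECKS if not page.get(name, False)]


def passes_publication_gate(page):
    return not failed_gate_checks(page)


def verdict_agreement(predicted, gold):
    """Fraction of gold-set claims whose predicted verdict matches the reviewed one."""
    shared = [claim_id for claim_id in gold if claim_id in predicted]
    if not shared:
        raise ValueError("no overlapping claims to score")
    matches = sum(predicted[c] == gold[c] for c in shared)
    return matches / len(shared)
```

Returning the list of failed checks, rather than a bare boolean, gives a reviewer something concrete to act on before the manual publish step.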