Public evidence

Benchmark methodology

Each public benchmark has a named measurement, source, cadence, target, and known limitation so the number is useful without overstating certainty.

10 methods

Weekly cadence

Source-linked

Import fidelity

YouTube transcript success rate

Measures: The share of sampled public YouTube videos that produce title, timestamp, and transcript text without manual repair.

Data source: Scheduled sampled importer runs against the public YouTube corpus used by the eval harness.

Cadence: Weekly, with failing samples retained for regression review.

Target: 99.0% successful imports.

Known limit: Private, region-blocked, deleted, or captionless videos can fail for reasons outside XESO's control.
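As a rough illustration of how a success rate like this could be computed from one sampled run, here is a minimal Python sketch. The `ImportResult` record, its field names, and the `summarize` helper are hypothetical and are not taken from XESO's eval harness; only the success criteria (title, timestamp, and transcript text, no manual repair) and the 99.0% target come from the description above.

```python
from dataclasses import dataclass

TARGET = 0.99  # 99.0% successful imports, per the stated target


@dataclass
class ImportResult:
    """Hypothetical record for one sampled video import."""
    video_id: str
    has_title: bool
    has_timestamps: bool
    has_transcript: bool
    manually_repaired: bool = False

    def succeeded(self) -> bool:
        # Success requires all three fields and no manual repair.
        return (
            self.has_title
            and self.has_timestamps
            and self.has_transcript
            and not self.manually_repaired
        )


def summarize(results: list[ImportResult]) -> tuple[float, list[ImportResult]]:
    """Return the success rate and the failing samples kept for review."""
    if not results:
        raise ValueError("cannot compute a rate from zero samples")
    failures = [r for r in results if not r.succeeded()]
    rate = (len(results) - len(failures)) / len(results)
    return rate, failures


def meets_target(rate: float, target: float = TARGET) -> bool:
    return rate >= target
```

Returning the failing samples alongside the rate mirrors the stated practice of retaining failures for regression review.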

Import fidelity

Notion-via-Obsidian import fidelity

Measures: Content preserved through the legacy Notion export path that first converts through an Obsidian-compatible representation.

Data source: Synthetic round-trip runs over the golden Notion fixture set.

Cadence

Answer quality

Segment summary human-rating

Measures: Blind 5-point human rating of generated segment summaries for correctness, completeness, and style.

Data source: Sampled summary outputs from the eval corpus, reviewed against source excerpts.

Cadence
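One plausible way to aggregate blind 5-point ratings is a per-criterion mean, sketched below in Python. The page does not say how scores are combined, so the averaging scheme and the `mean_ratings` helper are assumptions; only the three criteria and the 1 to 5 scale come from the description above.

```python
from statistics import mean

# Criteria named in the benchmark description.
CRITERIA = ("correctness", "completeness", "style")


def mean_ratings(ratings: list[dict[str, int]]) -> dict[str, float]:
    """Average 1-5 ratings per criterion (assumed aggregation scheme)."""
    if not ratings:
        raise ValueError("no ratings to aggregate")
    for rating in ratings:
        for criterion in CRITERIA:
            score = rating[criterion]
            if not 1 <= score <= 5:
                raise ValueError(f"{criterion} score {score} is outside 1-5")
    return {c: mean(r[c] for r in ratings) for c in CRITERIA}
```

Reporting each criterion separately keeps a weak style score from being hidden by strong correctness scores, which a single blended number could do.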

Answer quality

Contradiction-detector precision

Measures: Precision on labelled contradiction pairs where XESO should flag conflicting claims without over-warning.

Data source: Versioned contradiction gold set run by the eval harness.

Cadence
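Precision here is the standard ratio of true positives to all flagged pairs, so a detector that over-warns scores lower. The Python sketch below shows that calculation on a labelled set; the `precision` function and its boolean inputs are illustrative and do not reflect the eval harness's actual interface.

```python
from typing import Optional


def precision(flagged: list[bool], gold: list[bool]) -> Optional[float]:
    """Precision = true positives / (true positives + false positives).

    flagged[i] is True when the detector flagged pair i as contradictory;
    gold[i] is True when the gold label says pair i really is contradictory.
    Returns None when nothing was flagged, since precision is undefined.
    """
    if len(flagged) != len(gold):
        raise ValueError("flagged and gold must be the same length")
    true_pos = sum(1 for f, g in zip(flagged, gold) if f and g)
    false_pos = sum(1 for f, g in zip(flagged, gold) if f and not g)
    if true_pos + false_pos == 0:
        return None
    return true_pos / (true_pos + false_pos)
```

Missed contradictions (false negatives) do not affect this number; they would show up in recall, which this benchmark does not claim to measure.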

Latency

Chat latency (p95, first token)

Measures: Server-observed p95 time to first streamed token for cited chat answers.

Data source: Privacy-safe production route timing aggregated over the trailing seven days.

Cadence
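For readers unfamiliar with percentile metrics, the Python sketch below computes a p95 using the nearest-rank method over a list of first-token timings. The page does not state which percentile definition is used (interpolated variants give slightly different values), so nearest-rank is an assumption, and the function name is illustrative.

```python
import math


def percentile_nearest_rank(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct%
    of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Example: first-token times in milliseconds (made-up values).
timings_ms = [120.0, 80.0, 200.0, 100.0]
p95_ms = percentile_nearest_rank(timings_ms, 95)
```

With only four samples the p95 is simply the slowest request, which is why a trailing seven-day aggregate gives a more stable figure than a single short window.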

Performance

Lighthouse LCP (p75, public pages)

Measures: p75 Largest Contentful Paint across core public pages, including home, pricing, and benchmarks.

Data source: Scheduled Lighthouse CI runs, reconciled with field Web Vitals when available.

Cadence

Import fidelity

Notion dedicated-importer fidelity

Measures: Round-trip fidelity for the dedicated Notion importer against blocks, hierarchy, links, and source metadata.

Data source: Golden corpus fixture imports with deterministic content comparison.

Cadence
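A deterministic content comparison can be as simple as checking each fixture block against its imported counterpart and reporting the preserved share. The Python sketch below assumes blocks are keyed by a stable ID and compared as whitespace-normalized text; both the data shape and the `fidelity` helper are hypothetical, since the page does not describe the comparison in detail.

```python
def normalize(text: str) -> str:
    """Collapse whitespace runs so layout-only differences do not count."""
    return " ".join(text.split())


def fidelity(expected: dict[str, str], imported: dict[str, str]) -> float:
    """Share of golden-fixture blocks whose content survived the import.

    Keys are hypothetical stable block IDs; values are block text.
    A block missing from the import counts as not preserved.
    """
    if not expected:
        raise ValueError("golden fixture is empty")
    preserved = sum(
        1
        for block_id, text in expected.items()
        if block_id in imported
        and normalize(imported[block_id]) == normalize(text)
    )
    return preserved / len(expected)
```

Because the comparison has no randomness or model in the loop, the same fixture and importer version always yield the same score, which makes regressions easy to attribute to a specific release.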

Import fidelity

Evernote ENEX importer fidelity

Measures: Preservation of Evernote ENEX notes, attachments, timestamps, and notebook structure.

Data source: Golden ENEX fixture round trips through the importer harness.

Cadence

Import fidelity

Roam EDN importer fidelity

Measures: Preservation of Roam EDN block hierarchy, backlinks, daily notes, and text content.

Data source: Golden Roam fixture imports with structural comparison.

Cadence: Weekly and before importer releases.

Import fidelity

Obsidian importer fidelity

Measures: Preservation of Markdown content, folders, frontmatter, wikilinks, and attachments.

Data source: Golden Obsidian vault fixtures and source-matrix round trips.

Cadence