XESO
Public evidence

Benchmark methodology

Each public benchmark has a named measurement, source, cadence, target, and known limitation so the number is useful without overstating certainty.
10 methods
Weekly cadence
Source-linked
Import fidelity

YouTube transcript success rate

Measures: The share of sampled public YouTube videos that produce title, timestamp, and transcript text without manual repair.
Data source: Scheduled sampled importer runs against the public YouTube corpus used by the eval harness.
Cadence: Weekly, with failing samples retained for regression review.
Target: 99.0% successful imports.
Known limit: Private, region-blocked, deleted, or captionless videos can fail outside XESO's control.
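
As a rough illustration of how a success rate like this could be computed from one sampled run, here is a minimal sketch. The `ImportResult` record, its field names, and the `success_rate` helper are hypothetical and are not taken from XESO's eval harness; the sketch only assumes that a sample counts as successful when title, timestamps, and transcript text are all present and no manual repair was needed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ImportResult:
    """Hypothetical record for one sampled video import."""
    video_id: str
    title: Optional[str]
    timestamps: List[float]
    transcript: Optional[str]
    manually_repaired: bool = False


def is_success(result: ImportResult) -> bool:
    # A sample succeeds only if all three fields are present
    # and nobody had to repair the import by hand.
    has_title = bool(result.title and result.title.strip())
    has_timestamps = len(result.timestamps) > 0
    has_transcript = bool(result.transcript and result.transcript.strip())
    return has_title and has_timestamps and has_transcript and not result.manually_repaired


def success_rate(results: List[ImportResult]) -> Tuple[float, List[str]]:
    """Return (rate, failing_ids); failing ids are kept for regression review."""
    if not results:
        raise ValueError("cannot compute a success rate over an empty sample")
    failing_ids = [r.video_id for r in results if not is_success(r)]
    rate = (len(results) - len(failing_ids)) / len(results)
    return rate, failing_ids
```

Returning the failing ids alongside the rate mirrors the cadence note above, where failing samples are retained for review rather than discarded.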
Import fidelity

Notion-via-Obsidian import fidelity

Measures: Content preserved through the legacy Notion export path that first converts through an Obsidian-compatible representation.
Data source: Synthetic round-trip runs over the golden Notion fixture set.
Cadence: Weekly until the dedicated importer fully replaces the legacy path.
Target: 90.0% fixture fidelity.
Known limit: Legacy export shape can collapse some database views and Notion-only block metadata.
Answer quality

Segment summary human-rating

Measures: Blind 5-point human rating of generated segment summaries for correctness, completeness, and style.
Data source: Sampled summary outputs from the eval corpus, reviewed against source excerpts.
Cadence: Weekly sampled review.
Target: 4.50 / 5.00 average rating.
Known limit: Human review samples are intentionally small, so this is a release signal rather than a statistical guarantee.
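
To make the small-sample caveat concrete, the sketch below reports the average rating together with its standard error. The function name and the example ratings are made up for illustration; nothing here reflects XESO's actual review tooling or data.

```python
import math
import statistics
from typing import List, Tuple


def rating_summary(ratings: List[int]) -> Tuple[float, float]:
    """Return (mean, standard_error) for a list of 1-5 ratings."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate spread")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mean = statistics.mean(ratings)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    std_err = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, std_err


# Hypothetical ratings from one small weekly review.
mean, std_err = rating_summary([5, 4, 5, 4, 5, 4])
```

With only a handful of ratings the standard error stays large relative to the gap between adjacent scores, which is why an average from such a sample works as a release signal rather than a guarantee.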
Answer quality

Contradiction-detector precision

Measures: Precision on labelled contradiction pairs where XESO should flag conflicting claims without over-warning.
Data source: Versioned contradiction gold set run by the eval harness.
Cadence: Weekly and before major retrieval or reasoning changes.
Target: 80.0% precision.
Known limit: Recall and temporal nuance are tracked separately because precision is the public trust budget.
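
Precision here is the standard ratio of true positives to all flagged pairs. A minimal sketch follows, assuming each gold-set pair is reduced to two booleans (whether the detector flagged it, and whether the label says it is a real contradiction); that tuple layout and the function name are assumptions for illustration only.

```python
from typing import List, Optional, Tuple


def contradiction_precision(pairs: List[Tuple[bool, bool]]) -> Optional[float]:
    """Compute precision over (flagged, is_contradiction) pairs.

    Returns None when nothing was flagged, since precision is undefined then.
    """
    true_positives = sum(1 for flagged, gold in pairs if flagged and gold)
    false_positives = sum(1 for flagged, gold in pairs if flagged and not gold)
    flagged_total = true_positives + false_positives
    if flagged_total == 0:
        return None
    return true_positives / flagged_total


# Hypothetical run: 4 correct flags, 1 over-warning, 1 missed contradiction.
pairs = [(True, True)] * 4 + [(True, False), (False, True)]
precision = contradiction_precision(pairs)
```

The missed contradiction in the example does not change the result, because misses affect recall rather than precision. That is consistent with recall being tracked separately.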
Latency

Chat latency (p95, first token)

Measures: Server-observed p95 time to first streamed token for cited chat answers.
Data source: Privacy-safe production route timing aggregated over the trailing seven days.
Cadence: Daily rollup, shown as the latest weekly benchmark card.
Target: 600 ms p95 first-token latency.
Known limit: Network distance, cold starts, and upstream model queues can move individual sessions outside the p95 budget.
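
For readers unfamiliar with percentile budgets, the sketch below computes a p95 with the nearest-rank method and checks it against the 600 ms target. The nearest-rank choice is an assumption, since the page does not say which percentile definition the production rollup uses, and the helper names are hypothetical.

```python
import math
from typing import List

P95_BUDGET_MS = 600  # target stated on this page


def percentile_nearest_rank(samples_ms: List[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def within_budget(samples_ms: List[float]) -> bool:
    return percentile_nearest_rank(samples_ms, 95) <= P95_BUDGET_MS
```

Because the budget is a p95, up to 5% of sessions can exceed 600 ms while the benchmark still passes, which matches the known limit about individual sessions.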
Performance

Lighthouse LCP (p75, public pages)

Measures: p75 Largest Contentful Paint across core public pages including home, pricing, and benchmarks.
Data source: Scheduled Lighthouse CI runs, reconciled with field Web Vitals when available.
Cadence: Weekly, and after public-page visual changes.
Target: 1,800 ms p75 LCP.
Known limit: Lab runs do not perfectly model every device or network, so real-user monitoring (RUM) remains the release guardrail.
Import fidelity

Notion dedicated-importer fidelity

Measures: Round-trip fidelity for the dedicated Notion importer against blocks, hierarchy, links, and source metadata.
Data source: Golden corpus fixture imports with deterministic content comparison.
Cadence: Weekly and before importer releases.
Target: 90.0% fixture fidelity.
Known limit: Third-party export changes can affect coverage before the fixture set is refreshed.
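
One way a deterministic content comparison could work is to normalize each fixture's text, hash it, and count the fixtures whose imported output matches the golden copy. The sketch below is illustrative only: the normalization rules, the per-fixture (rather than per-block) scoring, and all names are assumptions, not a description of XESO's harness.

```python
import hashlib
import unicodedata
from typing import Dict


def canonical_digest(text: str) -> str:
    """Hash text after Unicode NFC normalization and whitespace collapsing."""
    normalized = unicodedata.normalize("NFC", text)
    collapsed = " ".join(normalized.split())
    return hashlib.sha256(collapsed.encode("utf-8")).hexdigest()


def fixture_fidelity(golden: Dict[str, str], imported: Dict[str, str]) -> float:
    """Share of golden fixtures whose imported content matches exactly after normalization."""
    if not golden:
        raise ValueError("golden fixture set is empty")
    matched = sum(
        1
        for name, expected in golden.items()
        if name in imported and canonical_digest(imported[name]) == canonical_digest(expected)
    )
    return matched / len(golden)
```

Hashing normalized text keeps the comparison repeatable across runs while ignoring incidental whitespace differences. A fixture missing from the import output counts as a mismatch in this sketch.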
Import fidelity

Evernote ENEX importer fidelity

Measures: Preservation of Evernote ENEX notes, attachments, timestamps, and notebook structure.
Data source: Golden ENEX fixture round trips through the importer harness.
Cadence: Weekly and before importer releases.
Target: 95.0% fixture fidelity.
Known limit: Encrypted or proprietary attachment payloads are measured separately from text fidelity.
Import fidelity

Roam EDN importer fidelity

Measures: Preservation of Roam EDN block hierarchy, backlinks, daily notes, and text content.
Data source: Golden Roam fixture imports with structural comparison.
Cadence: Weekly and before importer releases.
Target: 85.0% fixture fidelity.
Known limit: Graph semantics without explicit source export data may require best-effort reconstruction.
Import fidelity

Obsidian importer fidelity

Measures: Preservation of Markdown content, folders, frontmatter, wikilinks, and attachments.
Data source: Golden Obsidian vault fixtures and source-matrix round trips.
Cadence: Weekly and before importer releases.
Target: 97.0% fixture fidelity.
Known limit: Plugin-specific syntax is preserved as text unless XESO has a native parser for that extension.
© 2026 XESO · Built for readers, researchers, and builders.