XESO

Release notes

Every shipped change is documented in the repo's CHANGELOG.md. Security-relevant items are tagged explicitly. The 8 most recent sections are below.

\[Unreleased] — 2026-04 — "Model Sovereignty (AI1–AI3)"

First shipped milestone of the XESO AI Strategy. See [`docs/ai/strategy.md`](docs/ai/strategy.md) for the full plan; the confidential canonical version lives in `documentation/AI_STRATEGY.md`.

### Added — Agentic surfaces

- **Research Agent** (`/research`, `POST /api/agents/research`) — plans sub-queries, fans out `graduatedSearch` calls in parallel, de-duplicates passages, drafts a cited synthesis, self-reviews with a critic agent, and revises up to 2× before returning. Counts toward the chat rate-limit bucket. Runs over the user's own brain only.
- **Teacher Agent** (`POST /api/agents/teach`, `StudyPanel` on every note) — generates flashcards, multiple-choice quizzes, or Socratic dialogues strictly grounded in the note's content and source locations. Ephemeral today; a future pass can persist into the existing `review_cards` deck.
- **YouTube batch ingest** — pasting a playlist, `@handle`, `/channel/`, `/c/`, or `/user/` URL into Quick Add now opens a preview dialog listing up to 25 videos and fans out to `POST /api/ingest` at 500 ms between calls so the rate limiter and per-source budget check apply per video.
- **Model Context Protocol tools** (`/api/mcp`) expanded to include `get_facts`, `research`, `generate_flashcards`, `generate_quiz` so external clients (Cursor, Claude Desktop, a future XESO CLI) reach the same agents as the web UI.
- **Training-data consent** (`Settings → Privacy`) — explicit opt-in for "help improve XESO models". Off by default. Backed by `users.consent_training` + `consent_training_at` columns (migration `0053_training_consent.sql`), enforced by the distillation script (`scripts/distill/generate-dataset.ts`) which also hashes user IDs and excludes vaulted content.

### Added — Fine-tuning scaffolding (not yet in prod)

- `scripts/distill/generate-dataset.ts` — replays retrieval for consenting users and asks Gemini 2.5 Pro as a teacher model to produce Alpaca-style JSONL for QLoRA fine-tuning.
- `scripts/finetune/xeso-mind-v1.axolotl.yaml` — Axolotl QLoRA config targeting `google/gemma-2-9b-it`.
- `docs/ai/per-user-lora-adapters.md` — design doc for the vLLM/LoRAX per-user adapter rollout. Not yet deployed.

### Added — Docs

- `docs/ai/strategy.md`, `docs/ai/wedge-strategy.md`, `docs/ai/claude-code-learnings.md` committed to the repo.
- `documentation/adr/ADR-AI-003-model-sovereignty-program.md` and updates to `documentation/XESO-MASTER.md`, `documentation/INVESTOR_ONE_PAGER.md`, and `documentation/ROADMAP.md` (confidential tier, not committed).

### Changed — Navigation

- Nav rail now includes a **Research** entry between Chat and Review.
- Command palette adds **"Go to research"** to the quick-nav section.

### Added — 90-day plan execution (Wk2–Wk10)

Concrete shipped work against §21 of the AI Strategy. Each item is wired end-to-end (API + UI + tests where applicable) and guarded with unit tests under `tests/`.

- **Wk3 — Ultraplan pattern on `/research`.** `POST /api/agents/research` now accepts `reviewPlan: true` (returns just the sub-query plan, `~1s`) or `approvedPlan` (skips the planner LLM on the real run). The UI exposes two modes — "Review plan first" (default, differentiating) and "Express" (original one-shot behavior). Users can add, remove, rephrase sub-queries and flip intent before approving. This is the single biggest Claude-Code-learning adoption in the agent surface.
- **Wk5 — User-editable facts UI.** New `/settings/memory` page + CRUD at `/api/memory/facts` (GET/POST/PATCH/DELETE, audit-logged, rate-limited, owner-scoped). Low-confidence filter, optimistic updates; deletion uses the existing `supersededBy = id` retirement sentinel so the audit trail is preserved. Closes R5 ("false-positive inferred facts").
- **Wk6 — Kairos-style memory consolidation cron.** New `lib/services/memory/consolidate.ts` + `/api/cron/memory-consolidate` + `infra/cloud-scheduler/jobs.yaml` entry (`17 3 * * *` UTC).
  Three passes per user per run: merge duplicate `(subject, predicate)` rows, retire low-confidence stale rows, repair transitive supersession chains. Idempotent; per-user transactional; caps at 200 users per tick.
- **Wk8 — Prompt-injection-via-source sanitizer.** New `lib/services/ingestion/sanitize.ts` (10 attack-pattern regexes: ignore-previous, disregard-previous, forget-everything, new-instructions, you-are-now-DAN, role-switch, hidden-system-tag, exfiltrate-key, http-sink, markdown-autofetch). Wired into `lib/services/ingestion/pipeline.ts` for `youtube | web | pdf` sources before hash + LLM. Plus `fenceUntrustedSource` helper for prompt-time wrapping. Covered by `tests/sanitize-source-text.test.ts` (11 assertions, all passing). Three new `leakage-v0.jsonl` rows (l011–l013) test the same attack classes against the chat path.
- **Wk9 — Memory-evolution surface.** `GET /api/memory/evolution` returns pairs of (superseded, current) facts for the same `(subject, predicate)` key. Rendered inside `/settings/memory` as "How your thinking evolved" cards (from → to + confidence delta + timestamp). The trust payoff of the supersession mechanic becomes visible.
- **§6.1 — Per-turn USD tracking.** `chatMonitor.recordTurnCost({ model, inputTokens, outputTokens })` with a vendor-priced lookup (`estimateTurnUsd`) for Gemini 2.5 Pro/Flash, Claude 3.5 Sonnet, GPT-4o, and a default floor. Wired into `onFinish` of `/api/chat/route.ts` so every successful turn contributes to the burn-rate series. `getStats()` now returns `cost: { totalUsd, avgUsd, p95Usd, perModel }`. Covered by three new tests in `tests/chat-monitor.test.ts`.
- **Wk2 — Burn-rate dashboard.** `/admin/slo` (ops RBAC) now renders success rate, p95 latency, health grade, and a full burn-rate card: spent (15m), projected hourly, projected daily, per-model breakdown, and recent errors. Replaces the prior placeholder.
- **Wk10 — Research-agent SSE worklog.** `POST /api/agents/research` now supports `?stream=1` (or `Accept: text/event-stream`) to return a Server-Sent Events stream of lifecycle events (`plan`, `subquery:start`, `subquery:done`, `merged`, `synth:start`, `critic:start`, `critic:done`, `revision:start`, `synth:done`, `done`). A 15-second keep-alive comment prevents intermediary 60-s timeouts. The JSON transport still works for JSON consumers. `/research` uses streaming by default and renders a live worklog (last 8 events + elapsed counter) during the 10–25s run. Cancel-on-unmount via `AbortController` so closing the tab frees server resources.
- **Audit events.** Added `memory_fact_created`, `memory_fact_updated`, `memory_fact_retired` to `lib/services/logging/audit.ts` for the new facts CRUD.

### Hardening (10/10 pressure-test pass)

- **`fenceUntrustedSource` attribute injection.** The `sourceKind` argument is now validated against `^[a-z0-9_-]{1,24}$` and falls back to `unknown`; a malicious caller can no longer break out of the `kind=""` attribute. Transcripts that literally contain `</untrusted_source>` are also neutralized inside the inner payload so an injection can't close the frame early.
- **`PATCH /api/memory/facts` requires a change.** The schema now refuses `{ id }`-only bodies (at least one of subject, predicate, object must be present) so the audit log and confidence bump fire only on genuine edits.
- **`estimateTurnUsd` longest-prefix match.** Regression: naive prefix-matching billed `gpt-4o-mini-preview-2024` at `gpt-4o` rates. Now sorts prefixes by length DESC so `gpt-4o-mini` wins. Covered by dedicated test.
- **Memory nav discoverability.** `/settings/memory` is now linked from the settings page top nav (Memory button → deep route) and from the top of `/settings/privacy` (Memory card with "Manage memory" CTA). Previously the page existed but was only reachable via direct URL.

Verification:

- `npx tsc --noEmit` — exit 0.
- 822+ vitest assertions green. New files: `tests/memory-consolidate.test.ts` (8 tests for `normKey` edge cases), `tests/research-events.test.ts` (5 tests for agent worklog contract), `tests/research-sse.test.ts` (4 tests for the SSE transport), plus 11 additional sanitizer pressure rows and 5 new cost tracking edge cases.
- `npm run docs:check` — 0 broken internal links.

### Added — AI Strategy v2.0 pressure-test rewrite + agent eval harness

Follow-up pass on the v1.0 strategy doc after an explicit pressure-test. Every weak section was rated honestly; the rewrite addresses each gap and ships the concrete eval code the strategy references.

- **`documentation/AI_STRATEGY.md` v2.0** (confidential, not committed) — 11 sections → 21 sections + 3 appendices. New: north-star metric (GAC), self-audit rating table, hypothesis + falsifier, per-initiative kill-conditions, baseline benchmark numbers, burn-rate telemetry, 5 new competitor teardowns, 5 new risks with early-warning signals, metrics-as-a-contract rewrite, unit economics P&L, 11-row decision log, AI-specific security posture, org design, three-audience framings, capital scenarios, comparative benchmarks, pre-mortem, and a week-by-week 90-day execution plan.
- **`docs/ai/strategy.md`** (committed) updated with an **Eval harness (the contract)** table and a **North-star metric** section that mirror the Tier-1 canon.
- **`docs/ai/eval-harness.md`** (new, committed) — engineering-facing spec for the eval harness: 10-corpus inventory, how-to-run recipes, PR-scoped vs nightly-scoped runs, grading rules, ship protocol, ownership.
- **`scripts/evals/research-v0.jsonl`** (new, 12 rows) — first agent-level eval corpus. Grades the research agent on citation adherence, sub-query plan size, abstain behavior, prompt-injection resistance, and word-count discipline.
- **`scripts/evals/teacher-v0.jsonl`** (new, 10 rows) — grades the teacher agent on structural correctness (flashcard count, quiz 4-choice shape + correctIndex range, Socratic `idealAnswerOutline` presence) plus optional phrase-grounding on note content.
- **`scripts/evals/run-agents.ts`** (new) — self-contained runner that hits `POST /api/agents/research` and `POST /api/agents/teach`, grades against the expect-block spec, writes `results-agents-<corpus>-<ts>.json`, and exits non-zero on any corpus miss. Teacher rows whose `noteRef` isn't resolved via `EVAL_TEACHER_NOTE_IDS` are SKIPPED (not failed) so the harness can run incrementally.
- **`npm run evals:agents`** wired (runs both corpora by default; `CORPUS=research|teacher|all` selects).
- **`documentation/index.md §14.5`** updated to reflect the v2.0 rewrite's full section map.

---
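The `estimateTurnUsd` longest-prefix fix listed under Hardening in the Unreleased notes can be illustrated with a minimal sketch. The price table, the `resolvePriceKey` helper, and the `estimateTurnUsdSketch` name are hypothetical, and the dollar figures are placeholders rather than real vendor prices; only the "sort prefixes by length, longest first" idea comes from the notes.

```typescript
// Hypothetical price table in USD per 1M tokens. Placeholder numbers only.
type Price = { inputPerM: number; outputPerM: number };

const PRICES: Record<string, Price> = {
  "gpt-4o": { inputPerM: 5, outputPerM: 15 },
  "gpt-4o-mini": { inputPerM: 0.5, outputPerM: 1.5 },
};

// Assumed default floor for models with no matching prefix.
const DEFAULT_PRICE: Price = { inputPerM: 1, outputPerM: 2 };

// Longest prefix first, so "gpt-4o-mini" is tried before "gpt-4o".
const PREFIXES: string[] = Object.keys(PRICES).sort((a, b) => b.length - a.length);

function resolvePriceKey(model: string): string | null {
  return PREFIXES.find((prefix) => model.startsWith(prefix)) ?? null;
}

function estimateTurnUsdSketch(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const key = resolvePriceKey(model);
  const price = (key !== null ? PRICES[key] : undefined) ?? DEFAULT_PRICE;
  return (inputTokens * price.inputPerM + outputTokens * price.outputPerM) / 1_000_000;
}

console.log(resolvePriceKey("gpt-4o-mini-preview-2024")); // "gpt-4o-mini"
```

With an unsorted table, `find` could stop at `gpt-4o` first, which is the regression the notes describe.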

\[v4.4.1] — 2026-04-18 — "Reversible by default (N1)"

Tier-0 N1 ships: bulk/single note delete is now reversible across a 30-second window, and the read-paths that previously leaked tombstones have been sealed end-to-end.

### Added — Reversible deletes

- **`notes.deleted_at` soft-delete column** — migration `0050_notes_soft_delete_grace.sql` adds `deleted_at timestamptz` plus a partial index on `(user_id, deleted_at) WHERE deleted_at IS NOT NULL` so the reaper scan is O(pending), not O(library).
- **Bulk `restore` action** — `POST /api/notes/bulk` now accepts `action: "restore"`. Rows whose `deleted_at` is still inside `NOTE_SOFT_DELETE_UNDO_SECONDS` (30s) come back with their original `segment_id`; segment `note_count` is re-incremented atomically. Outside the window, restore returns `restored: []` (quiet no-op) and the UI surfaces a "some notes could not be restored" toast.
- **Reaper cron** — `GET /api/cron/note-soft-delete-reaper` permanently deletes rows older than the grace window (batch of 200, logs `note_soft_delete_reaper reaped=N`). Registered via `infra/cloud-scheduler/jobs.yaml` + `apply.sh` to run every minute against the Cloud Run service; authenticated with `BILLING_CRON_SECRET` (`Authorization: Bearer …` or `?secret=`).

### Changed — Read-path seal (no tombstone leaks)

- New helper `lib/db/note-scope.ts#noteIsLive` (= `isNull(notes.deletedAt)`) is the single source of truth for "visible note".
  Applied to every Drizzle + raw SQL read-path: library feed, unfiled feed, segment feed, quick search (FTS + fuzzy), hybrid search, `getNoteById`, backlinks, related notes, suggest-segment, generate-cards, versions, review-card sources, share links (create + `/a/[token]` render), brain graph, brain source-mix + embedding coverage, analytics weekly growth + top tags, status counts, maintenance stats, BibTeX + single/segment/full exports, weekly digest (recent notes + connections + most-cited), MCP `read_note` / `find_notes` / `cite`, chat `open_note`, dream-cycle re-embed / entity-detection / orphan count / embedding coverage, stale-detection, audio-overview members (segment + evals), onboarding idempotency checks, billing plan-cap count, and the billing reaper's cascade target list.
- `DELETE /api/notes/[id]` and `DELETE /api/notes/bulk delete` now stamp `deleted_at = now()` and decrement `segment.note_count`; the actual `DELETE FROM notes` is deferred to the reaper once the grace window elapses.

### Fixed — Quality gates

- `PrivacySettingsClient` — restored missing `freezeConfirmOpen` / `setFreezeConfirmOpen` state that the freeze-account Dialog depended on.
- `WebAuthnManager` — replaced the inline `window.confirm` passkey-remove flow with a proper Dialog (`removeTarget` + `removing` state + `confirmRemove()` callback), preserving optimistic list updates.
- `ThemeToggle` — swapped the `useEffect(() => setMounted(true), [])` SSR guard for `useSyncExternalStore`, satisfying React 19's `react-hooks/set-state-in-effect` rule without introducing hydration mismatches.

### Tests

- `tests/notes-soft-delete.test.ts` — new. Pins:
  - `NOTE_SOFT_DELETE_UNDO_SECONDS` is in the "undo-feels-like-undo" range (≥10s, ≤120s).
  - Partition contract: rows within the grace window are restorable, rows past the grace window are reapable, live rows are neither.
  - `noteIsLive` is `isNull(notes.deletedAt)` (via module-level mocks).
  - Reaper authorization: Bearer header + `?secret=` query accepted when correct; wrong/missing credentials rejected; dev allows unsecured calls only when `NODE_ENV !== "production"`.
- `tests/bulk-actions.test.ts` — extended with `restore` + `untag` actions in the bulk-action schema.
- `tests/vault-read-filter.test.ts` — Drizzle mock extended with `isNull` + `notes.deletedAt` so the vault-leak tests continue to run against the now-combined `noteIsLive ∧ vault-exclude` predicate.

### Verification — all green on this commit

- `npm run typecheck` — 0 errors
- `npm run lint` — 0 errors, 0 warnings
- `npm run voice:check` — OK
- `npm test` — 789 passed, 27 skipped (79 files, 0 failures)
- `npm run build` — succeeded

### Ops notes

- **Migration:** apply `0050_notes_soft_delete_grace` in every environment before the first deploy of this release.
- **Cron:** `infra/cloud-scheduler/apply.sh --apply` (reads `jobs.yaml`) registers the reaper with `* * * * *` against the Cloud Run service in `xeso-493019 / us-central1`. The script is idempotent and also prunes orphaned `xeso-*` Scheduler jobs. Set `BILLING_CRON_SECRET` on the apply runner and in the Cloud Run env so the Scheduler-issued Bearer matches what the route expects.
- **Manual smoke:** bulk-delete a few notes → hit "Undo" inside 30s → confirm rows return to their segments with `note_count` restored. Delete again, wait >60s → confirm rows disappear on the next reaper tick, and confirm exports / shared answers / digest emails no longer cite them.

---
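The partition contract pinned by `tests/notes-soft-delete.test.ts` in v4.4.1 (live, restorable, reapable) can be written as one pure function. This is a minimal sketch: `classifyNote` and `NoteState` are hypothetical names, and treating the exact 30-second boundary as still restorable is an assumption, since the notes do not say which side the boundary falls on.

```typescript
// Grace window from the release notes (30 seconds).
const NOTE_SOFT_DELETE_UNDO_SECONDS = 30;

type NoteState = "live" | "restorable" | "reapable";

// Classifies a note row by its deleted_at timestamp.
// Assumption: a row deleted exactly 30 s ago still counts as restorable.
function classifyNote(deletedAt: Date | null, now: Date): NoteState {
  if (deletedAt === null) return "live"; // same idea as noteIsLive
  const ageMs = now.getTime() - deletedAt.getTime();
  return ageMs <= NOTE_SOFT_DELETE_UNDO_SECONDS * 1000 ? "restorable" : "reapable";
}

const now = new Date("2026-04-18T12:00:00Z");
console.log(classifyNote(null, now)); // "live"
console.log(classifyNote(new Date("2026-04-18T11:59:50Z"), now)); // "restorable"
console.log(classifyNote(new Date("2026-04-18T11:59:00Z"), now)); // "reapable"
```

Keeping the three states mutually exclusive is what lets the restore action and the reaper share one definition of the window.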

\[v4.4.0] — 2026-04-17 — "World-class elevation sprint"

A 15-task sprint that closed the remaining "is this world-class?" decision gates after Wave R. Focus: destructive-action safety, leavability, interaction polish, first-party error observability, a real undo system, and the on-ramps (local dev infra, CI guards) that keep the next contributor honest. Every task landed with code + tests + docs; see `docs/ux/WORLD_CLASS_AUDIT.md` for the five founder-level decisions (grace period, strict-grounding default, telemetry retention, bundle scope, undo window) and `docs/product/IMPLEMENTATION_STATUS.md` for the shipped-surface table.

### Added — Safety & trust

- **Account deletion grace period (T7)** — `deleteAccountAction` now requires step-up auth _and_ typed-email confirmation, then stamps `users.deletion_scheduled_at` / `_for` / `_confirm_email` (migration `0043`) with a 72 h window. Signing in cancels the deletion automatically and logs `account.delete outcome=cancelled_on_signin`. A new `app/api/cron/account-reaper/route.ts` physically purges users whose timer has elapsed.
- **Step-up enforcement on destructive actions (T2)** — `deleteAccountAction` and `/api/export` both call `requireStepUp` before returning data or scheduling a delete. Step-up is passkey-preferred (see `lib/auth/step-up.ts` + K4 of v4.2.0).
- **E2E smoke coverage un-skipped (T3)** — `e2e/smoke.spec.ts` no longer carries an unconditional `test.skip(true)`; security-critical assertions (step-up, session-invalidation-on-delete, public-surface 404s) run in CI.

### Added — Leavability & data

- **Bundle import (T6)** — `POST /api/import/bundle` round-trips a `xeso-full-export-*.zip` from `/api/export?scope=all`. Restores segments (with parent-child re-mapping), notes (with frontmatter recovery and content-hash dedupe), and saved searches. Chat threads and shared answers are deliberately skipped — see Gate 4 of `docs/ux/WORLD_CLASS_AUDIT.md` for the reasoning. Rate-limited, 100 MB cap, plan-limit aware.
- **Search facets + saved-search UI (T5)** — `LibraryFilterBar` adds source-type, tag, and date-range filters that persist in URL params; `SavedSearchesList` lets users bookmark current views. The library search endpoint was extended to accept `dateFrom` / `dateTo`.

### Added — Interaction & motion

- **Global undo (T13)** — `UndoManagerProvider` (one entry at a time, ⌘Z / Ctrl+Z binding that respects focus in editable fields) paired with `useUndoableAction` wired into bulk move and bulk tag actions. `/api/notes/bulk` gains a matching `untag` action and `/api/notes` accepts an `ids=` query to snapshot a batch for undo. Bulk flows switched from `window.location.reload()` to `router.refresh()` so the undo toast survives the state update. 5 s window per Gate 5 of the audit doc.
- **AI trust drawer (T4)** — retrieval context drawer on chat messages, citation hover previews, and a `UiPrefs.strictGrounding` per-thread toggle that forces citation-backed claims at the prompt level.
- **`RecipeSelector` popover polish (T12)** — proper `role="dialog"`, capture-phase key listener so Arrow / Enter / Tab win over the textarea's submit handler while the picker is open, outside-click dismissal on pointerdown.
- **Library search loading (T12)** — final `Spinner → SkeletonList` swap on the one remaining site that still used the legacy pattern.

### Added — Mobile & accessibility

- **Pull-to-refresh (T8)** — `components/mobile/PullToRefresh.tsx` wraps the library and brain pages; a native-feeling gesture with `prefers-reduced-motion` respect.
- **Haptics vocabulary (T8)** — `lib/mobile/haptics.ts` centralises `tap / capture / success / warning` patterns so every surface vibrates with the same meaning.
- **Swipe actions hook (T8)** — `lib/mobile/useSwipeActions.ts` for future adoption on row UIs.
- **Share target verified (T8)** — `public/manifest.json` + `app/share/page.tsx` tested against Android / iOS Share Sheets.
- **Accessibility AAA (T9)** — `resources/custom.css` adds `prefers-contrast: more` (raises text + border contrast), `prefers-reduced-transparency: reduce` (swaps alpha surfaces for solids), and `forced-colors: active` (Windows High Contrast mode defers to system tokens for chrome).

### Added — Performance

- **`BrainGraph` code-split (T10)** — `components/brain/BrainGraphLazy.tsx` dynamically imports the graph with `ssr: false`; a skeleton and a route-level `/brain/graph/loading.tsx` cover the gap.
- **Bundle analyzer (T10)** — `next.config.js` conditionally loads `@next/bundle-analyzer` behind `ANALYZE=true`; `npm run analyze` generates the client/server reports.

### Added — Observability

- **First-party error telemetry (T11)** — `components/observability/ErrorReporter.tsx` captures `window.error` + `unhandledrejection` and POSTs a bounded payload to `/api/errors`, which validates with Zod, digests by (message + top stack frame), and writes to `error_events` (migration `0045`). Anonymous schema — no userId, no IP — same privacy posture as `/api/rum`. Honours Do-Not-Track, caps at 20 events per session, 1 s cool-down.
- **Admin metrics dashboard (T11)** — role-gated `app/admin/metrics/page.tsx` renders 24 h / 7 d summaries, hourly trend, top error groups (by digest), top error routes, and the 50 most-recent errors. Cross-links to `/admin/analytics`. Minimum role: `ops`.

### Added — Infra & CI

- **Migration ordering guard (T1)** — `scripts/db/check-migration-ordering.mjs` (`npm run db:check`) blocks new duplicate numeric prefixes, asserts the journal ↔ `*.sql` file mapping is 1:1, and fails on empty migrations. Grandfathers the historical `0037_age_gate` / `0037_teams` pair (already applied in production) and refuses new duplicates. Wired into the Quality CI workflow.
- **Local dev fast-path (T15)** — `docker-compose.yml` spins up Postgres with `pgvector`; `scripts/bootstrap-local.mjs` (run via `npm run bootstrap:local`) waits for the DB, ensures the `vector` extension, and runs Drizzle migrations. README updated with the one-command setup.

### Decision record (T14)

Five founder-level defaults codified in `docs/ux/WORLD_CLASS_AUDIT.md` with rationale, implementation pointers, and the reversible path for each:

1. 72 h account-deletion grace period.
2. Strict grounding default: Off (opt-in).
3. Error telemetry retention: 30 days.
4. Bundle import scope: segments + notes + saved searches (chat / shared-answers skipped).
5. Undo window: 5 seconds.

### Verification

- `npm run typecheck` — green.
- `npm run lint` — **0 errors, 0 warnings** (fixed one transitive unused-catch binding in `scripts/bootstrap-local.mjs`).
- `npm run db:check` — passes.
- `npm test` — **712 passed / 27 skipped / 0 failed across 64 files.**
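The duplicate-prefix rule of the v4.4.0 migration ordering guard (T1) can be sketched as a pure function over file names. This is not the contents of `scripts/db/check-migration-ordering.mjs`: the `findDuplicatePrefixes` name, the four-digit `NNNN_name.sql` file-name shape, and the `.sql` suffix on the grandfathered pair are assumptions for illustration. The journal mapping and empty-file checks are left out.

```typescript
// The historical pair the notes say is grandfathered (file suffix assumed).
const GRANDFATHERED_FILES = new Set<string>(["0037_age_gate.sql", "0037_teams.sql"]);

// Returns numeric prefixes that are used by more than one migration file,
// ignoring a group only when every file in it is grandfathered.
function findDuplicatePrefixes(files: string[]): string[] {
  const groups = new Map<string, string[]>();
  for (const file of files) {
    if (!/^\d{4}_.+\.sql$/.test(file)) continue; // skip non-migration files
    const prefix = file.slice(0, 4);
    const group = groups.get(prefix) ?? [];
    group.push(file);
    groups.set(prefix, group);
  }
  const violations: string[] = [];
  groups.forEach((group, prefix) => {
    const allGrandfathered = group.every((f) => GRANDFATHERED_FILES.has(f));
    if (group.length > 1 && !allGrandfathered) violations.push(prefix);
  });
  return violations.sort();
}

const files = ["0036_a.sql", "0037_age_gate.sql", "0037_teams.sql", "0050_x.sql", "0050_y.sql"];
console.log(findDuplicatePrefixes(files)); // ["0050"]
```

Grandfathering by exact file name rather than by prefix means a third `0037_*` file would still be refused, which matches the "refuses new duplicates" wording.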

\[v4.3.0] — 2026-04-17 — "Wave R — Advanced"

The P4 / Wave R advanced features that the founder discipline roadmap had parked for post-launch all land here, fully wired end-to-end and feature-flagged so operators can enable them per cohort. The retrieval-quality ceiling, the "share as short video" surface that growth has been waiting on, and the in-browser agentic chat experience are now in the repo in production-shape, not as stubs.

- **Drift monitor (Phase 1)** — eval harness (`scripts/evals/run.ts`) pinned to versioned baselines (`scripts/evals/baselines.json`) and integrated with a `prompt_canaries` / `canary_assignments` pair so newly published prompts ship behind deterministic traffic splits and regressions surface in the admin dashboard before any user sees the candidate in full. See `lib/services/evals/canary.ts`.
- **Audio Overview (Phase 2)** — "Generate audio overview" on any segment produces a 3-5 minute podcast-style narration using `segment.audio_overview/v001` + OpenAI TTS. Script + opus bytes are cached in `audio_overviews` keyed by a corpus fingerprint so re-opening a segment whose notes haven't changed doesn't re-burn the generation cost. Flag: `XESO_AUDIO_OVERVIEW_ENABLED`.
- **Query Decomposition (Phase 3)** — compound chat questions are broken into sub-queries (`lib/services/search/decomposer.ts`, `chat-decompose/v001`), executed against `hybridSearch` in parallel, and merged with Reciprocal Rank Fusion. Flag: `XESO_QUERY_DECOMPOSE`.
- **Agentic tool-calling (Phase 4)** — `/api/chat` now optionally exposes a small, sandboxed tool registry (`search_notes`, `open_note`, `list_segments`) to the LLM via the AI SDK's `tool()` helper. Every tool is user-scoped, vault-safe, bounded-output, and rate-limited. Flag: `XESO_AGENTIC_CHAT`.
- **Learned retrieval weights (Phase 5)** — each user gets a trained `(w_bm25, w_vec)` mix stored in `retrieval_weights`; the trainer (`lib/services/search/weight-trainer.ts`) does a bounded grid search over historical chat citations every 24 h via the new `/api/cron/retrieval-weights` cron. `hybridSearch` applies the mix inside its RRF formula. Flag: `XESO_LEARNED_WEIGHTS`.
- **MP4 share (Phase 6)** — after minting a share link, the user can open a browser-side studio that renders a 9:16 ten-second clip on a `<canvas>` and records it with `MediaRecorder`. Bytes upload to `/api/share/answer/:token/video` (capped at 6 MB, mime allow-listed) and are served back under the same path with `og:video` meta on the share page so X / Slack / Discord auto-play the clip.
- **Ops plumbing** — four new feature flags documented in `.env.example`; two new admin APIs (`/api/admin/evals`, `/api/admin/retrieval-weights`) gated by the `ops` RBAC role; two new cron routes (`account-reaper` untouched, `retrieval-weights` new) reusing the existing `BILLING_CRON_SECRET` convention.
- **Tests** — 65 new unit-test assertions across `tests/weight-trainer.test.ts`, `tests/query-decompose.test.ts`, `tests/canary-routing.test.ts`, `tests/audio-overview.test.ts`, and `tests/video-share.test.ts`. Suite totals move from 665/27/0 to **730 passed / 27 skipped / 0 failed**.
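To make the Phase 3 and Phase 5 notes in v4.3.0 concrete, here is a minimal sketch of Reciprocal Rank Fusion with a per-user `(w_bm25, w_vec)` mix. It is not the `hybridSearch` implementation: the `weightedRrf` name, the `Weights` type, and the constant `k = 60` (a commonly used RRF default) are assumptions, and the notes do not say exactly how the weights enter the formula. Each list contributes `weight / (k + rank)` to a document's score.

```typescript
type Weights = { wBm25: number; wVec: number };

// Assumed smoothing constant; 60 is a commonly used RRF default.
const RRF_K = 60;

// Fuses two ranked lists of note ids (best first) into one ranking.
function weightedRrf(bm25Ranked: string[], vecRanked: string[], w: Weights): string[] {
  const scores = new Map<string, number>();
  const accumulate = (ids: string[], weight: number): void => {
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (RRF_K + rank));
    });
  };
  accumulate(bm25Ranked, w.wBm25);
  accumulate(vecRanked, w.wVec);
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// A vector-heavy mix lets the vector list's top hit win.
console.log(weightedRrf(["a", "b", "c"], ["c", "b", "a"], { wBm25: 0.2, wVec: 0.8 }));
// ["c", "b", "a"]
```

Because RRF only uses ranks, the two retrievers' raw scores never need to be on the same scale, and the learned weights shift trust between them.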

\[v4.2.6] — 2026-04-15 — "Hardening Tail + Vault-at-Rest Design"

Closes the remaining actionable P1/P2 hardening tail from `docs/security/PLAN_V4_FRONTIER.md` and locks the implementation path for the next major phase (v4.3 vault-at-rest).

- **Body-size sweep completed** — aligned mutating routes to `readJsonWithLimit` + explicit `BodyLimits` classes for the remaining outliers:
  - `POST /api/mcp` now uses `BodyLimits.SMALL`
  - `POST /api/billing` now uses `BodyLimits.TINY`
  - `POST /api/teams` now uses `BodyLimits.SMALL`
  - `PATCH /api/chat/threads/[id]` now uses `BodyLimits.SMALL`
  - `POST /api/admin/feature-flags` now uses `BodyLimits.SMALL`
- **Guardrail test for future regressions** — added `tests/api-body-limit-guardrail.test.ts`, which fails CI on direct `await request.json()` usage in `app/api/**/*.ts` (with explicit webhook exemption).
- **Retrieval-poisoning anomaly scorer** — added `lib/services/search/anomaly.ts` with `scorePassageAnomaly(userId, embedding, segmentId?)` and `createPassageAnomalyScorer(...)`.
  - Algorithm compares passage embeddings to user-global and parent-segment centroids from `segments.centroid`.
  - Flags outliers with `distance > 0.85`.
  - `chunkAndEmbedNote` now logs and emits `retrieval_anomaly_detected` for flagged passages and persists summary metadata on the note.
- **Eval merge-gate floor** — retained and enforced through `.github/workflows/eval.yml` + `tests/eval-gate-floor.test.ts`; no further code change needed in this release.
- **v4.3 design authored** — added `docs/security/VAULT_AT_REST_V4_3_DESIGN.md` defining envelope format, migration phases, enforcement model, rollback strategy, and acceptance tests before shipping vault-at-rest code.

---
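The centroid comparison behind the v4.2.6 anomaly scorer can be sketched as follows. Only the `0.85` threshold and the idea of comparing against user-global and parent-segment centroids come from the notes. The use of cosine distance, the rule that a passage is flagged only when it is far from every available centroid, and the names `cosineDistance` and `isAnomalousPassage` are all assumptions; the real `scorePassageAnomaly` may differ.

```typescript
// Threshold from the release notes.
const ANOMALY_DISTANCE_THRESHOLD = 0.85;

// Cosine distance = 1 - cosine similarity (assumed metric).
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 1; // zero vector: treat as unrelated
  return 1 - dot / Math.sqrt(normA * normB);
}

// Assumption: flag only when even the nearest centroid is beyond the threshold.
function isAnomalousPassage(embedding: number[], centroids: number[][]): boolean {
  if (centroids.length === 0) return false; // nothing to compare against
  const nearest = Math.min(...centroids.map((c) => cosineDistance(embedding, c)));
  return nearest > ANOMALY_DISTANCE_THRESHOLD;
}

console.log(isAnomalousPassage([1, 0], [[0, 1]])); // true: orthogonal, distance 1
console.log(isAnomalousPassage([1, 0], [[0, 1], [1, 0]])); // false: matches one centroid
```

Checking against both a global and a segment centroid reduces false alarms for notes that are unusual for their segment but ordinary for the user overall.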

\[v4.2.5] — 2026-04-15 — "Vault Read-Path Seal III"

Third-pass audit closing the last five plaintext surfaces identified after `v4.2.4`: review-card mutation responses, the background dream-cycle maintenance loop, stale-note detection, analytics aggregates, and the maintenance dashboard. Every one of these surfaces was either returning vaulted plaintext directly (`question`/`answer`, titles), feeding vaulted plaintext to an external provider (embedding API, entity extractor), or leaking the _shape_ of vaulted activity via counts, top-tags, or weekly-growth widgets.

- **Review card update response scrubbed** — `PATCH /api/review-cards/[id]` previously returned the full row via `.returning()`, which includes the raw `question` / `answer` strings. The endpoint now (a) looks up the card's source note, (b) refuses with HTTP 403 `code: "vault_locked"` if that note lives in a vaulted segment, and (c) narrows the `.returning()` projection to only the SM2 scheduling fields (`id`, `interval`, `ease`, `lastResult`, `dueAt`) so even a non-vaulted card can't echo plaintext that wasn't already in the request.
- **Dream-cycle vault-aware** — `runDreamCycle` (scheduled via `/api/maintenance` POST) now fetches `getVaultedSegmentIds(userId)` once and threads it through every sub-step:
  - `fixUnembeddedNotes` and `reembedStalePassages` skip vaulted rows so their plaintext never reaches the embedding provider.
  - `tagUntaggedNotes` skips vaulted rows so `detectAndTagEntities` cannot write entity names derived from vaulted plaintext into the cross-segment `entities` table (which is then read by team search and entity chips).
  - `computeEmbeddingCoverage` excludes vaulted rows from its numerator _and_ denominator so the reported coverage doesn't silently fluctuate based on vault state.
- **Stale detection drops vaulted titles** — `detectStaleContent` (shared by the dream cycle and `GET /api/maintenance/stale`) used to return vaulted `title`s directly in its report.
  Now it joins against `getVaultedSegmentIds` and drops those rows before constructing the response.
- **Analytics panel sealed** — `GET /api/analytics` used to reveal vaulted activity via five separate channels:
  - `totalNotes` / `totalPassages` / `embeddingCoverage` / `staleNotes` — all now exclude vaulted segments in the SQL aggregates.
  - `topTags` previously ranked tags by the pre-aggregated `tags.usage_count` column, which counts every note a tag is applied to including vaulted ones. Rewritten to recompute usage at read-time from `notes.tags` joined with the non-vaulted note set, so a tag that's only used on vaulted notes will not appear on the user's analytics dashboard at all.
  - `weeklyGrowth` and `sourceBreakdown` — both raw SQL queries now carry a `(n.segment_id IS NULL OR n.segment_id NOT IN (…))` predicate.
- **Maintenance dashboard sealed** — `GET /api/maintenance` mirrors the analytics fix above: every counting aggregate (`totalNotes`, `totalPassages`, `embeddingCoverage`, `staleNotes`) excludes vaulted rows. `orphanNotes` (the "not yet filed" counter) intentionally keeps counting orphan-segment rows because they are by definition not vaulted.
- **Tests & gates** — added `tests/vault-seal-pass4.test.ts` with 10 unit tests pinning the decision shape for the vault gate (review cards), the exclusion contract (dream cycle + analytics), and the orphan / vault / mixed-batch cases. Typecheck ✔, lint ✔ (0 errors / 0 warnings), **712 unit tests pass (+19 new)**, Next.js production build ✔.

Post-pass posture: every _writable_ mutation that reflects plaintext (review-card PATCH) either scrubs its response or refuses behind the vault token; every _aggregated_ read of note counts, tags, growth, coverage, stale titles, or dashboard metrics drops vaulted rows at the SQL layer; every _background_ pipeline (embedding, entity-extraction, maintenance) refuses to feed vaulted plaintext to a third party.
Combined with the v4.2.3/v4.2.4 passes, there is now no known session-level read path that surfaces vaulted plaintext or vault-derived shape to a caller who is not holding a valid `X-Vault-Unlock-Token`. The canonical gates are `requireVaultUnlock`, `getVaultedSegmentIds`, and `isSegmentVaulted` — grep for any of the three to find every enforcement point.

---
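The read-time `topTags` recomputation described for v4.2.5 can be mirrored in memory. This is a minimal sketch rather than the SQL the release ships: `NoteRow` and `topTagsExcludingVault` are hypothetical names, and the visibility check copies the `segment_id IS NULL OR segment_id NOT IN (…)` predicate quoted in the notes.

```typescript
type NoteRow = { segmentId: string | null; tags: string[] };

// Counts tag usage over non-vaulted notes only, most-used first.
function topTagsExcludingVault(
  notes: NoteRow[],
  vaultedSegmentIds: Set<string>,
  limit = 10,
): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const note of notes) {
    // Unfiled notes (null segment) are visible; vaulted segments are not.
    const visible = note.segmentId === null || !vaultedSegmentIds.has(note.segmentId);
    if (!visible) continue;
    for (const tag of note.tags) {
      counts.set(tag, (counts.get(tag) ?? 0) + 1);
    }
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit);
}

const rows: NoteRow[] = [
  { segmentId: "s1", tags: ["ai", "health"] },
  { segmentId: "vault", tags: ["health", "secret"] },
  { segmentId: null, tags: ["ai"] },
];
console.log(topTagsExcludingVault(rows, new Set(["vault"])));
// [["ai", 2], ["health", 1]]
```

The `secret` tag never appears because its only note is vaulted, which is the behaviour a pre-aggregated `usage_count` column cannot provide.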

\[v4.2.4] — 2026-04-15 — "Vault Read-Path Seal II"

Second-pass audit closing every remaining plaintext leak of vaulted notes discovered after `v4.2.3`. Where `v4.2.3` sealed the primary read paths (list, search, chat retrieval, graph, briefing, related, backlinks, card gen, export, bulk move, public share), this release finds the **last mile** — historical views, AI summaries, MCP-tool reads, email digests, team search, review queue, review cards, suggest-segment, version history, and share-answer links — and applies the same two-layer defence: scoped reads require the passphrase, aggregated/LLM-facing reads drop vaulted rows at the SQL layer.

- **Version history gate** — `GET` and `POST /api/notes/[id]/versions` now call `requireVaultUnlock` when the parent segment is vaulted. Version snapshots contain full plaintext history; without this gate a stolen session could reconstruct a vaulted note by walking backward.
- **Suggest-segment gate + target scrub** — `GET /api/notes/[id]/suggest-segment` requires the token when the source note is vaulted (the endpoint embeds the note content, so leaking it into the embedding pipeline is itself a leak), and filters vaulted segments out of both the SQL fallback and the pgvector similarity results so the UI never recommends moving a note into a vault the user isn't currently unlocked against.
- **Segment AI-summary refusal** — `GET /api/segments/[id]/summary` returns `{ summary: null, vaulted: true }` (HTTP 400) instead of piping vaulted plaintext to Gemini. Summaries are rendered in the segment header for everyone with session access; unlocking the vault is a hard prerequisite for any LLM-generated artifact over its contents.
- **MCP tools sealed** — `lib/services/mcp/tools.ts` now refuses every tool that could leak vaulted plaintext.
  MCP bearer tokens cannot verify a passphrase unlock, so vaulted content is **completely invisible** to external LLM agents:
  - `search` / `search_segment` — exclude vaulted segments at the `hybridSearch` layer; scoped searches targeting a vaulted segment throw `McpToolError(-32004, "vault_locked")`.
  - `read_note` — 403s when the source note's segment is vaulted.
  - `find_notes` — aggregates drop vaulted segments; scoped queries refuse with `-32004` when `segmentId` targets a vault.
  - `cite` — the raw SQL selects `s.vault_enabled` and refuses on a hit so a citation handle never returns vaulted passages.
- **Team "Who Knows What?" entity filter** — previously the team search used `entities.mention_count` (pre-aggregated from all notes, including vaulted ones), which could leak entity names originating from vaulted notes to other team members. Rewritten to join `entity_mentions → notes → segments` at read-time and drop mentions from vaulted segments before counting.
- **Weekly email digest filter** — `loadDigestData` now drops vaulted notes from the recent-notes window, filters the `connections` raw SQL, and excludes citations from vaulted segments in the "most-cited" CTE. Digests are sent through a third-party SMTP provider and cannot be gated behind a passphrase, so the only safe design is to omit vaulted content from them entirely.
- **Library "Recent" & "Resurfaced"** — `getRecentNotes` and `getResurfacedNotes` (the dashboard + library home widgets) now `LEFT JOIN segments` and drop rows where `vault_enabled = true`. Cache keys bumped to `v3` / `v2` respectively so old cached results are invalidated on deploy.
- **Bulk tag + delete gate** — `PATCH /api/notes/bulk` `tag` and `delete` actions now call a new `ensureVaultedIdsAllowed` helper: single-vault batches require the token; cross-vault batches are refused with HTTP 409 `code: "vault_mixed_batch"` so the client can split them. Matches the existing `move` gate from v4.2.3.
- **Review queue filter** — `GET /api/review-cards` now `INNER JOIN notes` and filters out cards whose source note's segment is vaulted. Cards generated **before** a vault was enabled no longer surface in the review queue once the vault flips on.
- **Share answer scrub + public refusal** — `POST /api/share/answer` strips citations and `publicExcerpts` whose notes are vaulted, and if **every** citation is vaulted it refuses with HTTP 403 `code: "vault_refuses_share_answer"`. `GET /a/[token]` 404s defensively if any cited note became vaulted after the share was created, mirroring the `/s/[token]` behaviour from v4.2.3.
- **Pre-existing fixes rolled in** — `lib/services/audio/audio-overview.ts`, `app/api/audio-overviews/[id]/route.ts`, and `app/api/segments/[id]/audio-overview/route.ts` had type-check failures that were masking other errors: fixed `reportError` calls to use `extra: { segmentId }` per `ErrorContext`, fixed `readJsonWithLimit` destructuring to handle the `NextResponse` return branch, and fixed the `Buffer → BodyInit` boundary by wrapping in `Uint8Array` before passing to `NextResponse`.
- **Tests & gates** — added `tests/vault-seal-pass3.test.ts` (`getVaultedSegmentIds`/`isSegmentVaulted` cross-user isolation, share-answer citation stripping). Typecheck ✔, lint ✔ (0 errors / 0 warnings), 693 unit tests pass (+17 new), Next.js production build ✔.

After this pass, every server-side surface that ever reads note plaintext — rendering, searching, summarising, embedding, citing, emailing, reviewing, versioning, sharing, exporting, MCP — either requires `X-Vault-Unlock-Token` or omits vaulted rows at the SQL layer. Grep `getVaultedSegmentIds` / `isSegmentVaulted` / `requireVaultUnlock` for the canonical list of gates.

---
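
The bulk tag/delete gate described in the v4.2.4 notes above can be sketched roughly as follows. This is a minimal illustration, not the shipped helper: the signature of `ensureVaultedIdsAllowed`, the `NoteRef` and `GateResult` types, the `unlockedSegmentId` parameter, and the 403 / `"vault_locked"` response for a missing token are all assumptions. Only the 409 `vault_mixed_batch` refusal comes from the notes. The sketch also assumes that a batch mixing one vault with non-vaulted notes counts as a single-vault batch.

```typescript
// Hypothetical shapes: the real helper's signature is not shown in the notes.
type NoteRef = { id: string; segmentId: string | null };

type GateResult =
  | { ok: true }
  | { ok: false; status: number; code: string };

function ensureVaultedIdsAllowed(
  batch: NoteRef[],
  vaultedSegmentIds: string[],
  unlockedSegmentId: string | null, // segment the presented token unlocks, if any (assumed)
): GateResult {
  // Collect the distinct vaulted segments that the batch touches.
  const touched: string[] = [];
  for (const note of batch) {
    const seg = note.segmentId;
    if (seg !== null && vaultedSegmentIds.indexOf(seg) !== -1 && touched.indexOf(seg) === -1) {
      touched.push(seg);
    }
  }

  // No vaulted notes in the batch: nothing to gate.
  if (touched.length === 0) return { ok: true };

  // Cross-vault batch: refuse so the client can split it per vault.
  if (touched.length > 1) return { ok: false, status: 409, code: "vault_mixed_batch" };

  // Single-vault batch: allowed only with a token for that vault.
  if (touched[0] === unlockedSegmentId) return { ok: true };
  return { ok: false, status: 403, code: "vault_locked" }; // assumed status and code
}
```

Returning a result object instead of throwing keeps the sketch easy to test; a route handler would map a failed result onto the HTTP response.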

\[v4.2.3] — 2026-04-15 — "Vault Read-Path Seal"

Follow-up pass closing the vault **read-path**. `v4.2.2` gated writes (`POST` / `PATCH` / `DELETE /api/notes`), but many aggregated and scoped reads still surfaced plaintext from vaulted segments without asking for the passphrase. This release makes the vault consistent across every surface that renders note content, passages, or metadata, and folds in three rate-limit gaps and one N+1 hot spot found along the way.

- **Central helpers** — new `lib/data/vault.ts` exports `getVaultedSegmentIds(userId)` and `isSegmentVaulted(userId, segmentId)`. Every aggregated read imports the first; every scoped read imports the second. Grep for these to find every gate.
- **Aggregated reads now exclude vaulted segments**
  - `GET /api/notes` — filters `WHERE segment_id NOT IN (vaulted) OR segment_id IS NULL`.
  - `POST /api/search` — both `quickSearchNotes` and `hybridSearch` now accept `excludeSegmentIds`; the generic endpoint refuses a scoped query targeting a vaulted segment.
  - `POST /api/chat` — vaulted ids flow through `hybridSearch` and `graduatedSearch` (every stage: fast, HyDE, expanded, deep, fallback). Asking chat to ground on a vaulted segment without a token returns an empty context instead of leaking plaintext into the model prompt.
  - `GET /api/brain/briefing` — widens the recent-notes window, then drops vaulted entries before summarising; never caches plaintext.
  - `GET /api/brain/graph` — excludes vaulted nodes and edges; hidden titles / wikilinks no longer appear in the graph view.
  - `GET /api/export` `scope=brain` and `scope=all` — filter vaulted notes, surface a one-line banner in the README and `metadata.json` so the user knows what was skipped and how to export it explicitly.
- **Scoped reads now require the vault-unlock token**
  - `GET /api/notes/[id]/related` — parent vault gate, results also scrubbed of other vaulted-segment candidates.
  - `GET /api/notes/[id]/backlinks` — parent vault gate + backlink source scrub (a vaulted note's title never leaks into a non-vaulted backlink listing).
  - `POST /api/notes/[id]/generate-cards` — refuses without the token; flashcards would otherwise be written to `review_cards` in plaintext.
  - `GET /api/export?scope=segment` (and `scope=note` / `bibtex` when scoped) — token required when the target segment is vaulted.
  - `PATCH /api/notes/bulk` `move` — target-segment gate mirrors `POST /api/notes` so bulk-move-into-vault can't bypass the passphrase.
- **Public share refusal** — `POST /api/notes/[id]/share` 403s with `code: "vault_refuses_share"` when the parent segment is vaulted (a public link outlives the session). `GET /s/[token]` defensively 404s if the parent segment is vaulted at render time, even for pre-v4.2.3 share links that already exist.
- **Rate-limit polish** — `GET /api/teams`, `GET /api/teams/[id]/invite`, and `GET /api/account/age` now call `checkRateLimit(userId, "read")`.
- **Ephemeral-chat perf** — `shredAllForUser` was making 2N round-trips (two per thread). Rewritten to a single transaction with `inArray`-batched deletes — three round-trips total regardless of thread count. `shredThread` single-thread API unchanged.
- **Tests & gates** — added `tests/vault-read-filter.test.ts` (helper coverage + type-level sanity check on the new `excludeSegmentIds` plumbing). Typecheck ✔, lint ✔ (no new errors), 676 unit tests pass (+11 new), Next.js production build ✔ (128 routes).
- **Deliberately out of scope** — actual E2E encryption of `segments.encrypted_payload` (multi-day feature, scheduled for v4.3), renaming the duplicate `0037_*` migration tags (would break every existing install), and offline maintenance/dream-cycle loops (not on the request path).
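
The round-trip arithmetic behind the `shredAllForUser` rewrite can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the shipped code: `FakeDb`, its methods, `shredPerThread`, `shredBatched`, and `seed` are hypothetical stand-ins, the split of the three batched round-trips (one select, two deletes) is a guess, and the real implementation's transaction and `inArray` calls are not modelled.

```typescript
// In-memory stand-in for a database that counts round-trips (hypothetical).
class FakeDb {
  roundTrips = 0;
  threads: { id: string; userId: string }[] = [];
  messages: { id: string; threadId: string }[] = [];

  selectThreadIds(userId: string): string[] {
    this.roundTrips++;
    return this.threads.filter((t) => t.userId === userId).map((t) => t.id);
  }

  // Models a delete with an IN (...) predicate over thread ids.
  deleteMessagesIn(threadIds: string[]): void {
    this.roundTrips++;
    this.messages = this.messages.filter((m) => threadIds.indexOf(m.threadId) === -1);
  }

  deleteThreadsIn(threadIds: string[]): void {
    this.roundTrips++;
    this.threads = this.threads.filter((t) => threadIds.indexOf(t.id) === -1);
  }
}

// Old shape (assumed): two deletes per thread, so 2N round-trips after the listing.
function shredPerThread(db: FakeDb, userId: string): void {
  for (const id of db.selectThreadIds(userId)) {
    db.deleteMessagesIn([id]);
    db.deleteThreadsIn([id]);
  }
}

// New shape (assumed): one listing plus two batched deletes, three round-trips total.
function shredBatched(db: FakeDb, userId: string): void {
  const ids = db.selectThreadIds(userId);
  db.deleteMessagesIn(ids);
  db.deleteThreadsIn(ids);
}

// Builds n threads (two messages each) for user "u1" and one thread for "u2".
function seed(n: number): FakeDb {
  const db = new FakeDb();
  for (let i = 0; i < n; i++) {
    db.threads.push({ id: "t" + i, userId: "u1" });
    db.messages.push({ id: "m" + i + "a", threadId: "t" + i });
    db.messages.push({ id: "m" + i + "b", threadId: "t" + i });
  }
  db.threads.push({ id: "other", userId: "u2" });
  db.messages.push({ id: "mo", threadId: "other" });
  return db;
}
```

In this model the per-thread version costs `1 + 2N` calls and the batched version always costs three, while both leave other users' threads untouched.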
Security posture after this pass: every hot-path read of note plaintext (list, search, chat retrieval, graph, briefing, related, backlinks, card gen, export, public share, bulk move) is either explicitly gated behind the vault-unlock token or drops vaulted-segment notes at the SQL layer. Encrypting the underlying plaintext in `notes.title/summary/content` is the next v4.3 work item; until then, this release reduces the blast radius of a stolen session to zero for vaulted content on these read paths.

---
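
The split between aggregated and scoped reads in the v4.2.3 notes above can be sketched with the two helper names from `lib/data/vault.ts`. Everything else here is an assumption made for illustration: the in-memory `segments` and `notes` arrays, the synchronous helper bodies (the real ones presumably query the database), the `listNotes` and `readSegmentNotes` functions, the boolean standing in for token verification, and the 403 status.

```typescript
// Hypothetical in-memory data; the real helpers presumably query the database.
type Segment = { id: string; userId: string; vaultEnabled: boolean };
type Note = { id: string; userId: string; segmentId: string | null; title: string };

const segments: Segment[] = [
  { id: "seg-open", userId: "u1", vaultEnabled: false },
  { id: "seg-vault", userId: "u1", vaultEnabled: true },
  { id: "seg-other", userId: "u2", vaultEnabled: true },
];

const notes: Note[] = [
  { id: "n1", userId: "u1", segmentId: "seg-open", title: "Reading list" },
  { id: "n2", userId: "u1", segmentId: "seg-vault", title: "Private journal" },
  { id: "n3", userId: "u1", segmentId: null, title: "Inbox scrap" },
];

// Helper names come from the notes; these bodies are illustrative only.
function getVaultedSegmentIds(userId: string): string[] {
  return segments.filter((s) => s.userId === userId && s.vaultEnabled).map((s) => s.id);
}

function isSegmentVaulted(userId: string, segmentId: string): boolean {
  return getVaultedSegmentIds(userId).indexOf(segmentId) !== -1;
}

// Aggregated read: never asks for a token, drops vaulted rows instead.
// Notes without a segment are kept, mirroring the "OR IS NULL" filter.
function listNotes(userId: string): Note[] {
  const vaulted = getVaultedSegmentIds(userId);
  return notes.filter(
    (n) => n.userId === userId && (n.segmentId === null || vaulted.indexOf(n.segmentId) === -1),
  );
}

// Scoped read: a vaulted target requires a verified unlock token.
// Token verification is reduced to a boolean for this sketch.
function readSegmentNotes(
  userId: string,
  segmentId: string,
  hasValidUnlockToken: boolean,
): { status: number; notes: Note[] } {
  if (isSegmentVaulted(userId, segmentId) && !hasValidUnlockToken) {
    return { status: 403, notes: [] }; // assumed status code
  }
  return {
    status: 200,
    notes: notes.filter((n) => n.userId === userId && n.segmentId === segmentId),
  };
}
```

Keeping the vault lookup keyed by `userId` means another user's vaulted segment never influences the caller's results, which is the cross-user isolation the later test suite checks.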