
Claude/admin console #3

Merged
WW-Andene merged 2 commits into main from claude/admin-console
Apr 24, 2026

Conversation

@WW-Andene
Owner

No description provided.

claude added 2 commits April 24, 2026 20:30
Addresses the gap where post-deploy validation required a terminal +
curl + gh CLI. Now everything lives on /admin as real buttons and
live data.

New /admin page (app/admin/page.js):
  - Deploy health panel — /healthz reachability + /api/stats.readiness
    flags rendered as green/red dots: upstash redis, upstash vector,
    groq, cerebras, gemini, fireworks, cron secret.
  - Admin token input (cached in localStorage per-device) for action
    authorization when ADMIN_TOKEN is configured server-side.
  - Action buttons with live output:
      * Run bench (6 scenarios, in-process, ~30-60s)
      * Seed blind-eval from live bench
      * Recompute dead-block skip list
      * Run dialectical audit (reuses /api/dialectical?run=1)
  - Live telemetry snapshot: prompt size, phase timings (waveA/B p95),
    gauntlet pass rate, graph nodes/edges, relational-time multiplier.
  - Blind A/B snapshot: votes, win rate, 95% CI, actually-better flag.
  - APK build card: link to GitHub Actions + manifest/icon preview
    links. APK build remains GitHub-hosted (not runnable in a Vercel
    function).
  - Nav pills to /stats, /retro, /dev, /blind-eval, /memory, /prefs,
    /meet, /, and GitHub Actions.
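
The blind A/B snapshot above lists a win rate, a 95% CI, and an actually-better flag, but the commit message does not say how they are computed. A minimal sketch, assuming a Wilson score interval over vote counts and assuming the flag is set only when the whole interval clears 50%; the function name and return shape are hypothetical:

```javascript
// Hypothetical helper, not taken from the repo.
// `wins` and `total` are assumed vote counts with ties already excluded.
function winRateWithCI(wins, total, z = 1.96) {
  if (total === 0) {
    return { rate: null, low: null, high: null, actuallyBetter: false };
  }
  const p = wins / total;
  const z2 = z * z;
  const denom = 1 + z2 / total;
  // Wilson score interval: shrinks the estimate toward 0.5 for small samples.
  const center = (p + z2 / (2 * total)) / denom;
  const margin =
    (z * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total))) / denom;
  const low = center - margin;
  const high = center + margin;
  // Assumed rule: flag only when the lower bound is above an even split.
  return { rate: p, low, high, actuallyBetter: low > 0.5 };
}
```

With this rule, 70 wins out of 100 votes sets the flag, while 55 out of 100 does not, because its lower bound falls below 0.5.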

New admin API endpoints (token-gated if ADMIN_TOKEN is set,
open in dev mode when not):
  - POST /api/admin/bench — runs 6 privacy-mode scenarios against
    self, measures TTFB / bridge latency / prose latency / sidecar
    coverage, returns summary + per-scenario detail.
  - POST /api/admin/recompute-skiplist — forces Step TT's skip-set
    recomputation.
  - POST /api/admin/seed-blind — runs the bench, then submits each
    (gabriella-reply, baseline-stub) pair to /api/blind-eval so the
    voting surface reflects current production output.
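
The gating rule for these endpoints (token required when ADMIN_TOKEN is set, open otherwise) could look roughly like the sketch below. It assumes the two auth modes mentioned in the review on this PR, `Authorization: Bearer ...` and `x-admin-token`; the helper name and signature are hypothetical and not the repo's actual code:

```javascript
// Hypothetical helper; illustrates the gating rule only.
// `headers` is a standard Headers object, `adminToken` the configured secret.
function isAuthorized(headers, adminToken) {
  // No ADMIN_TOKEN configured: endpoints stay open (dev mode).
  if (!adminToken) return true;
  const auth = headers.get("authorization") || "";
  const bearer = auth.startsWith("Bearer ") ? auth.slice(7).trim() : "";
  const direct = (headers.get("x-admin-token") || "").trim();
  // Either supported header may carry the token.
  return bearer === adminToken || direct === adminToken;
}
```

A route handler would call something like `isAuthorized(request.headers, process.env.ADMIN_TOKEN)` and return a 401 when it yields `false`. A production version would likely prefer a constant-time comparison.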

Main chat header (/) now has an 'admin' link between 'stats' and
'about'.

The full 20-scenario bench remains at scripts/integration-bench.js
(npm run bench-remote) for offline runs, since it is too long to
fit in a 60s Vercel function.

Reverts the standalone /admin page from the previous commit: /dev was
already the canonical ops surface (bootstrap training, datasets,
fine-tune jobs, debug logs, token-gated). Shipping /admin alongside
created two admin consoles.

- Deleted app/admin/page.js and the 'admin' link in the main chat
  header.
- Kept the three /api/admin/* action endpoints (bench, seed-blind,
  recompute-skiplist) — they're real endpoints with no UI coupling.
- Extended /dev with a third tab 'Deploy & validate' that renders
  the same cards the deleted /admin had: deploy health dots, run
  bench, seed blind-eval, recompute skip list, run dialectical
  audit, blind A/B snapshot, APK build link, live telemetry.
- DeployTab component inlined at the bottom of app/dev/page.js,
  reusing /dev's css + token conventions. Tab switch fetches
  /api/stats on mount.

One canonical ops entrypoint again.

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2df887b5e6


method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": tokenHeader,

P2: Forward x-admin-token when chaining to bench

POST /api/admin/seed-blind allows auth via either Authorization: Bearer ... or x-admin-token, but the internal call to /api/admin/bench only forwards the Authorization header. If a client uses the supported x-admin-token path, tokenHeader is empty and the bench request 401s, causing seed-blind to fail with bench failed. Forwarding both auth headers (or sharing the same auth context) avoids breaking one advertised auth mode.
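
One way to apply this suggestion is to copy whichever auth headers arrived onto the internal bench request. This is a sketch only; the helper name is hypothetical and the actual route code is not shown in this PR:

```javascript
// Hypothetical helper: builds headers for the internal POST to
// /api/admin/bench from the headers of the incoming seed-blind request.
function buildBenchHeaders(incoming) {
  const out = new Headers({ "Content-Type": "application/json" });
  // Forward both supported auth modes, skipping any that are absent,
  // so neither arrives as an empty value.
  for (const name of ["authorization", "x-admin-token"]) {
    const value = incoming.get(name);
    if (value) out.set(name, value);
  }
  return out;
}
```

The result can be passed directly as `headers` in the `fetch` options, replacing the single `"Authorization": tokenHeader` entry.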


opener: lastUser(r),
category: r.category,
},
a: { source: "gabriella-live", text: r.replyPreview },

P2: Submit full bench reply to blind-eval pairs

This route submits r.replyPreview as Gabriella’s candidate text, but the bench producer truncates that field to 200 characters. For any longer response, blind-eval stores a clipped answer instead of the real model output, which can skew pairwise voting results and downstream win-rate stats. Use the full bench reply text (or add a non-truncated field) when building a.text.
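
A sketch of the second option, assuming the bench producer is extended with a non-truncated field. The field name `reply` and the helper name are hypothetical; only `replyPreview` appears in the diff:

```javascript
// Hypothetical helper: picks the candidate text for a blind-eval pair.
// `r.reply` is an assumed full-length field; `r.replyPreview` is the
// existing 200-character preview and is used only as a fallback.
function toBlindEvalCandidate(r) {
  const hasFull = typeof r.reply === "string" && r.reply.length > 0;
  return { source: "gabriella-live", text: hasFull ? r.reply : r.replyPreview };
}
```

Falling back to the preview keeps older bench results usable, at the cost of still submitting clipped text for them.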


@WW-Andene WW-Andene merged commit 4226c18 into main Apr 24, 2026
2 checks passed