Project Overview: devrel-eval-worker


I decided to lock devrel-eval-worker at the v0 ingestion milestone: ingestion is functionally complete, scope is intentionally frozen, and the next work should build on deterministic storage behavior rather than expanding API surface area.

What We Built

  • A Cloudflare Workers API focused on session-based ingestion, deterministic chunk canonicalization and hashing, idempotent deduplication, and deferred evaluation persistence.
  • A TypeScript/Workers codebase with clear domain boundaries across ingestion, canonicalization, hashing, rubric/scoring, and DB access (src/ modules called out in the inventory).
  • Operational scripts for local development, deploys, D1 migration management, and user-centric review ingestion (npm run review:user -- <github-username>).
  • Baseline CI that runs npm ci and npm test on pushes to main and on pull requests, keeping the ingestion foundation verifiable before broader evaluation logic changes.
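
To make "deterministic chunk canonicalization and hashing" concrete, here is a minimal sketch. It is not the project's actual code: the function names (canonicalizeChunk, chunkHash), the normalization rules (NFC, LF line endings, trailing whitespace stripped), and the hash are all assumptions chosen for illustration. In particular, FNV-1a is a small non-cryptographic stand-in used so the example has no dependencies; a real worker would more likely use a cryptographic digest such as SHA-256.

```typescript
// Hypothetical canonicalization rules (assumed, not taken from the repo):
// Unicode NFC, LF line endings, no trailing spaces/tabs, no outer whitespace.
function canonicalizeChunk(raw: string): string {
  return raw
    .normalize("NFC")
    .replace(/\r\n?/g, "\n")
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, ""))
    .join("\n")
    .trim();
}

// FNV-1a (32-bit) over UTF-16 code units, returned as 8 hex characters.
// Stand-in for whatever digest the worker actually uses.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  const hex = (hash >>> 0).toString(16);
  return ("00000000" + hex).slice(-8);
}

// The hash is always computed over the canonical form, so inputs that differ
// only in line endings or trailing whitespace map to the same value.
function chunkHash(raw: string): string {
  return fnv1a(canonicalizeChunk(raw));
}
```

Under these assumptions, chunkHash("hello  \r\nworld\r\n") and chunkHash("hello\nworld") return the same string, which is the property deduplication relies on.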

Why We Built It

  • I optimized for determinism first: if ingestion and dedupe are stable and repeatable, downstream evaluation can be improved safely without reworking storage semantics.
  • I kept infrastructure intentionally light (Workers + D1 + scripts) so agents and humans can run the same workflow quickly and with minimal operational overhead.
  • I treated this milestone as a handoff point: README guidance, migration order, and agent operations docs establish a reliable path for follow-on work instead of reopening core ingestion decisions.
  • The latest session supports this direction: its primary theme was building the worker end to end around the ingestion and evaluation flow, and no newer commits or other signals indicate a scope change.

How It Works

  • The service runs as a Workers API (wrangler dev / wrangler deploy) with D1-backed state and explicit migration commands that are validated locally first and then applied remotely.
  • Ingestion is session-oriented and dedupe-aware: content is normalized/canonicalized, hashed deterministically, and stored idempotently so retries do not create divergent records.
  • Evaluation is deferred, not inlined into ingestion writes, which keeps ingestion predictable and lets scoring/rubric logic evolve with less risk to ingestion correctness.
  • Team usage is codified in docs: docs/agent-operations.md for queued/automated agent flow and docs/v1-handoff.md for the next implementation phase.
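
The idempotent-write and deferred-evaluation behavior described above can be sketched with an in-memory store. This is an illustration only: the class and method names, the record shape, and the choice of (sessionId, hash) as the dedupe key are assumptions, and the Map stands in for D1. In a SQLite-backed store, a common equivalent is a UNIQUE constraint on the key combined with INSERT ... ON CONFLICT DO NOTHING.

```typescript
// Hypothetical record shape; score stays null until evaluation runs.
type ChunkRecord = {
  sessionId: string;
  hash: string;
  content: string;
  score: number | null;
};

class InMemoryChunkStore {
  private records = new Map<string, ChunkRecord>();
  private pending: string[] = [];

  // Idempotent ingest: a retry with the same (sessionId, hash) is a no-op,
  // so repeated requests cannot create divergent records.
  ingest(sessionId: string, hash: string, content: string): { inserted: boolean } {
    const key = `${sessionId}:${hash}`;
    if (this.records.has(key)) {
      return { inserted: false };
    }
    this.records.set(key, { sessionId, hash, content, score: null });
    this.pending.push(key); // evaluation is queued, not run inline
    return { inserted: true };
  }

  // Deferred evaluation: scores everything queued so far and returns
  // how many records were evaluated in this pass.
  evaluatePending(score: (record: ChunkRecord) => number): number {
    let evaluated = 0;
    for (const key of this.pending.splice(0)) {
      const record = this.records.get(key);
      if (record === undefined) continue;
      record.score = score(record);
      evaluated++;
    }
    return evaluated;
  }

  get(sessionId: string, hash: string): ChunkRecord | undefined {
    return this.records.get(`${sessionId}:${hash}`);
  }

  get size(): number {
    return this.records.size;
  }
}
```

Because ingest never calls the scoring function, the scoring logic can change between evaluation passes without touching what was stored at ingestion time.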