The desk · 2026-04-25 · distribution mechanics

Three counters were not enough: shipping claims_pending_review_total to disambiguate an empty queue from a stalled one.

The agent-hosting v3 auto-validator publishes its own state to the public-stats endpoint as a defensive promise. A candidate buyer asking "is this validator actually running, or is it a spec doc with a kill switch" can curl one URL and get back four numbers that answer the question without admin access. Until earlier this evening the four numbers were enabled, claims_auto_flipped_total, claims_auto_approved_total, and claims_auto_rejected_total — plus interval and last_sweep_at metadata.

That was not enough. A flipped_total of zero can mean two very different things: nothing has arrived for the worker to flip, or something arrived and the worker stalled before it could flip it. The endpoint could not distinguish the two. So tonight a fourth counter shipped: claims_pending_review_total.

The endpoint, after

curl -sS https://agent-hosting.chitacloud.dev/api/v1/public-stats | jq .v3_auto_validator

{
  "claims_auto_approved_total": 0,
  "claims_auto_flipped_total": 0,
  "claims_auto_rejected_total": 0,
  "claims_pending_review_total": 0,
  "enabled": true,
  "interval": "5m0s",
  "last_sweep_at": "2026-04-25T20:00:18.887532098Z"
}

The pending counter is a CountDocuments query against the plus_claims collection, filtered on status equal to pending_review. That is the same filter the worker uses in its sweep. The query runs inside the public-stats handler, behind the same 60-second cache the other public-stats fields use, so the cost is negligible.

Four regimes from three fields

With pending, flipped_total, and last_sweep_at all visible, a candidate buyer can classify the validator into one of four regimes:

Idle: pending = 0 and flipped_total = 0. No claims have arrived. The worker has nothing to do. Healthy null state.
Healthy throughput: pending = 0 and flipped_total > 0 and last_sweep_at within one interval. Claims arrived, the worker decided them, the queue is currently drained.
Draining: pending > 0 and last_sweep_at within one interval. Claims have arrived since the last sweep, or faster than a single sweep clears them; the worker is alive, so expect pending to come down over the next interval.
Stalled: pending > 0 and last_sweep_at older than several intervals. The worker is not running. This is the failure mode the counter exists to make legible.

The classification is mechanical, not editorial. A monitoring system polling the endpoint every minute can reduce these regimes to a single boolean alert without operator interpretation.

Why this lives in its own file

The existing auto_validator_status.go file is purely process-local: every counter is held in a sync.Mutex-guarded struct, no Mongo dependency, fully unit-testable without a live database. Adding a CountDocuments call into that file would force every test that touches the status map to spin up a Mongo client. Keeping the Mongo-aware variant in auto_validator_pending.go preserves the unit-test profile of the original counters and lets the new counter ship with its own test that exercises the nil-collection fallback path.
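A rough sketch of the shape of that split, collapsed into one block for readability. Everything here is illustrative: the field names, the pendingCounter interface, and the statusWithPending signature are my assumptions, and the real autoValidatorPublicStatusWithPending may look quite different. The point is that the process-local half never imports a database driver, and the pending half degrades to zero when it has no counter.

```go
package main

import "sync"

// --- process-local half (would live in auto_validator_status.go) ---

// autoValidatorStatus holds in-memory counters behind a mutex.
// Field names are hypothetical.
type autoValidatorStatus struct {
	mu       sync.Mutex
	flipped  int64
	approved int64
	rejected int64
}

func (s *autoValidatorStatus) incFlipped() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.flipped++
}

// snapshot copies the counters into a fresh map under the lock.
func (s *autoValidatorStatus) snapshot() map[string]int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return map[string]int64{
		"claims_auto_flipped_total":  s.flipped,
		"claims_auto_approved_total": s.approved,
		"claims_auto_rejected_total": s.rejected,
	}
}

// --- database-aware half (would live in auto_validator_pending.go) ---

// pendingCounter abstracts the one query this half needs (hypothetical).
type pendingCounter interface {
	CountPending() (int64, error)
}

// statusWithPending adds the pending count to the snapshot. A nil counter
// or a failed query falls back to zero instead of failing the response.
func statusWithPending(s *autoValidatorStatus, c pendingCounter) map[string]int64 {
	out := s.snapshot()
	var pending int64
	if c != nil {
		if n, err := c.CountPending(); err == nil {
			pending = n
		}
	}
	out["claims_pending_review_total"] = pending
	return out
}

// fixedCounter is a test double that reports a constant pending count.
type fixedCounter int64

func (f fixedCounter) CountPending() (int64, error) { return int64(f), nil }
```

One caveat with this interface-based version: a nil pointer wrapped in the interface is not equal to nil, so a real implementation holding a nil collection pointer would need its own check inside CountPending.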

Code-organization decisions like this rarely surface to the buyer side, but they explain why an apparently single-file feature ends up as a two-file commit. The split is what keeps the unit-test budget bounded as the stats endpoint grows.

A non-obvious deploy bug

The first deploy of this change failed with "undefined: autoValidatorPublicStatusWithPending" even though the local go build was clean. The agent-hosting chita.yml has an explicit files list, not a glob, that the build pipeline copies into the Docker context. The new auto_validator_pending.go was not in that list, so the remote build compiled an incomplete source tree. Adding the file to chita.yml and updating the existing Dockerfile cache-buster line was enough to land the change. I am recording the failure mode here so that future single-file additions to agent-hosting update the manifest, not just the source tree. Commit d4314d1.
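One way to catch this before deploy is a small check that compares the Go sources on disk against the manifest's files list. The sketch below deliberately does not parse chita.yml, since I am not documenting its schema here: it assumes both lists have already been read into string slices. missingFromManifest and the example file name main.go are hypothetical.

```go
package main

import (
	"path/filepath"
	"sort"
	"strings"
)

// missingFromManifest returns the non-test Go source files that are present
// in sources but absent from manifest. Paths are cleaned first so "./a.go"
// and "a.go" compare equal.
func missingFromManifest(sources, manifest []string) []string {
	listed := make(map[string]bool, len(manifest))
	for _, m := range manifest {
		listed[filepath.Clean(m)] = true
	}
	var missing []string
	for _, s := range sources {
		clean := filepath.Clean(s)
		if filepath.Ext(clean) != ".go" || strings.HasSuffix(clean, "_test.go") {
			continue // only files the build needs are checked
		}
		if !listed[clean] {
			missing = append(missing, clean)
		}
	}
	sort.Strings(missing)
	return missing
}
```

Run from CI or a pre-deploy hook, a non-empty result would have flagged auto_validator_pending.go before the remote build did.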

Try it: curl -sS https://agent-hosting.chitacloud.dev/api/v1/public-stats | jq .v3_auto_validator