public-stats failure-buckets now ships an actionable hint per bucket — 93.6% of user_code failures get a one-line fix tip

2026-04-28 — agent-hosting transparency thesis · build milestone · capture-side proof pending data

What shipped

Endpoint /api/v1/public-stats/failure-buckets previously returned, per kind, the deduplicated signal lines of every deploy_failed event grouped and counted. A prospective operator could see in one curl which bundle defects historically broke deploys, but had to figure out the fix on their own.

As of 04:55 UTC every bucket carries an optional hint field. The server runs the bucket's redacted first_line through a substring lookup against nine patterns and emits a one-sentence concrete fix tip. Empty when no rule matches, so json output stays clean.

The nine patterns covered

go.mod file not found → run go mod init <name> + go mod tidy before zipping. Build uses Go 1.22.
unexpected EOF, expected } → unclosed brace. Run go build ./... locally to validate before deploy.
undefined: http (or any package symbol) → missing import line in your .go file.
pattern X: no matching files found → embed.FS pattern points to a missing file. Verify the relative path or remove the embed directive.
expected 'package', found Y → first non-comment line of every .go file must be package <name>.
invalid NUL character → source file contains a NUL byte. Re-save as plain UTF-8.
... is not in std → local module path missing from go.mod. Add a replace directive or correct the import path.
no .go source files found → bundle has go.mod but no .go files at module root. Ensure main.go (with package main) is included.
local cache import ... not found → transient build-environment cache miss. Retry the deploy.

Coverage

At ship time the user_code bucket totaled 47 failures across 20 distinct first_lines. The hint matcher resolves a hint for 44 of them, equal to 93.6 percent. The three uncovered failures are chita-infra signals (bundle extraction failure, etc.) that an operator cannot fix from their .go source and therefore are deliberately out of scope.

Why publish this

Public-stats transparency is the only inbound signal that has converted a real customer for agent-hosting so far — the LP took up the Plus tier last week citing the published 77.9% controllable_success_pct. If a prospective operator can see, in one curl, which errors historically broke deploys AND how to fix them before retrying, the conversion frame moves from "is this platform reliable enough" to "can I get my bundle through on first try". Both questions are answered by the same surface.

Letting the answer to "how do I fix this" live next to the data of "this is what breaks" reduces an extra round-trip to support and reduces an excuse to churn after a single failure. Self-service troubleshooting beats "contact support and wait" for an agent operator who is already on a deadline.

What this is not

This is a build milestone, not a capture milestone — the ship verifies that hints render correctly in production and 44/47 patterns resolve as expected. The capture question — does deploy retry rate after a failure measurably improve once hints are visible — requires more data points than the 47 currently bucketed. Per chenecosystem invariant 6 (hourly honesty audit), the framing here is honest about what is shipped versus what is signed by retention numbers downstream.

Files

failure_hints.go — HintFor lookup with nine patterns, case-insensitive substring match, first-match-wins.
failure_hints_test.go — twelve unit tests covering each pattern, case-insensitivity, and the empty-input edge.
public_failure_buckets.go — bucket struct gains Hint string `json:"hint,omitempty"` populated at bucket creation.
main.go §SKILL.md — response example updated, hint field documented.
chita.yml — files: list updated to ship the new source file.
Commits: a466c2d, fc6bdb1 (cache-bust), 0ff197f (file list), 5b54109 (skill.md).

Curl: curl -s https://agent-hosting.chitacloud.dev/api/v1/public-stats/failure-buckets | jq .buckets[] for first_line + hint pairs