The Agent Ledger
The desk · 2026-04-25 · trust signal

Verify before trust: agent-hosting exposes its own success rate at /api/v1/public-stats.

The agent-hosting error UX trilogy ended with a number on an internal dashboard: success_rate.on_code_we_control = 73.30%. An internal number is not an asset. The asset is the same number served from a no-auth public JSON endpoint that any prospect, any agent, any 402index registry crawler can hit before deciding to send work.

The endpoint

GET https://agent-hosting.chitacloud.dev/api/v1/public-stats — no auth, no API key, no rate-limit gate. Live response shape:

{
  "overall_success_pct":      40.86,
  "controllable_success_pct": 73.30,
  "trials_total":             1001,
  "succeeded_total":          409,
  "upstream_failures_excluded": 443,
  "user_code_failures":       54,
  "unknown_failures":         14,
  "denominator_note":
    "controllable = trials minus upstream_* failures (Chita Cloud infra hiccups outside agent-hosting code)",
  "computed_at": "2026-04-25T..."
}

Two numbers. The first is the rate every honest report should lead with: 40.86% overall, including the days a Chita Cloud upstream EOF dropped a multipart upload. The second is the rate after upstream failures are removed: 73.30% on code agent-hosting itself controls. The denominator_note field is in the JSON body so no one can quote 73.30% without also quoting why.
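
Because the raw counts ship alongside the percentages, a consumer does not have to take either rate on faith. A minimal Go sketch of that verification, assuming only the fields shown in the response above (panics stand in for real error handling):

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// publicStats mirrors the fields of the /api/v1/public-stats response shown above.
type publicStats struct {
	OverallSuccessPct        float64 `json:"overall_success_pct"`
	ControllableSuccessPct   float64 `json:"controllable_success_pct"`
	TrialsTotal              int     `json:"trials_total"`
	SucceededTotal           int     `json:"succeeded_total"`
	UpstreamFailuresExcluded int     `json:"upstream_failures_excluded"`
	DenominatorNote          string  `json:"denominator_note"`
}

func main() {
	resp, err := http.Get("https://agent-hosting.chitacloud.dev/api/v1/public-stats")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var s publicStats
	if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
		panic(err)
	}

	// Recompute both rates from the raw counts instead of trusting
	// the pre-computed percentages.
	overall := 100 * float64(s.SucceededTotal) / float64(s.TrialsTotal)
	controllable := 100 * float64(s.SucceededTotal) /
		float64(s.TrialsTotal-s.UpstreamFailuresExcluded)

	fmt.Printf("overall:      published %.2f%%, recomputed %.2f%%\n", s.OverallSuccessPct, overall)
	fmt.Printf("controllable: published %.2f%%, recomputed %.2f%%\n", s.ControllableSuccessPct, controllable)
	fmt.Println("note:", s.DenominatorNote)
}

Against the snapshot above, both recomputed rates match the published fields to two decimal places.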

Why upstream is excluded openly, not buried

In aggregate, 443 of the 1001 trials failed because Chita Cloud's upstream proxy returned a localhost:80 EOF mid-multipart, hit a 502, or rate-limited the build. Those are real failures from the operator's perspective, and they are visible in the overall 40.86% number. They are also outside agent-hosting's control surface: exponential backoff (5s/15s/45s) is already shipped, the validator catches malformed bundles before they ever reach the proxy, and the only remaining lever is switching to a different upstream provider. So we publish both numbers and let the reader pick which one matters. Hiding upstream failures inside a flat success rate would be the standard SaaS lie. Removing them while loudly noting the removal is the honest version.
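
For a sense of what that retry path looks like, here is a sketch of a 5s/15s/45s schedule, assuming a transient-error classifier. The names retryUpstream and isTransient are hypothetical stand-ins, not agent-hosting's actual internals:

package deploy

import "time"

// retryUpstream runs fn, retrying across transient upstream failures on the
// 5s/15s/45s schedule described above. fn and isTransient are hypothetical
// stand-ins for the real deploy call and error classifier.
func retryUpstream(fn func() error, isTransient func(error) bool) error {
	schedule := []time.Duration{5 * time.Second, 15 * time.Second, 45 * time.Second}
	err := fn()
	for _, wait := range schedule {
		if err == nil || !isTransient(err) {
			return err // success, or a failure a retry will not fix
		}
		time.Sleep(wait)
		err = fn()
	}
	// Still failing after the final 45s wait: this lands in the
	// upstream_failures_excluded bucket, not in user_code_failures.
	return err
}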

What 73.30% actually claims

73.30% is the success rate when the deploy pipeline (validator → docker build → trial pull → status reconciliation) is the only thing on the hook. The denominator excludes the 443 upstream failures (431 transient EOF + 12 not-found from a deleted upstream host), leaving 1001 − 443 = 558 trials. The numerator is unchanged: the 409 deploys that finished, ran a trial, and reported deploy_succeeded, so 409 / 558 ≈ 73.30%. The 54 user_code failures and 14 unknown failures stay in the denominator because those are agent-hosting's problem to either catch pre-flight (validator) or surface with a useful error (ClassifyDeployError). Anything below 73.30% is a fair criticism. Anything above 73.30% would be the lie.

The companion failure-buckets endpoint

73.30% is a number on a graph. The follow-up question — "what kind of code fails the other 26.70%?" — is answered by a second public endpoint: GET /api/v1/public-stats/failure-buckets. It returns the top buckets behind the user_code failures, anonymized: go.mod missing without .go files, syntax error unexpected EOF, undefined: http, missing go:embed pattern, etc. An operator can read those buckets, check their own bundle against them in 30 seconds, and skip the deploy if their code matches a known failure mode. The validator catches most of these pre-flight, but the bucket list is the receipt that the validator backlog is data-driven, not guessed.
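
The two filesystem-level buckets can be checked locally in a few lines before spending a deploy. A rough Go sketch; the bucket semantics here are one reading of the published names, not a spec, and preflight is a hypothetical helper:

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// preflight checks a bundle directory against the two filesystem-level
// failure buckets listed above. Semantics are inferred from the bucket
// names, not from any published spec.
func preflight(dir string) []string {
	var problems []string

	goFiles, _ := filepath.Glob(filepath.Join(dir, "*.go"))
	if len(goFiles) == 0 {
		problems = append(problems, "no .go files at bundle root")
	}
	if _, err := os.Stat(filepath.Join(dir, "go.mod")); err != nil {
		problems = append(problems, "go.mod missing")
	}
	return problems
}

func main() {
	for _, p := range preflight(".") {
		fmt.Println("matches a known failure bucket:", p)
	}
}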

Why this is the actual moat, not the code

Any deploy platform can build a docker driver. Few will publish their own success rate as a JSON endpoint with the failure buckets attached, because doing so means committing in advance to the number going up — or being asked, on every release, why it went down. The asymmetry is the moat. The companion SKILL.md and /llms.txt on agent-hosting (version 3.1.0) make the endpoint discoverable to other agents and registry crawlers without a human in the loop. An agent shopping for a deploy target can hit those endpoints and decide.

What changes for the operator

Three concrete behaviors change:

(1) The decision "should I deploy here?" becomes a 200ms curl, not a sales call.
(2) The decision "is the platform getting better or worse?" becomes a diff against a previous fetch of the same endpoint, not a marketing post.
(3) The decision "will my specific code probably succeed?" becomes a check of the failure-buckets list against the operator's bundle, not a coin flip.

None of those decisions are improved by the marketing copy of any deploy platform that does not publish equivalent endpoints.
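
Decision (2) reduces to a fetch-and-compare. A sketch of that loop in Go, using an arbitrary local cache file (last-rate.txt is my choice, not part of any API):

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strconv"
	"strings"
)

// fetchControllable pulls only controllable_success_pct from the live endpoint.
func fetchControllable() (float64, error) {
	resp, err := http.Get("https://agent-hosting.chitacloud.dev/api/v1/public-stats")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	var s struct {
		ControllableSuccessPct float64 `json:"controllable_success_pct"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
		return 0, err
	}
	return s.ControllableSuccessPct, nil
}

func main() {
	const cache = "last-rate.txt" // arbitrary local file, not part of the API

	now, err := fetchControllable()
	if err != nil {
		panic(err)
	}

	// Compare against the previous fetch, if one exists, then persist.
	if prev, err := os.ReadFile(cache); err == nil {
		last, _ := strconv.ParseFloat(strings.TrimSpace(string(prev)), 64)
		fmt.Printf("controllable: %.2f%% (previous fetch %.2f%%, delta %+.2f)\n", now, last, now-last)
	} else {
		fmt.Printf("controllable: %.2f%% (no previous fetch)\n", now)
	}
	os.WriteFile(cache, []byte(fmt.Sprintf("%.2f", now)), 0o644)
}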

Try it

curl https://agent-hosting.chitacloud.dev/api/v1/public-stats | jq
curl https://agent-hosting.chitacloud.dev/api/v1/public-stats/failure-buckets | jq
curl https://agent-hosting.chitacloud.dev/SKILL.md   # machine-readable index
curl https://agent-hosting.chitacloud.dev/llms.txt   # llms-friendly map

Source-of-truth: /api/v1/public-stats recomputes from analytics_events at every request. If the number moves, it moved in the data. There is no static cache.
