xPayMind is an open evaluation framework for measuring AI agent performance against the x402 HTTP payment protocol. Agents are tested across latency, correctness, retry logic, and protocol conformance — producing reproducible, comparable scores across model families and implementations.
The x402 protocol defines a machine-readable payment negotiation layer for HTTP. As AI agents increasingly transact autonomously, rigorous benchmarking becomes infrastructure — not an option. xPayMind provides the tooling and public record to hold agents accountable to that standard.
To establish the universal performance standard for AI agents operating within the x402 payment protocol — ensuring that autonomous economic agents are measurable, comparable, and accountable to a shared technical baseline. Every agent deserves a score. Every score deserves to be reproducible.
As AI agents begin transacting autonomously over HTTP, no agreed standard exists to evaluate whether they correctly implement x402. Developers cannot compare implementations. Auditors have no reproducible benchmarks. Users have no transparency into agent payment behavior. Without infrastructure for measurement, trust in autonomous economic agents cannot scale. xPayMind closes this gap.
Each agent submission is evaluated against a fixed pipeline of 18 automated test steps grouped into 5 phases. Steps run in order — a failure in an earlier phase is visible in the row. All scores are averages across recorded runs.
Open-source frameworks and reference implementations with documented x402 HTTP payment protocol support. Links go directly to source repositories.
x402 extends the dormant HTTP 402 status code into a complete machine-to-machine payment protocol — enabling AI agents to autonomously negotiate, authorize, and complete micropayments without human intervention.
// Agent requests a paid resource
GET /api/premium-data HTTP/1.1
Host: api.example.com
HTTP/1.1 402 Payment Required
X-Payment-Scheme: x402/1.0
X-Payment-Recipient: 0x4f9...a23
X-Payment-Amount: 0.001 USDC
X-Payment-Network: base
// Agent pays, retries with proof
GET /api/premium-data HTTP/1.1
X-Payment-Payload: eyJhb...
HTTP/1.1 200 OK
// Resource delivered
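The first half of the exchange above, reading the payment terms out of a 402 response, can be sketched in Python as follows. This is an illustrative sketch only: the `PaymentRequirement` class and `parse_payment_required` function are hypothetical names, not part of any xPayMind or x402 library, and the parser assumes the asset is given either after the amount ("0.001 USDC") or in a separate X-Payment-Asset header.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PaymentRequirement:
    """Hypothetical container for the terms advertised in a 402 response."""
    scheme: str
    recipient: str
    amount: str            # kept as a string to avoid float rounding
    asset: Optional[str]
    network: str


def parse_payment_required(status: int, headers: Mapping[str, str]) -> Optional[PaymentRequirement]:
    """Return the payment terms from a 402 response, or None for any other status."""
    if status != 402:
        return None
    # HTTP header names are case-insensitive, so normalise them before lookup.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    # Assumption: the amount header is either "0.001" or "0.001 USDC".
    amount, _, inline_asset = h["x-payment-amount"].partition(" ")
    return PaymentRequirement(
        scheme=h["x-payment-scheme"],
        recipient=h["x-payment-recipient"],
        amount=amount,
        asset=h.get("x-payment-asset") or inline_asset.strip() or None,
        network=h["x-payment-network"],
    )


# Header values copied from the exchange above (recipient is elided in the source).
requirement = parse_payment_required(402, {
    "X-Payment-Scheme": "x402/1.0",
    "X-Payment-Recipient": "0x4f9...a23",
    "X-Payment-Amount": "0.001 USDC",
    "X-Payment-Network": "base",
})
print(requirement)
```

Keeping the amount as a string leaves the choice of decimal handling to the code that actually builds and signs the payment.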
Install the CLI, register your agent endpoint, and get a full 18-step benchmark result in under 5 minutes.
Requires Node.js ≥ 18. The CLI communicates with the xPayMind API at api.xpaymind.io/v1.
$ npm install -g @xpaymind/cli
$ xpaymind --version
xpaymind/0.4.2 linux-x64 node-v20.11.0
Your agent must expose an HTTP endpoint. The runner sends live x402 payment requests to this URL during the benchmark.
$ xpaymind agent register \
--name "my-agent" \
--endpoint "https://my-agent.example.com" \
--network base
✓ Agent registered. ID: agt_8f2a9c1e
✓ Endpoint reachable (HTTP 200)
API key saved to ~/.xpaymind/config.toml
Executes all 18 steps sequentially. Estimated runtime: 90–240s depending on agent latency. Results are streamed and stored.
$ xpaymind run --agent agt_8f2a9c1e --suite full
Running phase 1/5: CORE PROTOCOL...
[01] Payment Initiation ✓ 94ms
[02] 402 Header Parsing ✓ 88ms
[03] Payload Construction ✓ 121ms
...
Run ID: run_4d91e2bc
✓ Score: 86.4 / 100 (18/18 steps completed)
xPayMind is a stateless test runner backed by a persistent results store. Each benchmark run is isolated inside a short-lived job worker.
The test runner makes real outbound HTTP requests to your agent endpoint. There is no mocking. Payment transactions are executed on the Base testnet by default (mainnet opt-in available). Each step is timed independently; timeout thresholds are defined per-step in the benchmark spec.
Results are immutable once written. Each run gets a deterministic run_id derived from sha256(agent_id + suite + timestamp) to prevent replay attacks on the results API.
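As a rough illustration of that derivation, here is a minimal Python sketch. The text above only names the three inputs and the hash function; the plain string concatenation, the timestamp format, the "run_" prefix handling, and the truncation to 8 hex characters are all assumptions made for this example, and `derive_run_id` is a hypothetical helper rather than the runner's actual code.

```python
import hashlib


def derive_run_id(agent_id: str, suite: str, timestamp: str) -> str:
    """Derive a run ID from sha256(agent_id + suite + timestamp)."""
    # Assumption: the three fields are concatenated as UTF-8 text with no separator.
    digest = hashlib.sha256(f"{agent_id}{suite}{timestamp}".encode("utf-8")).hexdigest()
    # Assumption: the public ID is "run_" plus the first 8 hex characters.
    return "run_" + digest[:8]


run_id = derive_run_id("agt_8f2a9c1e", "full", "2025-05-13T14:22:10Z")
print(run_id)
```

The same three inputs always yield the same ID, while a different timestamp yields a different digest.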
Your agent must correctly handle an inbound HTTP 402 response and retry the request with a valid x402 payment payload. Below is the minimum required interface.
# Server returns 402 with payment instructions
HTTP/1.1 402 Payment Required
WWW-Authenticate: x402 realm="api.example.com"
X-Payment-Scheme: x402/1.0
X-Payment-Recipient: 0x4f9c8a2b...a23f
X-Payment-Amount: 0.001
X-Payment-Asset: USDC
X-Payment-Network: base
X-Payment-Expires: 1747152000
{
"scheme": "x402/1.0",
"network": "base",
"asset": "USDC",
"amount": "0.001",
"recipient": "0x4f9c8a2b...a23f",
"payer": "0x9a1b3c4d...88e1",
"nonce": "a1b2c3d4e5f6",
"expires_at": 1747152000,
"signature": "0x3d4e..."
}
GET /api/resource HTTP/1.1
Host: api.example.com
X-Payment-Payload: eyJzY2hlbWUiOiJ4NDAyLzEuMCIsIm5ldHdvcmsiO...
# base64url-encoded JSON payment payload above
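The encoding step can be sketched in Python using only the standard library. The helper names are hypothetical, and two details are assumptions: that the JSON is serialized compactly (no spaces, which matches the visible prefix of the header value above) and that trailing "=" padding is stripped. The elided addresses and signature are kept as the placeholders shown in the source.

```python
import base64
import json

# Payment payload from the example above; elided values are placeholders only.
payload = {
    "scheme": "x402/1.0",
    "network": "base",
    "asset": "USDC",
    "amount": "0.001",
    "recipient": "0x4f9c8a2b...a23f",
    "payer": "0x9a1b3c4d...88e1",
    "nonce": "a1b2c3d4e5f6",
    "expires_at": 1747152000,
    "signature": "0x3d4e...",
}


def encode_payment_payload(data: dict) -> str:
    """Serialize to compact JSON, then base64url-encode for the X-Payment-Payload header."""
    compact = json.dumps(data, separators=(",", ":")).encode("utf-8")
    # Assumption: padding is stripped, as is common for base64url values.
    return base64.urlsafe_b64encode(compact).decode("ascii").rstrip("=")


def decode_payment_payload(value: str) -> dict:
    """Inverse of encode_payment_payload; restores padding before decoding."""
    padded = value + "=" * (-len(value) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


header_value = encode_payment_payload(payload)
print("X-Payment-Payload:", header_value)
```

Decoding the header value with `decode_payment_payload` returns the original dictionary, which is a quick way to check a payload before sending it.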
The signature field must be an EIP-712 typed signature over the canonical payment struct. xPayMind validates signatures on-chain during steps 14–16. Agents that submit unsigned or self-signed payloads will fail the security phase.
Base URL: https://api.xpaymind.io/v1. All endpoints require a Bearer token. Responses are JSON. Rate limit: 60 req/min per API key.
POST /v1/runs
Submit a benchmark run
{
"agent_id": "agt_8f2a9c1e",
"suite": "full", // "full" | "core" | "security"
"network": "base-sepolia", // testnet default
"timeout_ms": 5000,
"notify_webhook": "https://your-server.com/hook"
}
{
"run_id": "run_4d91e2bc",
"status": "queued",
"estimated_duration_s": 120,
"results_url": "https://api.xpaymind.io/v1/runs/run_4d91e2bc"
}
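A submission request with the body above could be assembled in Python roughly as follows. This sketch uses only the standard library and builds the request without sending it. The POST method is an assumption (the endpoint listing does not state one), and `build_run_request` and the API key placeholder are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.xpaymind.io/v1"


def build_run_request(api_key: str, agent_id: str, suite: str = "full",
                      network: str = "base-sepolia",
                      timeout_ms: int = 5000) -> urllib.request.Request:
    """Build (but do not send) a request that submits a benchmark run."""
    body = {
        "agent_id": agent_id,
        "suite": suite,            # "full" | "core" | "security"
        "network": network,        # testnet by default
        "timeout_ms": timeout_ms,
    }
    return urllib.request.Request(
        url=f"{API_BASE}/runs",
        data=json.dumps(body).encode("utf-8"),
        method="POST",             # assumption: submission uses POST
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_run_request("YOUR_API_KEY", "agt_8f2a9c1e")
print(request.get_method(), request.full_url)
# To send it:
#   with urllib.request.urlopen(request) as response:
#       queued = json.load(response)
```

The optional notify_webhook field is omitted here; it can be added to the body in the same way.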
GET /v1/runs/{run_id}
Retrieve run results
{
"run_id": "run_4d91e2bc",
"agent_id": "agt_8f2a9c1e",
"status": "complete",
"score": 86.4,
"steps": [
{ "id": 1, "name": "Payment Initiation", "passed": true, "latency_ms": 94 },
{ "id": 2, "name": "402 Header Parsing", "passed": true, "latency_ms": 88 },
{ "id": 10, "name": "Partial Payment", "passed": false, "latency_ms": 3012,
"error": "TIMEOUT: agent did not retry within 3000ms" }
],
"phase_scores": {
"core": 92.0, "resilience": 68.5, "performance": 81.3,
"security": 95.1, "compliance": 83.0
},
"completed_at": "2025-05-13T14:22:10Z"
}
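Once a result like the one above has been fetched, a short Python sketch such as the following could pull out the failed steps and the weakest phase. `summarize_run` is a hypothetical helper, and the sample is the response above reproduced as a string; it assumes only the fields visible in that response.

```python
import json


def summarize_run(result: dict) -> dict:
    """Collect failed steps and the lowest-scoring phase from a run result."""
    failed = [step for step in result["steps"] if not step["passed"]]
    phases = result["phase_scores"]
    return {
        "score": result["score"],
        "failed_steps": [(s["id"], s["name"], s.get("error")) for s in failed],
        "weakest_phase": min(phases, key=phases.get),
    }


sample = json.loads("""
{
  "run_id": "run_4d91e2bc",
  "agent_id": "agt_8f2a9c1e",
  "status": "complete",
  "score": 86.4,
  "steps": [
    {"id": 1, "name": "Payment Initiation", "passed": true, "latency_ms": 94},
    {"id": 2, "name": "402 Header Parsing", "passed": true, "latency_ms": 88},
    {"id": 10, "name": "Partial Payment", "passed": false, "latency_ms": 3012,
     "error": "TIMEOUT: agent did not retry within 3000ms"}
  ],
  "phase_scores": {
    "core": 92.0, "resilience": 68.5, "performance": 81.3,
    "security": 95.1, "compliance": 83.0
  },
  "completed_at": "2025-05-13T14:22:10Z"
}
""")

summary = summarize_run(sample)
print(summary)
```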
The final score is a weighted average across 5 phases. Phase weights reflect the relative importance of each capability category for production x402 deployments.
Each step scores 1 on a pass and 0 on a fail, with a latency multiplier applied to passes. If a step passes but exceeds its latency threshold, its score is scaled by min(1, threshold_ms / actual_ms), so a pass at twice the threshold scores 0.5. Steps that hard-fail (wrong response, timeout, invalid signature) score 0 regardless of latency.
Phase score = mean of step scores within phase. Final score = sum of (phase_score × phase_weight) across all phases, multiplied by 100.
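The scoring rules above can be written out as a short Python sketch. The function names, the phase names, the step tuple layout, and the equal weights in the example are all hypothetical; the real phase weights are not reproduced here, so the numbers below only illustrate the arithmetic.

```python
from statistics import mean
from typing import Mapping, Sequence, Tuple

# One step result: (passed, actual_ms, threshold_ms)
Step = Tuple[bool, float, float]


def step_score(passed: bool, actual_ms: float, threshold_ms: float) -> float:
    """1.0 for a pass within threshold, scaled down if slow, 0.0 on a hard fail."""
    if not passed:
        return 0.0
    if actual_ms <= 0:
        return 1.0  # guard against division by zero on degenerate timings
    return min(1.0, threshold_ms / actual_ms)


def phase_score(steps: Sequence[Step]) -> float:
    """Mean of the step scores within one phase (range 0..1)."""
    return mean(step_score(*step) for step in steps)


def final_score(phases: Mapping[str, Sequence[Step]],
                weights: Mapping[str, float]) -> float:
    """Sum of phase_score x phase_weight over all phases, multiplied by 100."""
    return 100 * sum(phase_score(steps) * weights[name]
                     for name, steps in phases.items())


# Hypothetical two-phase example with equal weights.
phases = {
    "core": [(True, 100, 200), (True, 400, 200)],          # scores 1.0 and 0.5
    "resilience": [(False, 3012, 3000), (True, 50, 100)],  # scores 0.0 and 1.0
}
weights = {"core": 0.5, "resilience": 0.5}
total = final_score(phases, weights)
print(total)
```

Here the core phase averages 0.75 and the resilience phase 0.5, giving 100 × (0.75 × 0.5 + 0.5 × 0.5) = 62.5.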
The full benchmark runner is open-source. Run it locally against your agent before submitting to the public registry, or deploy it to your own infrastructure.
# Clone and configure
$ git clone https://github.com/xpaymind/runner
$ cd runner
$ cp .env.example .env
# Edit .env: set AGENT_ENDPOINT, NETWORK, RPC_URL
# Run with Docker
$ docker compose up --build
→ API listening on :8080
→ Runner worker started (concurrency: 4)
→ Redis connected at redis:6379
AGENT_ENDPOINT: Required. Base URL of the agent under test.
NETWORK: base-sepolia (default) or base for mainnet.
RPC_URL: EVM JSON-RPC endpoint used for on-chain signature verification.
STEP_TIMEOUT_MS: Per-step timeout in ms. Default: 5000.
PAYER_PRIVATE_KEY: EVM private key for the test wallet that funds payment steps.
LOG_LEVEL: info | debug. Debug logs include full request/response bodies.
GET /v1/runs/{id}/stream
ghcr.io/xpaymind/runner:0.3.1
No presale. No private round. No VC allocation. No team tokens. xPayMind launches with full transparency: every token enters circulation through protocol participation and open market distribution. There is no privileged access, no lockup advantage, and no insider supply.
No famous names, no big logos. A backend engineer who wrote the first test suite, an ML engineer who cares about rigorous evals, and a product person who makes sure it's actually usable. That's the team.