Tool UsePersonalHorizontal

Tool Use Stress Test

Function-call scenarios your agent will eventually hit.

Function-call scenarios your agent will eventually hit. Tests tool selection, argument schema adherence, error recovery, and multi-step orchestration. MCP-compatible — reuses the same tool schemas your prod agent uses.

Install in Claude Code See pricing

What's in the box.

MCP-compatible — reuses the same tool schemas your prod agent uses

Catches silent tool-call failures (valid JSON, wrong semantics)

Covers error-recovery loops and back-off behavior

Install

Three commands.
Then receipts.

Install the Pistachio CLI, add the harness as a Claude Code MCP tool, run it against your agent, and get a signed pass/fail report you can drop into a PR or sales deck.

CLI (Claude Code)

zsh

# 1. Install Pistachio CLI
npx @pistachio/cli login

# 2. Add the harness as a Claude Code MCP tool
pistachio harness add tool-use-stress

# 3. Run it from inside Claude Code — get a signed report
pistachio harness run tool-use-stress

SDK (Node)

typescript

import { Pistachio } from "@pistachio/sdk";

const pistachio = new Pistachio({ apiKey: process.env.PISTACHIO_KEY });

const run = await pistachio.harnesses.run("tool-use-stress", {
  endpoint: "https://your-agent.example.com/v1/messages",
});

console.log(run.passRate);
console.log(run.signedReportUrl);

MCP-native

Lives inside Claude Code.

Signed reports

Ed25519 attestation.

Deterministic

Same input, same score.

Harnesses you'll probably also want

Agents

Agent Hygiene

The sanity check every agent should pass before shipping.

RAG

RAG Faithfulness

Catch hallucinations before your users do.

Legal

Legal Citation Accuracy

Real cases. Real holdings. No fabrications.