Install & CLI

Up and running in one command.

Calma is one self-contained folder — pure Python standard library, no dependencies, Python 3.9+. Install it as a Claude Code plugin, drop it into any project that reads SKILL.md, or put the calma CLI on your PATH. macOS is first-class (verified sandbox, proven by a built-in self-test); Linux runs with reduced isolation and says so in the ledger; Windows is unsupported.

01

Claude Code plugin (recommended)

Installs the skill and the zero-touch Stop-hook guardrail — it watches your agent's final message for checkable numbers and re-executes the work to confirm them before the turn finishes. Invisible until a number doesn't hold.

claude code
/plugin marketplace add rikhinkavuru/calma
/plugin install calma@calma

02

Drop into any project

Works with every agent that reads SKILL.md — Claude Code, Codex, Cursor, and more. No build step, nothing to configure.

bash
git clone https://github.com/rikhinkavuru/calma
cp -r calma/.claude/skills/calma your-project/.claude/skills/

03

Plain CLI

Pure stdlib — no pip, no deps. install.sh symlinks bin/calma onto your PATH and prints a hint if needed.

bash
cd calma # the repo you just cloned
./install.sh # or: make install
calma demo

The CLI

Every command, one binary.

The same engine the skill calls, on your terminal. Verify a result, batch a whole sprint, sign and timestamp a verdict, or audit the public registry — all offline, all from calma.

calma --help
calma demo # zero-setup: catch a bundled real inflated backtest (offline, seconds)
calma verify <folder> "<claim>" # check a result against a claim (exit codes below)
calma verify <folder> # no claim: just check the result reproduces
calma batch <dir>... | --manifest m.tsv # verify MANY results + one summary table (CI/sprint)
calma recipes # the 500 built-in metrics, grouped by family
calma verify ... --json # machine-readable verdict (for agents / CI)
calma verify ... --check-determinism # run twice; flaky outputs can't confirm anything
calma verify ... --timeout 300 # raise the re-execution budget (default 120s)
calma verify ... --trust third-party # counterparty code: refuse unless a sandbox tier is live
calma teardown <folder> "<claim>" # shareable "claimed X -> really Y" card (+ --svg)
calma replay <run_dir> # re-run a saved verification; exit 0 iff it reproduces
calma stats <folder> # verification history: catches, hook activity
calma seal <run_dir> [--publish] # sign + RFC-3161 timestamp + counterparty instructions
calma attest keygen # one-time signing key; after this every verify is signed
calma attest verify <bundle> # check a signed bundle, fully offline
calma registry verify [dir] # audit the public catch-history chain offline

Exit codes (calma verify)
  • 0 (clean): CONFIRMED / CONFIRMED-WITH-CAVEATS
  • 1 (not clean): REFUTED / MIXED / CAN'T-CONFIRM
  • 2 (bad input): missing target, malformed contract, unknown --metric
  • 3 (refused): execution declined (e.g. third-party code, no verified sandbox)
  • 4 (killed): the re-execution exceeded the --timeout budget
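
For CI jobs or agent wrappers, the exit codes above can be turned into a simple gate. The following is a minimal sketch, assuming calma is already on your PATH. The helper names (EXIT_MEANINGS, describe_exit, run_verify, is_clean) are illustrative and not part of Calma; only the commands, the --timeout flag, and the exit-code meanings come from this page.

```python
import subprocess
from typing import Optional

# Labels copied from the exit-code list for `calma verify`.
EXIT_MEANINGS = {
    0: "clean",
    1: "not clean",
    2: "bad input",
    3: "refused",
    4: "killed",
}


def describe_exit(code: int) -> str:
    """Return the documented label for an exit code, or a fallback string."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def is_clean(code: int) -> bool:
    """Only exit code 0 counts as a confirmed result."""
    return code == 0


def run_verify(folder: str, claim: Optional[str] = None, timeout_s: int = 120) -> int:
    """Run `calma verify` and return its exit code.

    Assumes the calma CLI is on PATH. With no claim, this runs the
    documented `calma verify <folder>` form.
    """
    cmd = ["calma", "verify", folder]
    if claim is not None:
        cmd.append(claim)
    cmd += ["--timeout", str(timeout_s)]
    return subprocess.run(cmd).returncode


# Example use in a CI script (the folder name is a placeholder):
#   code = run_verify("results/run-01", timeout_s=300)
#   print(describe_exit(code))
#   raise SystemExit(0 if is_clean(code) else 1)
```

Treating every nonzero code as a failed gate is the conservative choice. A wrapper that wants to retry with a larger budget could branch on code 4 specifically.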

Prove the sandbox on your machine:
python3 .claude/skills/calma/scripts/run_hermetic.py doctor

Opt out of the guardrail at any time in one of three ways: set CALMA_HOOK=0, run touch .calma/hook-off, or put {"hook": {"enabled": false}} in .calma/config.json. Every decision is logged to .calma/auto_history.jsonl.
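
The config-file switch can also be set from a script. The sketch below is hypothetical tooling around the three documented switches, not Calma's own code. The function names are made up for illustration, and how Calma itself parses or prioritizes these switches is an assumption here.

```python
import json
import os
from pathlib import Path


def disable_hook_in_config(project_dir):
    """Set {"hook": {"enabled": false}} in .calma/config.json, keeping other keys."""
    config_path = Path(project_dir) / ".calma" / "config.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config = {}
    if config_path.exists():
        loaded = json.loads(config_path.read_text(encoding="utf-8"))
        if isinstance(loaded, dict):
            config = loaded
    hook = config.get("hook")
    if not isinstance(hook, dict):
        hook = {}
    hook["enabled"] = False
    config["hook"] = hook
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config_path


def opt_out_switches(project_dir, env=None):
    """List which of the three documented opt-out switches are currently set.

    Assumption: this mirrors the switches as written on this page; Calma's
    real lookup logic may differ.
    """
    env = os.environ if env is None else env
    calma_dir = Path(project_dir) / ".calma"
    found = []
    if env.get("CALMA_HOOK") == "0":
        found.append("CALMA_HOOK=0")
    if (calma_dir / "hook-off").exists():
        found.append(".calma/hook-off")
    config_path = calma_dir / "config.json"
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        hook = config.get("hook") if isinstance(config, dict) else None
        if isinstance(hook, dict) and hook.get("enabled") is False:
            found.append(".calma/config.json")
    return found
```

Calling disable_hook_in_config(".") from the project root writes the same JSON shown above, and opt_out_switches(".") then reports which switches are active.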

Full docs and source are on GitHub, and the 500 recipes are listed on the recipes page.