Developer Tools

Test before
you commit.

A safe sandbox for experimenting with expert agents, comparing models side-by-side, validating guardrails, and previewing workflows — without touching production.

Request Access

Prompt Lab Guardrail Tester Tool Tester Dry Run

Prompt

PROMPT

Review the auth module and suggest improvements

→

Model

MODEL

claude-sonnet-4-6

active turn TURN 1/3

⇒

Tool Calls

file_read

auth/index.ts

↑ 1.2K ↓ 380 · 420ms

grep_search

TODO|FIXME

↑ 2.1K ↓ 740 · 680ms

→

Response

DONE

✓

2.4s

3 turns

4.3K tokens

AGENTIC LOOP TRACE

file_read ↑1.2K ↓380 · 420ms

grep_search ↑2.1K ↓740 · 680ms

(text) ↑3.4K ↓1.1K · 1.3s

① Prompt enters model — the expert agent receives its task and begins turn 1

② Tools called and returned — file_read and grep_search execute; results feed back into the model

③ Final response exits — after 3 turns the model produces its text answer with full token trace

Agent Chat

Interactive conversations with any expert agent. Select your model, choose an expert agent type, and test how it responds to real prompts. Full session history with auto-save.

Prompt Lab

Compare multiple models side-by-side. See word-level diffs between responses. Compare token usage and cost. Find the best model for each use case.

Guardrail Tester

Validate your guardrail rules against real file changes before deploying. See exactly which rules would trigger and why.

Workflow Dry Run

Preview what an expert agent would plan for a given job — without executing. Review the strategy before committing resources.

Experiment safely

Every conversation, comparison, and test is automatically saved. Pick up where you left off, review past experiments, or share sessions with your team. Auto-named from your first message for easy discovery.

Related Features

Expert Agents Multi-Model Support Benchmark Lab

Try the sandbox

Request access to experiment with expert agents in a safe environment.

Request Early Access