This documentation is available as Markdown. For the complete index, see llms.txt.

MCP tools

unotest is an MCP server. Your editor’s agent calls these tools to explore your app, run scenarios, debug failures and write tests. You never wire them by hand.

Core lifecycle & snapshots

Open/close the browser and read what the agent “sees” — a semantic outline, not pixels.

  • new_context — Open a fresh, isolated browser context.
  • close_context — Tear down the active context.
  • get_url — URL of the active page.
  • get_title — Document title of the active page.
  • get_page_snapshot — Compact, region-grouped semantic outline of the page (with [ref=eN] handles).
  • get_aria_snapshot — YAML ARIA tree; same-origin iframes stitched in.
  • get_frame_snapshot { locator } — Outline scoped to one iframe.
  • find_element { role, name?, near?, nth? } — Targeted role+name search (≤20 hits + total count). Pierces open shadow roots.
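
As a rough illustration of what a call to one of these tools looks like on the wire, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0) for find_element. The tool name and the role/name parameters come from the list above. The helper `tool_call` and the argument values ("button", "Sign in") are hypothetical, and in normal use the editor's agent sends this for you.

```python
import json

def tool_call(request_id, name, arguments=None):
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }

# Hypothetical argument values: look for a button labelled "Sign in".
request = tool_call(1, "find_element", {"role": "button", "name": "Sign in"})
print(json.dumps(request, indent=2))
```

Tools that take no parameters, such as get_url or get_page_snapshot, would use the same envelope with an empty arguments object.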

Exploration & recording

The agent drives the live app and records a scenario.

  • explore_start { scenario_name, title?, description? } — Begin a recording session; returns available variables and flows.
  • explore_stop { explorationId } — End recording.
  • explore_state { explorationId } — Inspect the session and recorded entries.
  • explore_step { action, locator?, value?, ... } — Execute one action: recorded when a session is active, otherwise run ad hoc.
  • explore_record { action, section, description, ... } — Record an action with mandatory section + intent.
  • explore_remove_step { explorationId, entryId } — Delete a recorded entry.
  • generate_dsl_from_exploration { explorationId } — Emit DSL (+ warnings) from the session.
  • save_exploration_as_test { explorationId, scenarioName? } — Write the scenario to disk; extract flow: steps into helpers.
  • explore_run_flow { explorationId, name, ... } — Replay a saved flow to seed state, recorded as one step.
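
A recording session might be expressed as the following sequence of tool calls. This is only a sketch: the tool and parameter names are taken from the list above, but the scenario name, the action and locator values, the placeholder explorationId, and the ordering of the save and stop calls are all assumptions. Parameters elided with "..." in the list are not shown.

```python
# Placeholder: the real id would come from the explore_start result.
EXPLORATION_ID = "<explorationId>"

# (tool name, arguments) pairs in the order an agent might issue them.
session_plan = [
    ("explore_start", {"scenario_name": "checkout_smoke"}),          # hypothetical name
    ("explore_step", {"action": "click", "locator": "e12"}),         # hypothetical values
    ("explore_state", {"explorationId": EXPLORATION_ID}),
    ("generate_dsl_from_exploration", {"explorationId": EXPLORATION_ID}),
    ("save_exploration_as_test", {"explorationId": EXPLORATION_ID}),
    ("explore_stop", {"explorationId": EXPLORATION_ID}),
]

for tool_name, arguments in session_plan:
    print(tool_name, arguments)
```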

Debugger & runtime

Run a scenario through the sandboxed engine and step through it.

  • run_test { scenario, browsers? } — Run a scenario; returns a runtimeId.
  • step { runtimeId } — Execute one step of a paused runtime.
  • resume { runtimeId } — Resume to completion or the next breakpoint.
  • inspect_runtime { runtimeId } — Variables, call stack and last event at the pause point.
  • abort_runtime { runtimeId } — Abort with clean teardown.
  • list_runtimes — All active runtimes + status.
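
The stepping loop these tools imply can be sketched as below. The tool names and the runtimeId parameter come from the list above. The `call_tool` callable, the `runtimeId` key in the run_test result, and the idea that the runtime is paused right after run_test are assumptions, and `fake_call_tool` is a stand-in stub rather than a real client.

```python
from typing import Any, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

def step_through(call_tool: ToolCaller, scenario: str, max_steps: int) -> List[Dict[str, Any]]:
    """Start a scenario, single-step it, and inspect the runtime after each step."""
    started = call_tool("run_test", {"scenario": scenario})
    runtime_id = started["runtimeId"]  # assumed result key
    inspections = []
    for _ in range(max_steps):
        call_tool("step", {"runtimeId": runtime_id})
        inspections.append(call_tool("inspect_runtime", {"runtimeId": runtime_id}))
    # Tear down instead of resuming, so the sketch always ends cleanly.
    call_tool("abort_runtime", {"runtimeId": runtime_id})
    return inspections

# Stub transport that records calls; a real agent would talk to the MCP server.
calls = []

def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    calls.append((name, arguments))
    return {"runtimeId": "rt-1"} if name == "run_test" else {}

inspections = step_through(fake_call_tool, "login", max_steps=2)
print([name for name, _ in calls])
```

Replacing the final abort_runtime call with resume would instead let the scenario run to completion or to the next breakpoint.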

Multi-context

Tabs and frames.

  • list_pages — All open tabs (index + URL).
  • list_frames — The frame stack.
  • get_active_context — Which tab is active.
  • switch_page { index } — Switch the active tab.

Failure artifacts

Read the evidence bundled when a run fails.

  • list_failures — All failure bundles with timestamps.
  • get_failure_trace { runId } — DSL execution trace.
  • get_failure_console { runId } — Browser console logs.
  • get_failure_a11y { runId } — ARIA tree at the failure (YAML).
  • get_failure_screenshot { runId } — Screenshot at the failure.
  • get_failure_network { runId } — HAR (only if tier3.network is enabled).
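
To gather the whole bundle for one failed run, an agent could issue one call per artifact tool. The sketch below only plans those calls. The tool names and the runId parameter are from the list above, while the function name, the `network_enabled` flag (standing in for the tier3.network setting), and the sample run id are hypothetical.

```python
def failure_artifact_calls(run_id, network_enabled=False):
    """Return the list of tool calls that fetch every artifact for one failed run."""
    tools = [
        "get_failure_trace",
        "get_failure_console",
        "get_failure_a11y",
        "get_failure_screenshot",
    ]
    # The HAR is only available when network capture was enabled for the run.
    if network_enabled:
        tools.append("get_failure_network")
    return [{"name": tool, "arguments": {"runId": run_id}} for tool in tools]

# Hypothetical run id; real ids would come from list_failures.
for call in failure_artifact_calls("run-123"):
    print(call["name"], call["arguments"])
```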

Agent self-service & viewer

Diagnostics, the human-in-the-loop fix flow, and the viewer.

  • agent_fix { location, fix } — Composes failure context for a fix. No LLM call, no auto-apply — you review the diff.
  • get_last_mcp_log — JSONL debug log of the last run (under UNOTEST_DEBUG).
  • audit_last_run — Deterministic rubric audit of the run against the rules (no LLM).
  • open_viewer — Launch (or reuse) the local viewer; returns its URL.