# unotest documentation — full text

> AI-native E2E testing for web and iOS. This file concatenates every docs page as Markdown.

---

# How it works

> The architecture behind unotest — MCP, semantic perception, a sandboxed engine, and local-first execution.

unotest connects your editor's AI agent to your real application through an **MCP server**, and turns what the agent does into a reviewable test.

## The loop

1. **You describe a flow** in plain English to your agent.
2. **The agent explores** your live app through ~37 MCP tools — reading a semantic snapshot, clicking, filling, recording each action.
3. **It writes a scenario** to `unotest/e2e/.js` with stable selectors and `step("intent", …)` labels.
4. **It runs the scenario** through the sandboxed engine and, on failure, pauses to inspect, patch and resume.
5. **You review and commit** the `.js`.

## Semantic perception, not pixels

The agent doesn't look at screenshots. It reads a **semantic snapshot** of the page (web) or the **accessibility tree** (iOS) — roles, names, labels, test IDs, rendered as a token-cheap text outline. This is cheaper, more reliable, and stable across visual redesigns.

## A sandboxed engine

Scenarios are plain JavaScript, but they don't run in Node. They execute in a **sandboxed AST interpreter** — no `require`, no `fetch`, no filesystem, no network except the typed `apiCall` helper. That's why AI-generated tests are safe to run blindly.

## Stable by construction

Selectors follow a strict priority — `getByTestId → getByRole → getByLabel → getByText → locator(css)` — and the linter flags brittle patterns. Tests survive refactors because they target meaning, not markup.

## Local-first

The MCP server, the browser/Simulator, the runner and the viewer all run on your machine. Your app never leaves it. No cloud, no account.
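As a rough illustration of the selector priority described under "Stable by construction", the sketch below picks the most stable locator a node can support. It is plain JavaScript rather than the unotest DSL, and the node shape (`testId`, `role`, `name`, `label`, `text`, `css`) and the `pickLocator` name are assumptions made for this example, not part of unotest.

```js
// Hypothetical node shape for illustration only (not unotest's snapshot format):
// { testId?, role?, name?, label?, text?, css? }

// Return a locator expression for the node, preferring the most stable
// strategy it supports: test ID, role + name, label, text, then raw CSS.
function pickLocator(node) {
  const q = (value) => JSON.stringify(value); // quote a string for source output
  if (node.testId) return `getByTestId(${q(node.testId)})`;
  if (node.role && node.name) {
    return `getByRole(${q(node.role)}, { name: ${q(node.name)} })`;
  }
  if (node.label) return `getByLabel(${q(node.label)})`;
  if (node.text) return `getByText(${q(node.text)})`;
  if (node.css) return `locator(${q(node.css)})`;
  throw new Error("node has nothing to locate it by");
}

console.log(pickLocator({ role: "button", name: "Save changes", css: ".btn" }));
// prints: getByRole("button", { name: "Save changes" })
console.log(pickLocator({ testId: "cart-count", text: "1" }));
// prints: getByTestId("cart-count")
```

The early returns mirror the documented order, so raw CSS is only reached when nothing more meaningful is available.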
## The ecosystem

| Package | Role |
| --- | --- |
| `@unotest/web` | CLI · MCP server · runner (web) |
| `@unotest/mobile` | CLI · MCP server · runner (iOS) |
| `@unotest/viewer` | local results browser |
| `@unotest/dsl` | scenario parser + validator |
| `@unotest/protocol` | shared types |

---

# Install & setup

> Requirements, browser choices, and wiring your editor to unotest over MCP.

## Requirements

- **Node 20+** (for `npx`; you don't need a Node project).
- **Web:** any OS. **iOS:** macOS with Xcode + the iOS Simulator.

## Set up a project

```sh
# web
npx @unotest/web init

# iOS
npx @unotest/mobile install /path/to/Your.app --update-env
```

`init` / `install` scaffold the `unotest/` layout, write `unotest.config.*`, and wire MCP config (`.mcp.json` / editor settings) so your agent finds the server automatically.

## Choosing a browser (web)

During `init` you pick how Chromium is provided:

- **System Chrome / Edge** — zero download. Sets `channel: "chrome"` (or `"msedge"`) in your config.
- **Bundled Chromium** — Playwright's build (~150 MB). Run `npx @unotest/web install-chromium` if you didn't during init.

You can also run **Firefox** and **WebKit** — set `browsers` in [config](/reference/config/). Single browser in dev; CI can run all three.

## Wire your editor

unotest is an MCP server. After `init`, these editors pick it up automatically:

- **Claude Code** — via `.mcp.json` and project settings.
- **Cursor** / **Codex** — via their MCP config.

See [Connect your editor](/agent/connect/) for details.

## Verify

```sh
# web
npx @unotest/web e2e welcome

# iOS
npx @unotest/mobile doctor
```

If something's off, the CLI prints a single actionable line — no stack traces in normal operation. Set `UNOTEST_DEBUG=1` for full diagnostics.

---

# Introduction

> What unotest is — AI-native E2E testing where your agent writes the tests and you stay in control.

**unotest is AI-native end-to-end testing.** You don't write the tests — your AI agent does, by driving your real app through an MCP server. You review the result and commit it.

It's not a black box. Everything the agent does ends up as ordinary, human-readable `.js` in your repository, with stable selectors and plain-English step labels. You can open it, understand it, and rewrite it by hand at any time.

**AI does the work. You keep control.**

## Two surfaces, one idea

| | **unotest web** | **unotest mobile** |
| --- | --- | --- |
| Target | Web apps | iOS apps (React Native + native Swift) |
| Engine | Playwright / Chromium | WebDriverAgent / XCUI |
| Perception | semantic DOM snapshots | accessibility tree |
| Runs on | any OS | macOS only (Apple licensing) |

Both share the same DNA: an MCP server, a sandboxed JavaScript DSL, plain `.js` tests in your repo, a pause-on-failure debugger, and local-first execution.

## Why it's different

- **The agent perceives structure, not pixels.** It reads a semantic snapshot (roles, labels, test IDs) — cheap on tokens and stable across redesigns.
- **Stable selectors by default.** `getByTestId → getByRole → getByLabel → getByText → locator(css)`. The linter steers away from brittle CSS.
- **Self-healing, with you in the loop.** A failing step pauses mid-run; the agent inspects, patches and resumes. `agent_fix` never calls an LLM itself and never applies a patch silently — you approve the diff.
- **Safe to run blindly.** Scenarios execute in a sandboxed AST interpreter — no `require`, no `fetch`, no filesystem.
- **Local-first.** Everything runs on your machine. No cloud, no account.

## Next

- [Quick start — web](/start/quickstart-web/)
- [Quick start — iOS](/start/quickstart-ios/)
- [How it works](/start/how-it-works/)

---

# Quick start — iOS

> From a built .app to a passing E2E test on the iOS Simulator, written by your agent.

`unotest mobile` drives your iOS Simulator and writes real test scenarios for your app — React Native / Expo or native Swift. Works with any backend stack; no JS expertise required from you.

:::note[macOS only]
iOS simulators can't run on Linux/Windows (Apple licensing). You need macOS with Xcode + the iOS Simulator, and Node 20+.
:::

## 1. Point it at your built `.app`

```sh
npx @unotest/mobile install /path/to/Your.app --update-env
```

This installs the app on the Simulator and auto-detects the bundle ID, URL scheme and required permissions from `Info.plist`. On first run it offers to bootstrap `unotest/` and `.mcp.json`.

Common `.app` locations:

- **React Native / Expo:** `./ios/build/Build/Products/Debug-iphonesimulator/.app` after `npx expo run:ios`.
- **Native Swift:** Xcode → Product → Show Build Folder → `Products/Debug-iphonesimulator/`.

:::caution[First run compiles WebDriverAgent]
The first run builds WebDriverAgent once (~5–15 min) and caches it under `~/.cache/unotest/mobile/wda/`. Subsequent runs reuse it.
:::

## 2. Ask your agent for a test

Open Claude Code (or Cursor / Codex) **in the same directory** and ask for a test. The agent drives the Simulator over MCP, reading the live accessibility tree — not screenshots — and records a clean `.js` scenario.

## 3. Run it

```sh
npx @unotest/mobile e2e
```

Scenarios are plain `.js` in `unotest/e2e/`, so they go straight into git, code review and CI.

## Next

- [How it works](/start/how-it-works/)
- [CLI — mobile](/reference/cli-mobile/)

---

# Quick start — web

> From zero to a passing E2E test in your web app, written by your agent.

Get your first test written and running in a few minutes. Works with any backend stack (Node, Django, Rails, Go…) — no `package.json` required.

## 1. Initialize

Run once in your project:

```sh
npx @unotest/web init
```

`init` writes the config and a starter scenario, sets up the browser (bundled Chromium, or your system Chrome if you pick it), and wires the MCP server so Claude Code / Cursor / Codex pick it up automatically. Re-run anytime — it never overwrites your edits.

## 2. Open the viewer

```sh
npx @unotest/web viewer
```

A local IDE-style UI for your tests — no cloud, no account. Browse scenarios, run them, and watch each step live. This is your home base.

## 3. Ask your agent for a test

In your AI editor, open your project and describe the flow in plain English:

> Open my app, sign in as the demo user, and check that the dashboard heading
> appears.

Through the MCP server the agent explores your live app — clicking, filling, reading the real DOM — then writes a clean scenario to `unotest/e2e/.js` with stable selectors, runs it, and debugs itself if it fails.

```js
function test_login() {
  step("Sign in as the demo user", () => {
    goto("/login");
    fill(getByLabel("Email"), TEST_USER_EMAIL);
    fill(getByLabel("Password"), TEST_PASSWORD);
    click(getByRole("button", { name: "Continue" }));
  });

  step("Dashboard is shown", () => {
    assertVisible(getByRole("heading", { name: "Dashboard" }));
  });
}
```

## 4. Run it

Click any scenario in the viewer, or from the CLI:

```sh
npx @unotest/web e2e login
```

## 5. Review & commit

You get a reviewable `.js` test. Read it, tweak it, commit it. That's the whole loop — the agent does the legwork, you own the result.

:::tip
Point your agent at its authoring guide — see [For your AI agent](/agent/overview/).
:::

## Next

- [Concepts: scenarios & `step()`](/concepts/scenarios/)
- [Write your first test (guided)](/guides/first-test/)
- [The viewer](/viewer/overview/)

---

# Collections

> Group scenarios and run them as a set — smoke, regress, and more.

A **collection** is a named set of scenarios you run together — typically `smoke` (fast, key flows) and `regress` (the full sweep).
## Run a collection

```sh
npx @unotest/web collection smoke --workers=4
```

| Flag | Effect |
| --- | --- |
| `--workers=N` | run N scenarios in parallel (default: serial) |
| `--bail` | stop after the first failure |
| `--headed` | run with a visible browser |

## In the viewer

The [viewer](/viewer/overview/) manages collections visually: create, rename, reorder by drag, and run the whole set. During a run you see **per-scenario status**, a progress bar, and a one-click **abort**.

## In CI

Collections are the natural CI unit — run `smoke` on every push, `regress` nightly. See [Run in CI](/guides/ci/).

---

# Debugging & self-healing

> One pause-on-failure debugger for both the agent and you — and why fixes are never silent.

When a step fails, the run **freezes on that step** instead of crashing. The same paused state is available to both the agent and you.

## The agent's loop

1. The step throws → the runtime pauses.
2. The agent calls `inspect_runtime` — live DOM, variables, the last event.
3. It patches the scenario and calls `resume` from the same step — no browser restart, no re-running setup.

## Your loop (the viewer)

In the [viewer's debugger](/viewer/debugger/) you do the same visually: set gutter breakpoints, run paused, inspect on each stop (**Vars / Call stack / Trace / Breakpoints**), and step with **Continue / Step / Stop**.

```sh
npx @unotest/web e2e checkout --debug
```

## Self-healing is agent-assisted, never silent

`agent_fix` composes the failure context and a suggested fix — but it **does not call an LLM itself and never applies a patch on its own**. The agent forms a diff; **you review and commit**.

That's the core balance: AI repairs, a human approves. Tests are never changed behind your back.

---

# Failure bundles

> The evidence captured when a run fails — screenshots, console, semantic DOM, and trace.

When a run fails, unotest writes a **failure bundle** — everything needed to understand what happened, for you or the agent.
## What's captured

| Tier | Contents | Default |
| --- | --- | --- |
| **1** | error + stack, console log, semantic DOM snapshot, DSL trace | always on |
| **2** | screenshot (viewport + element-focused) | on |
| **3** | HAR (network) + video | off — **not wired yet** |

:::caution[Tier 3 is not shipped]
Network HAR and video capture aren't wired yet. Use the console + screenshot + semantic DOM + trace. Don't promise HAR/video in your own docs or dashboards.
:::

## Where it lives

Bundles are written under `.unotest/failures/` (configurable), with retention (default: keep 20 runs / 7 days). The [viewer](/viewer/inspector/) renders each artifact per run; the agent reads them through the [failure-artifact MCP tools](/reference/mcp-tools/).

## How the agent uses it

On failure the agent calls `agent_fix`, which bundles the trace, console, semantic snapshot and scenario source plus a classification (`rewrite-selector` / `add-waitfor` / `change-assertion`). It proposes a diff — [you approve it](/concepts/debugging/).

---

# Helpers — flows & mocks

> Reusable JavaScript functions for repeated journeys and for seeding data.

Helpers are ordinary JS functions in `unotest/e2e/_helpers/`. There are two kinds, and keeping them separate keeps scenarios clean.

## Flows — replay a UI journey

`flow_*` functions wrap a repeated user journey: sign-in, checkout, onboarding. Write once, call from any scenario.

```js
// _helpers/flows.js
function flow_signin(email, password) {
  goto("/login");
  fill(getByLabel("Email"), email);
  fill(getByLabel("Password"), password);
  click(getByRole("button", { name: "Sign in" }));
}
```

An agent can also replay a flow live (`explore_run_flow`) to seed state, then record a new test on top of it.

## Mocks — seed & reset data

Use the sandbox helpers to put the app into a known state before a test and clean up after. Connection details are pinned in [config](/reference/config/) — a scenario can't redirect them.
```js
// _helpers/mocks.js
function seed_cart(userId) {
  dbExec("INSERT INTO carts (uid) VALUES ($1)", userId);
  apiCall("POST", "/test/checkout/reset");
}
```

| Helper | Use |
| --- | --- |
| `dbQuery` / `dbExec` | parameterized SQL (postgres / mysql / sqlite) |
| `apiCall` | relative-path HTTP against `apiBaseUrl` |
| `shell` | run a binary (execFile, no shell interpretation) |

:::tip[Keep project knowledge in helpers]
The core stays project-agnostic. Anything specific to *your* app — seed scripts, fixtures, endpoints — lives in `_helpers/`, never in the tool.
:::

---

# Scenarios & step()

> How a unotest scenario is structured — readable intent on top, precise DSL underneath.

A scenario is a plain `.js` file in `unotest/e2e/`. It exports `test_*` functions, and **every executable step lives inside a `step()` block**.

```js
function test_checkout() {
  step("Add the first product to the cart", () => {
    goto("/products");
    click(getByRole("button", { name: "Add to cart" }));
  });

  step("Cart shows one item", () => {
    assertText(getByTestId("cart-count"), "1");
  });
}
```

## Two layers

Each `step()` carries two layers at once:

- **Intent** — the string label, plain English. Reads like a checklist, even to a non-engineer.
- **Execution** — the DSL calls inside the closure. One step can be several calls.

This is why repair is precise: the agent knows **what** a step is meant to do (its label) and **how** it does it (the calls), so it rewrites only the broken part — it doesn't guess. The same duality helps you: collapse a step to see the logic, expand it to see the exact commands.

:::note[step() is required]
The validator requires every direct child of a `test_*` body to be a `step()` call. Helpers (`flow_*`, `snake_case`) are exempt. The older `//@collapse` comment form has been removed.
:::

## The DSL in one breath

Navigation (`goto`, `waitForUrl`), locators by stability (`getByTestId` → `getByRole` → `getByLabel` → `getByText` → `locator`), actions (`click`, `fill`, `press`, `selectOption`…), assertions (`assertText`, `assertVisible`…), chaining (`getByRole(...).filter(...).first()`), multi-tab and iframes, and sandbox helpers (`dbQuery`, `apiCall`, `shell`). See the [DSL reference](/reference/dsl/).

## What's not in the DSL

Comparison/logical operators in conditions aren't supported — use bare truthy variables. Loops and branching are plain JS *around* steps. Regex literals are allowed in matcher args (ES5 flags only). See [DSL → Not supported](/reference/dsl/).

---

# Stable selectors

> The selector priority that keeps tests resilient, and the linter that enforces it.

Tests break when they target markup that changes. unotest steers every locator toward **meaning** over **structure**, so tests survive refactors.

## The priority

1. **`getByTestId(id)`** — an explicit `data-testid`. Most stable. Prefer it.
2. **`getByRole(role, { name })`** — semantic role + accessible name. Stable when the name is unique.
3. **`getByLabel(text)`** — form controls by their label.
4. **`getByText(text)`** — unique visible text.
5. **`locator(css)`** — raw CSS. Last resort.

```js
// good — resilient
click(getByRole("button", { name: "Save changes" }));
fill(getByLabel("Email"), TEST_USER_EMAIL);

// avoid — brittle
click(locator(".btn.btn-primary.css-1a2b3c"));
```

## Refine, don't index blindly

Narrow with `filter()` before reaching for position:

```js
click(getByRole("row").filter({ hasText: "Uma Quinn" }).first());
```

`nth()` / index-only refinement is fragile — the linter flags it.

## The linter enforces it

The [linter](/reference/linter/) warns on deep CSS, XPath, hashed class names (Tailwind JIT, CSS Modules), and unexplained `pause()`.
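To make those categories concrete, here is a minimal sketch of the kind of check a selector linter might run. It is not the unotest linter: `brittleReasons`, its reason names, the depth threshold of three, and the regular expressions are all hypothetical choices for illustration.

```js
// Illustrative heuristics only. The patterns and threshold below are
// assumptions, not the rules the real unotest linter applies.
function brittleReasons(selector) {
  const s = selector.trim();

  // XPath: an explicit "xpath=" prefix or a leading "//" / "(//".
  if (/^(xpath=|\/\/|\(\/\/)/.test(s)) return ["xpath"];

  const reasons = [];

  // Rough depth: count segments separated by ">" or whitespace combinators.
  const depth = s.split(/\s*>\s*|\s+/).filter(Boolean).length;
  if (depth > 3) reasons.push("deep-css");

  // Build-generated class names such as ".css-1a2b3c" or ".Button_root__x7Yz9".
  const hashed = /\.(css-[0-9a-z]{5,}|[A-Za-z]+_[A-Za-z]+__[0-9A-Za-z]{4,})/;
  if (hashed.test(s)) reasons.push("hashed-class");

  return reasons;
}

console.log(brittleReasons(".btn.btn-primary.css-1a2b3c")); // one reason: "hashed-class"
console.log(brittleReasons("main > section > ul > li > a")); // one reason: "deep-css"
console.log(brittleReasons("#save"));                        // no reasons
```

The depth count is deliberately crude (whitespace inside an attribute value would inflate it), so treat it as a way to reason about the categories; the real linter remains the authority on what gets flagged.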
Run it anytime:

```sh
npx @unotest/web lint
```

---

# Variables & secrets

> Keep credentials out of code and run the same test across environments.

Environments, tokens and credentials live in variables — not hardcoded in scenarios.

## Two files

- `unotest/.env` — ordinary variables.
- `unotest/.secrets` — secrets, git-ignored.

Reference them by **bare `UPPER_SNAKE` identifiers** in scenarios:

```js
goto(APP_BASE_URL + "/login");
fill(getByLabel("Email"), TEST_USER_EMAIL);
fill(getByLabel("Password"), TEST_PASSWORD);
```

:::caution[No mustache]
Don't write `"{{VAR}}"` string literals — reference the bare identifier. The linter errors on mustache in the DSL.
:::

## Secrets are masked

Values registered as secrets are redacted in logs and failure artifacts (shown as `‹secret:NAME›`), so they never leak into a trace or a screenshot bundle.

## Across environments

Because the test references variables, the same scenario runs against dev, staging or prod — just change the values. The [viewer](/viewer/overview/)'s **Variables** panel lets you edit values, reveal/hide secrets, and toggle boolean flags without leaving the window.

---

# Auth & cached login

> Reuse a logged-in session instead of signing in at the start of every test.

Signing in at the top of every scenario is slow and brittle. Cache a logged-in session once and reuse it.

## storageState

Point your config at a Playwright `storageState.json`:

```js
// unotest.config.mjs
export default {
  storageState: "unotest/.auth/state.json",
};
```

Scenarios then start already authenticated — no `flow_signin` per test.

## Seeding the state

Create the state once (an agent flow, or a small setup scenario that signs in and captures cookies/localStorage). Keep the file git-ignored and refresh it when it expires.

## When to still sign in

- Tests that specifically exercise the **login flow** itself.
- Tests that need a **different user** than the cached one — use [`flow_signin`](/guides/flows-and-data/) for those.

:::tip
Pair cached auth with [variables](/concepts/variables/) so the same suite runs against dev / staging / prod with the right credentials.
:::

---

# Write a test by hand

> Author a scenario yourself — the format is just readable JavaScript.

You never *have* to let the agent write everything. Scenarios are plain `.js`, so you can author or edit them by hand.

## The shape

```js
// unotest/e2e/search.js
function test_search() {
  step("Search for a known product", () => {
    goto("/");
    fill(getByRole("searchbox"), "wireless mouse");
    press(getByRole("searchbox"), "Enter");
  });

  step("Results contain the product", () => {
    assertVisible(getByRole("link", { name: /wireless mouse/i }));
  });
}
```

## Rules to keep in mind

- Put every executable step inside `step("intent", () => { … })`.
- Prefer [stable selectors](/concepts/selectors/).
- Reference [variables](/concepts/variables/) by bare `UPPER_SNAKE`.
- Scenarios live in a feature subfolder of `unotest/e2e/`, not its root.

## Lint as you go

```sh
npx @unotest/web lint
```

The [linter](/reference/linter/) catches brittle selectors, missing step wrappers, and unexplained `pause()`. See the full [DSL reference](/reference/dsl/).

---

# Run in CI

> Run unotest scenarios and collections in continuous integration.

Because tests are plain `.js` in your repo with no proprietary format, CI is straightforward.

## A minimal job

```yaml
# .github/workflows/e2e.yml
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npx @unotest/web install-chromium
      - run: npx @unotest/web collection smoke --workers=4
        env:
          APP_BASE_URL: ${{ secrets.APP_BASE_URL }}
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_PASSWORD: ${{ secrets.TEST_PASSWORD }}
```

## Tips

- **Browsers:** run the bundled Chromium in CI; you can widen `browsers` to Firefox/WebKit in your [config](/reference/config/).
- **Retries:** turn on `retry.count` for CI (off in dev) to absorb transient flakiness — assertion failures still never retry.
- **Secrets:** inject [variables](/concepts/variables/) from your CI secret store; the runner masks them in logs and artifacts.
- **Split:** `smoke` on every push, `regress` nightly.
- **iOS:** `unotest mobile` needs macOS runners with Xcode.

## Failure artifacts

On failure the [bundle](/concepts/failure-bundles/) (screenshot, console, semantic DOM, trace) is written under `.unotest/failures/` — upload it as a CI artifact for inspection.

---

# Run smoke / regress

> Group scenarios into collections and run them as a set.

Group related scenarios into a [collection](/concepts/collections/) and run them together.

## Create one

In the [viewer](/viewer/overview/): **Collections** (⌘3) → create → add scenarios → reorder by drag. A collection is a small YAML file under `unotest/e2e/collections/`.

## Run it

```sh
# fast key flows
npx @unotest/web collection smoke --workers=4

# full sweep, stop on first failure
npx @unotest/web collection regress --bail
```

During a run you get per-scenario status and a progress bar (in the viewer), or a serial/parallel summary on the CLI.

## A good split

- **smoke** — login, checkout, the 5 flows that must never break. Run on every push.
- **regress** — everything. Run nightly or before release.

Wire it into CI: [Run in CI](/guides/ci/).

---

# Debug a failing test

> Use the pause-on-failure debugger — in the viewer or with your agent.

A failing step freezes the run so you can inspect it, instead of dumping a stack trace.

## In the viewer

1. Open the scenario, click a gutter dot to set a **breakpoint**.
2. Click **Debug** (headed + pause on breakpoints and failure).
3. When it pauses, inspect **Vars / Call stack / Trace**, then **Continue / Step / Stop**.

See [Step debugger](/viewer/debugger/).
## From the CLI

```sh
npx @unotest/web e2e checkout --debug

# or target a line directly
npx @unotest/web e2e checkout --break 24:5
```

## With your agent

Ask the agent to fix the failure. It calls `inspect_runtime` at the pause point, reads the [failure bundle](/concepts/failure-bundles/), and proposes a diff via `agent_fix`. **You review and commit** — nothing is applied silently.

## Common fixes

| Symptom | Likely fix |
| --- | --- |
| Locator timed out | selector drifted → [stable selector](/concepts/selectors/) |
| Element found but not actionable | add a `waitFor` |
| Expected ≠ actual | update the assertion |

For full diagnostics (JSONL log + artifacts), set `UNOTEST_DEBUG=1`.

---

# Write your first test

> Let your agent write and verify a real E2E test, step by step.

This is the canonical loop: describe a flow, let the agent build it, review.

## 1. Make sure the project is set up

```sh
npx @unotest/web init
```

## 2. Describe the flow to your agent

In your AI editor, be specific about the start, the actions, and the assertion:

> Open `/login`, sign in with `TEST_USER_EMAIL` / `TEST_PASSWORD`, and check that
> a heading "Dashboard" is visible.

The agent will explore your live app over MCP, then write a scenario.

## 3. Let it write & run

The agent records actions, generates `unotest/e2e/login.js`, and runs it. If a step fails, it pauses, inspects, patches and resumes — then shows you the result.

## 4. Read the result

```js
function test_login() {
  step("Sign in with the demo user", () => {
    goto("/login");
    fill(getByLabel("Email"), TEST_USER_EMAIL);
    fill(getByLabel("Password"), TEST_PASSWORD);
    click(getByRole("button", { name: "Continue" }));
  });

  step("Dashboard is shown", () => {
    assertVisible(getByRole("heading", { name: "Dashboard" }));
  });
}
```

## 5. Commit

It's ordinary `.js` in your repo. Tweak the labels or selectors if you like, then commit it like any code.

:::tip
Keep `TEST_USER_EMAIL` / `TEST_PASSWORD` in [variables](/concepts/variables/), not in the test.
:::

---

# Reuse flows & seed data

> Factor repeated journeys into flows and put the app into a known state with mocks.

Keep scenarios short and reliable by extracting repetition into [helpers](/concepts/helpers/).

## Extract a flow

Move a repeated journey into a `flow_*` helper:

```js
// _helpers/flows.js
function flow_signin(email, password) {
  goto("/login");
  fill(getByLabel("Email"), email);
  fill(getByLabel("Password"), password);
  click(getByRole("button", { name: "Sign in" }));
}
```

Call it from a scenario:

```js
function test_orders() {
  step("Sign in", () => {
    flow_signin(TEST_USER_EMAIL, TEST_PASSWORD);
  });

  step("Orders page loads", () => {
    assertVisible(getByRole("heading", { name: "Your orders" }));
  });
}
```

## Seed data with mocks

Put the backend into a known state before the test, and clean up after:

```js
// _helpers/mocks.js
function seed_order(userId) {
  dbExec("INSERT INTO orders (uid, status) VALUES ($1, 'paid')", userId);
}
```

`dbQuery` / `dbExec` use the `database` URL from config; `apiCall` uses `apiBaseUrl`; `shell` runs a binary. These are pinned in [config](/reference/config/) — scenarios can't point them elsewhere.

:::tip[Cache login instead of repeating it]
For auth specifically, prefer a cached session over running `flow_signin` in every test — see [Auth & cached login](/guides/auth/).
:::

---

# Multi-tab & iframes

> Drive multiple tabs and work inside nested iframes.

unotest scenarios can span tabs and reach into frames.

## Switch tabs

When an action opens a new tab, switch to it by index (0-based):

```js
step("Open the invoice in a new tab", () => {
  click(getByRole("link", { name: "View invoice" }));
  setPage(1); // focus the new tab
  assertVisible(getByText("Invoice #"));
});
```

Inspect tabs with the `list_pages` / `get_active_context` MCP tools while exploring.

:::note[Waiting for a new tab]
`waitForPage()` isn't shipped yet.
For a tab that opens asynchronously, use `pause(ms)` with a `// reason:` comment until `waitForPage()` lands.
:::

## Work inside an iframe

Enter a frame to scope subsequent calls; exit when done:

```js
step("Submit the embedded payment form", () => {
  enterFrame(getByTestId("payment-iframe"));
  fill(getByLabel("Card number"), TEST_CARD);
  click(getByRole("button", { name: "Pay" }));
  exitFrame();
});
```

Or resolve a single element across the boundary with `contentFrame()`. See the [DSL reference](/reference/dsl/).

---

# Step debugger

> Gutter breakpoints, pause-on-failure, and live inspection.

The viewer's debugger is the same pause-on-failure machinery the agent uses — exposed visually.

## Breakpoints

Click the gutter next to any step to toggle a breakpoint. Breakpoints persist to `unotest/.debugger.json`, so the CLI and agent see them too.

## Run paused

Click **Debug** (or `e2e --debug`) to run with a visible browser that pauses on breakpoints and on failure. A paused run shows the reason (breakpoint / manual / failure) and the exact line, highlighted in the gutter.

## Controls

- **Continue** — run to the next breakpoint or the end.
- **Step** — execute the next statement.
- **Stop** — abort the run.

## Inspect

While paused, the inspector offers:

| Tab | Shows |
| --- | --- |
| **Vars** | live variables in scope (truncated to 4 KB each) |
| **Call stack** | your function frames, outer → inner (`file:line`) |
| **Trace** | the full event log since the run started |
| **Breakpoints** | every breakpoint, with add/remove |

This is the human side of [self-healing](/concepts/debugging/): you can take over the same paused state the agent works from.

---

# Inspector & artifacts

> The failure evidence shown per run — screenshot, semantic DOM, console, trace.

When a run fails, the inspector shows the [failure bundle](/concepts/failure-bundles/) for that run.

| Tab | Contents |
| --- | --- |
| **Screenshot** | the page at the moment of failure (PNG) |
| **Semantic DOM** | a readable outline of the page structure at failure |
| **Console** | browser console output, with level (log / warn / error) |
| **Trace** | the full DSL execution trace |
| **Network** | placeholder — HAR capture isn't wired yet |

Artifacts are served from the run directory under `unotest/.runs/` / `.unotest/failures/`. Past runs keep their bundles (subject to retention), so you can reopen any historical failure from the **Runs** tab.

:::note
Network HAR and video aren't captured yet. Use the screenshot + console + semantic DOM + trace.
:::

---

# Overview & launch

> A local IDE-style viewer for your tests — no cloud, no account.

The viewer is your home base: browse scenarios, run them, watch each step live, and debug failures — all locally.

## Launch

```sh
npx @unotest/web viewer
```

It starts a localhost HTTP + WebSocket server and opens your browser. Set `UNOTEST_VIEWER_NO_OPEN=1` to start without opening a tab. Your agent can also launch it via the `open_viewer` MCP tool.

## What it is

- **Local-only.** No cloud, no account. It reads run artifacts from `unotest/.runs/` and your scenarios/helpers/collections from disk.
- **Read + run.** Browse and run; results stream live over WebSocket.
- **Single source of truth.** Breakpoints and variables are files on disk, so the CLI, the agent and the viewer all agree.

## Next

- [Tour](/viewer/tour/) — the activity bar and panels
- [Running tests](/viewer/running/)
- [Step debugger](/viewer/debugger/)

---

# Running tests

> Run from the UI and watch results stream live, step by step.

## Run a scenario

Open a scenario and click **Run** (headless) or **Debug** (visible browser + pause on breakpoints and failure). A run tab opens immediately and updates live.

## Live streaming

Results stream over WebSocket as each step executes — no waiting for the whole run to finish.
Step status updates in place: pending → running → passed/failed.

## Block view

The scenario renders as a **block view**: each `step("…")` is a foldable group, each DSL call a row with a status icon, line number and duration. Fold a step to see intent; unfold to see the calls.

## Error cards

When a step fails, an inline **error card** pins to the offending line — the error class, message, and the `file:line` source snippet — so you see exactly what broke without scrolling away.

## Collections

Run a whole [collection](/concepts/collections/) from its view. You get a per-scenario status list, a progress bar, and a one-click **abort** that stops all in-flight scenarios.

---

# Shortcuts & themes

> Keyboard shortcuts and theme switching in the viewer.

## Keyboard shortcuts

| Shortcut | Action |
| --- | --- |
| `⌘1` … `⌘6` | switch activity-bar section (Scenarios … Variables) |
| `⌘K` | focus the tree search/filter |
| `⌘W` | close the active tab |
| `` ⌃` `` | toggle the docked terminal |
| `Esc` | blur search / cancel a drag-reorder |

## Themes

Light and dark, toggled from the activity bar (bottom). The viewer follows your system preference on first load and remembers your choice. Every surface — block view, debugger, terminal — adapts.

---

# Terminal (AI inside)

> A docked terminal with local, Claude and Codex sessions — drive the agent without leaving the viewer.

The viewer has a docked terminal (toggle ⌃\`) so you can drive the CLI — and your AI agent — without leaving the window.

## Sessions

| Session | What it is |
| --- | --- |
| **local** | a native shell on your machine |
| **claude** | a Claude Code session, in the viewer |
| **codex** | a Codex session |

Open multiple tabs, switch between them, clear the buffer, and resize the dock. Each session is its own PTY over a dedicated WebSocket.

## Why it matters

Ask the agent to write or fix a test right here — then watch it run in the same window. The whole loop (author → run → inspect → repair) stays in one place.

---

# Tour

> The activity bar, document tabs, and inspector panel.

The viewer is a three-part IDE: an **activity bar** (left), **document tabs** (center), and an **inspector** (right).

## Activity bar

Six sections, switchable with ⌘1–⌘6:

| Section | ⌘ | Shows |
| --- | --- | --- |
| **Scenarios** | ⌘1 | scenario files with last-run status |
| **Helpers** | ⌘2 | helper files (read-only) |
| **Collections** | ⌘3 | collections — create / rename / reorder / run |
| **Runs** | ⌘4 | history of every run, filterable by status |
| **Active** | ⌘5 | live, in-progress runs |
| **Variables** | ⌘6 | edit `.env` / `.secrets`, reveal secrets, toggle flags |

At the bottom: a terminal toggle, a theme toggle, and the version.

## Document tabs

Open scenarios, helpers, collections and runs as tabs. ⌘W closes the active tab; right-click for Close / Close Others / Close All. Tabs and layout persist across reloads.

## Inspector (right)

Context-aware: during a **live run** it shows Vars / Call stack / Trace / Breakpoints; for a **failed run** it shows the artifact tabs (screenshot, semantic DOM, console, trace). See [Inspector & artifacts](/viewer/inspector/).

---

# CLI — mobile

> Every npx @unotest/mobile subcommand (iOS).

The `unotest-mobile` CLI (iOS, macOS only). Run with `npx @unotest/mobile `.

## `install`

Install a built .app on the Simulator; auto-detect bundle ID, URL scheme and permissions; optionally bootstrap unotest/ and .mcp.json.

```sh
npx @unotest/mobile install [--update-env]
```

- `--update-env` — Persist the detected app/simulator config to the environment.

**Example**

```sh
npx @unotest/mobile install ./ios/build/.../MyApp.app --update-env
```

## `init`

Bootstrap unotest/ + .mcp.json without installing an app.

```sh
npx @unotest/mobile init
```

## `doctor`

Re-check the environment: Xcode, Simulator, Node, WebDriverAgent cache.

```sh
npx @unotest/mobile doctor
```

## `e2e`

Run `unotest/e2e/.js` against the Simulator.
```sh
npx @unotest/mobile e2e [--quiet]
```

- `--quiet` — Suppress per-step trace output.

## `lint`

Static-check every scenario and helper.

```sh
npx @unotest/mobile lint
```

## `mcp`

Run as an MCP stdio server (what the editor launches).

```sh
npx @unotest/mobile
```

---

# CLI — web

> Every npx @unotest/web subcommand, with flags and examples.

The `unotest-web` CLI. Run any command with `npx @unotest/web ` — works with npm, pnpm, yarn or bun.

## `init`

Bootstrap the unotest/ layout, config, and MCP wiring for your editor. Re-run anytime — it never overwrites your edits.

```sh
npx @unotest/web init [target] [--browser system|bundled|none]
```

- `--browser` — Skip the interactive browser prompt: use system Chrome/Edge, bundled Chromium, or none.

**Example**

```sh
npx @unotest/web init --browser system
```

## `e2e`

Run a single scenario. `name` resolves to `unotest/e2e/.js`.

```sh
npx @unotest/web e2e [--debug] [--break line:col]
```

- `--debug` — Pause on breakpoints (.debugger.json) and on failure.
- `--break line:col[,line:col]` — Override breakpoints (comma-separated, no spaces).

**Example**

```sh
npx @unotest/web e2e auth/login --debug
```

## `collection`

Run a collection (a named set of scenarios), e.g. smoke or regress.

```sh
npx @unotest/web collection [--workers=N] [--bail] [--headed]
```

- `--workers=N` — Run N scenarios in parallel (default: serial).
- `--bail` — Stop after the first failure.
- `--headed` — Run with a visible browser.

**Example**

```sh
npx @unotest/web collection smoke --workers=4
```

## `viewer`

Open the local IDE-style viewer (HTTP + WebSocket). No cloud, no account.

```sh
npx @unotest/web viewer
```

- `UNOTEST_VIEWER_NO_OPEN=1` — Env: start the server without auto-opening a browser.

**Example**

```sh
npx @unotest/web viewer
```

## `lint`

Static-check scenarios and helpers. Exits non-zero on errors.

```sh
npx @unotest/web lint [paths...]
```

**Example**

```sh
npx @unotest/web lint
```

## `install-chromium`

Download Playwright’s bundled Chromium (~150 MB). Not needed if you chose system Chrome/Edge.

```sh
npx @unotest/web install-chromium
```

## `mcp`

Run as an MCP stdio server. This is what your editor launches automatically; you rarely run it by hand.

```sh
npx @unotest/web mcp
```

---

# Configuration

> Every field in unotest.config.{js,mjs,ts}.

Configuration lives in `unotest.config.{js,mjs,ts}` at your project root, auto-discovered by the CLI and MCP server.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `baseUrl` | `string (URL)` | `—` | Base for relative goto() paths. If unset, scenarios use absolute URLs. |
| `browsers` | `("chromium" / "firefox" / "webkit")[]` | `["chromium"]` | Browser families to run. Single in dev; CI can run all three. |
| `channel` | `"chrome" / "msedge" / "chrome-beta" / null` | `null` | For chromium: use a system browser (zero download) or bundled Chromium (null). |
| `viewport` | `{ width, height }` | `{ 1280, 720 }` | Browser viewport size. |
| `retry.count` | `int 0–10` | `0` | Retries for transient failures. Off in dev; 1+ in CI. |
| `retry.on` | `("transient" / "network" / "crash")[]` | `["transient"]` | Which failure classes retry. Assertion failures never retry. |
| `failureBundle.tier1` | `true (locked)` | `true` | Always on: error + console + semantic DOM snapshot + DSL trace. |
| `failureBundle.tier2` | `boolean` | `true` | Screenshots (viewport + element-focused). |
| `failureBundle.tier3` | `{ network, video }` | `{ false, false }` | HAR + video. Not wired yet — leave off. |
| `failureBundle.retention` | `{ runs, days }` | `{ 20, 7 }` | Keep N recent runs; delete older than D days. |
| `failureBundle.storageDir` | `string` | `.unotest/failures` | Where failure bundles are written. |
| `dialogPolicy` | `"accept" / "dismiss" / "manual"` | `"accept"` | How native dialogs (alert/confirm) are handled. |
| `storageState` | `string (path)` | `—` | Playwright storageState.json for cached login. |
| `defaultTimeoutMs` | `int` | `3000` | Action / locator-resolution timeout. Override per call. |
| `defaultNavigationTimeoutMs` | `int` | `15000` | Navigation timeout (goto/reload/waitForUrl). |
| `testDir` | `string` | `unotest/e2e` | Scenario directory. |
| `helpersDir` | `string` | `unotest/e2e/_helpers` | Helper functions directory. |
| `sandbox.shellCwd` | `string` | `cwd` | Working directory for shell(). |
| `sandbox.database` | `string (URL)` | `—` | Connection string for dbQuery/dbExec (postgres://, mysql://, sqlite:). |
| `sandbox.apiBaseUrl` | `string (URL)` | `—` | Base URL for apiCall(method, path). |
| `linter.enabled` | `boolean` | `true` | Enable the scenario linter. |
| `mcp.transport` | `"stdio"` | `"stdio"` | MCP server transport (stdio in the current release). |

---

# DSL reference

> Every function available inside a scenario, grouped — the unotest web vocabulary.

Scenarios are plain `.js` on a sandboxed engine. The vocabulary mirrors Playwright, so it reads the way you expect. Every executable step lives inside `step("intent", () => { ... })`.

:::tip[Selector priority]
getByTestId → getByRole(name) → getByLabel → getByText → locator(css)
:::

## Navigation

Drive the page: load URLs, go back/forward, and wait for state.

- `goto(url, { waitUntil?, timeout? })` — Navigate to a URL. `waitUntil`: 'load' | 'domcontentloaded' | 'networkidle'. Relative paths resolve against `baseUrl`.
- `reload({ waitUntil?, timeout? })` — Reload the current page.
- `goBack({ timeout? })` — Navigate back in history.
- `goForward({ timeout? })` — Navigate forward in history.
- `waitForUrl(pattern, { timeout? })` — Wait until the URL contains `pattern` (substring match).
- `waitForNavigation({ timeout? })` — Wait for the next navigation event.
- `waitFor(locator, { state?, timeout? })` — Wait for an element state: 'attached' | 'visible' | 'hidden'.
- `waitForText(text, { timeout? })` — Wait until `text` appears anywhere on the page.
- `pause(ms)` — Explicit delay. Discouraged — the linter wants a `// reason:` comment. Prefer a `waitFor*`.

## Locators

Build a locator. Prefer the most stable matcher available — see Stable selectors.

- `getByTestId(id)` — Match by `data-testid`. Most stable — prefer this.
- `getByRole(role, { name?, exact? })` — Match by ARIA role + accessible name. `name` accepts a string or `/regex/`.
- `getByLabel(text, { exact? })` — Match a form control by its associated label.
- `getByText(text, { exact? })` — Match by visible text content.
- `getByPlaceholder(text, { exact? })` — Match an input by its placeholder.
- `getByAltText(text, { exact? })` — Match an image by its `alt` text.
- `getByTitle(text, { exact? })` — Match by `title` attribute.
- `locator(css)` — Match by raw CSS selector. Last resort — the linter warns on deep/brittle CSS.

## Chain refiners

Narrow a locator. Chains desugar to free calls: `getByRole(...).filter(...).first()`.

- `filter(locator, { hasText?, hasNotText?, has?, hasNot? })` — Keep matches by contained text or a nested child locator.
- `first(locator)` — First match.
- `last(locator)` — Last match.
- `nth(locator, index)` — The N-th match (0-indexed). Index-only refinement is fragile (linter info).
- `contentFrame(locator)` — Resolve an `