Playwright Test Agents: Planner, Generator and Healer Guide
Playwright test agents are AI helpers that plan, generate, and repair tests automatically, reducing manual test creation and maintenance.
If you have worked with end-to-end tests long enough, you know the real cost is not writing the first test. It is maintaining the next hundred. A small UI change breaks selectors. CI turns red. Instead of shipping features, you fix tests that were passing yesterday.
That is why Playwright v1.56 introduced Playwright test agents in October 2025. The goal is simple - reduce the manual work involved in planning, writing, and maintaining Playwright tests by letting AI handle the repetitive parts while you stay in control.
In this guide, you will learn:
- What Playwright test agents actually are
- How they work under the hood
- How to set them up in your project
- How to use them safely in CI
- Where they help, and where human review still matters
If maintaining your test suite takes more time than building it, this is worth your attention.
What are Playwright test agents
Playwright test agents are AI-driven helpers built into Playwright starting from v1.56. They assist with planning test scenarios, generating Playwright test code, and repairing broken tests by interacting with a real browser session.
Instead of relying only on manual scripting and maintenance, teams can use these agents to handle structured exploration, code creation, and test repair based on live application behavior.
There are three agents, each responsible for a different stage of the testing lifecycle:
- Planner - explores the application and creates structured test plans
- Generator - converts test plans into executable Playwright test files
- Healer - detects and fixes failing tests caused by UI or locator changes
Planner vs Generator vs Healer
| Agent | Primary Role | Input | Output | Best Used For |
|---|---|---|---|---|
| Planner | Scenario discovery and planning | Seed test + running app | Markdown test plan | New features, coverage mapping |
| Generator | Test code creation | Markdown test plan | Playwright .spec.ts files | Building automation quickly |
| Healer | Test maintenance and repair | Existing failing test suite | Updated and stabilized test files | UI changes, locator drift |
Together, these agents introduce structured automation across planning, authoring, and maintenance, while keeping your standard Playwright setup unchanged.
Why Playwright introduced Planner, Generator, and Healer
Modern web applications change constantly. UI components get refactored. Class names change. Layouts shift. Most of the time, the product still works, but the tests do not. As a result, teams often spend more time fixing broken test automation than validating new functionality.
This creates three consistent pain points in test automation:
- Test planning is manual and time-consuming
- Building large, reliable test suites takes significant effort
- Test maintenance becomes a continuous burden after every release
Over time, the testing workflow turns into a repetitive cycle:
Plan → Write → Fix
Playwright introduced Planner, Generator, and Healer to reduce that repetition. Instead of engineers handling every stage manually, parts of that loop can now be assisted or automated:
Agent plans → Agent writes → Agent fixes
The goal is not to remove human oversight. It is to reduce the time spent on repetitive test work so teams can focus on real defects, coverage gaps, and product quality.
How do Playwright test agents work?
Playwright test agents use the Model Context Protocol (MCP) to connect a large language model with a real browser. The AI does not guess what the page looks like. It interacts with the actual application, observes live DOM state, and makes decisions based on real behavior.
Here is the high-level flow:
- Planner explores the app using a real browser session
- Planner writes a markdown test plan with scenarios, steps, and assertions
- Generator reads the plan and produces Playwright test files
- Tests run in CI like any standard Playwright suite
- Healer detects and fixes broken tests automatically
Three layers work together to make this happen.
Playwright Engine handles the browser automation through the Chrome DevTools Protocol. This is the same foundation that powers every standard Playwright test.
LLM Layer uses a large language model (GPT, Claude, or similar) to interpret DOM structure, page routes, and application behavior. The model receives structured snapshots rather than raw screenshots, which keeps it accurate and token-efficient.
Orchestration Loop coordinates the exchange between the engine and the LLM. It sends page context to the model, receives instructions back, executes browser actions, and repeats until the task is complete.
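The loop above can be sketched in a few lines. This is a conceptual sketch, not Playwright's actual internals — the type names and interfaces here are invented for illustration:

```typescript
// Conceptual sketch of the orchestration loop: send structured page state
// to the model, execute the action the model returns, repeat until done.

export type Snapshot = { url: string; elements: string[] };

export type Action =
  | { kind: "click"; target: string }
  | { kind: "fill"; target: string; value: string }
  | { kind: "done"; summary: string };

export interface Model {
  decide(snapshot: Snapshot, goal: string): Action;
}

export interface Browser {
  snapshot(): Snapshot;
  execute(action: Action): void;
}

export function runAgentLoop(
  browser: Browser,
  model: Model,
  goal: string,
  maxSteps = 20,
): string {
  for (let step = 0; step < maxSteps; step++) {
    const snap = browser.snapshot();         // structured DOM state, not raw pixels
    const action = model.decide(snap, goal); // the LLM picks the next action
    if (action.kind === "done") return action.summary;
    browser.execute(action);                 // the engine performs the real browser action
  }
  throw new Error("Agent did not finish within the step budget");
}
```

The important design point is that the model never acts directly: it only proposes the next action, and the Playwright engine executes it against the real page, so every decision is grounded in live DOM state.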
This is what separates Playwright test agents from generic AI code generators. A code generation tool predicts what your page might look like. Playwright test agents interact with what your page actually does.
How can AI plan, generate, and heal Playwright tests automatically?
The Planner explores your live application through a real browser, discovers user flows and edge cases, and produces structured markdown test plans. The Generator reads those plans, opens the application, verifies selectors against the real DOM, and writes test files with stable locators and assertions. The Healer fixes broken tests by analyzing failure traces, identifying root causes, and applying targeted code changes at runtime.
Let's look at each in detail.
How the Planner agent discovers test scenarios
The Planner does not ask you to list every test case upfront. It explores your application the way a QA engineer would during an exploratory session, except that it works systematically and documents everything as it goes.
The process works like this:
- Planner runs your seed test (tests/seed.spec.ts) to set up the base environment - authentication, initial navigation, and test data
- It opens the application in a real browser and begins navigating through pages and user flows
- At each step, it inspects the DOM to identify interactive elements, forms, navigation links, and key UI components
- It maps out user journeys - happy paths, error states, boundary conditions, and edge cases
- It writes a structured markdown test plan in the specs/ folder, with scenarios, steps, expected results, and assertions
- Each scenario in the plan is detailed enough for the Generator to convert directly into executable test code
The output is not a vague list of ideas. It is a precise, step-by-step specification that covers what to test, how to test it, and what the expected outcome should be. Teams looking for a quick reference on Playwright syntax can also pair this with the Playwright cheatsheet to review locator patterns and assertion strategies.
For example, if the Planner explores an e-commerce checkout flow, it does not just write "test checkout." It produces scenarios like "guest user adds item to cart, proceeds to checkout, enters shipping details, and sees order confirmation," along with edge cases like "user submits checkout with an expired credit card and sees a validation error."
The key advantage here is coverage. A human tester might focus on the obvious paths and miss less common flows. The Planner systematically works through the application's UI, identifying scenarios that a manual approach might overlook. It also structures the plan in a consistent format, which means the Generator can process it without ambiguity.
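An illustrative fragment of such a plan might read like this (the exact layout of generated specs may differ; the scenario content is invented for the checkout example above):

```md
## Scenario: Guest checkout with valid details

Steps:
1. From a product page, click "Add to cart"
2. Open the cart and click "Checkout"
3. Enter shipping details as a guest
4. Submit the order

Expected results:
- The order confirmation page is displayed
- The confirmation shows an order number
```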
How the Generator agent creates tests
When the Generator receives a spec file, it does not produce code from a template. It opens your application in a real browser and validates every step.
The process works like this:
- Generator reads a spec file (for example, specs/checkout-flow.md)
- It launches the app using your seed test as the base
- For each scenario, it navigates to the correct page and inspects the DOM
- It selects locators using Playwright's preferred strategies - role-based, text-based, and test-id selectors
- It writes test code with proper assertions, waits, and error handling
- Each output file maps one-to-one with a scenario in the spec
The result is code that reads like it was written by a senior SDET. Not brittle CSS selectors. Not XPath chains that break when someone moves a div. Actual production-grade locators. For teams exploring other AI test generation tools, this approach stands out because the output is validated against a live application, not predicted from static code.
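A generated file for a guest-checkout scenario might look like the sketch below. The URL, labels, and button copy are placeholder assumptions for illustration, not output from a real run:

```ts
import { test, expect } from '@playwright/test';

test('guest user completes checkout', async ({ page }) => {
  await page.goto('https://your-app.com/products/blue-t-shirt'); // placeholder URL
  await page.getByRole('button', { name: 'Add to cart' }).click();
  await page.getByRole('link', { name: 'Cart' }).click();
  await page.getByRole('button', { name: 'Checkout' }).click();
  await page.getByLabel('Email').fill('guest@example.com');
  await page.getByRole('button', { name: 'Place order' }).click();
  await expect(page.getByText('Order confirmed')).toBeVisible();
});
```

Note that every locator is role-, label-, or text-based - the strategies the Generator prefers because they survive styling and markup refactors.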
One documented case showed a team generating 82 end-to-end tests for an e-commerce application using the Playwright Skill with Claude Code. Product browsing, cart operations, checkout flows. All from a structured plan, all validated against the live app.
How the Healer agent fixes tests
The Healer is where teams with large existing suites get the most value. Here is what happens when a test fails:
- The Healer runs the failing test in debug mode
- It checks console logs, network requests, and page snapshots at the failure point
- It performs root cause analysis: is this a selector issue, a timing problem, or an actual application bug?
- If the test is the problem, the Healer updates the code. It picks better selectors, adjusts waits, or modifies assertions
- It re-runs the test to confirm the fix works
- If the application itself is broken (not the test), it marks the test as skipped
That last point is important. The Healer does not patch around real bugs. If a checkout button genuinely stopped working, the Healer flags it instead of rewriting the test to ignore the failure. You still know something is wrong. You just don't waste time assuming it's a test problem.
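That triage step can be sketched conceptually. The following is assumed logic for illustration, not the Healer's actual implementation — the evidence fields and decision order are invented:

```typescript
// Conceptual sketch of the Healer's triage: given evidence collected at
// the failure point, decide whether to repair the test or flag a real bug.

export type FailureEvidence = {
  selectorMatched: boolean;        // did the original locator find anything?
  equivalentElementFound: boolean; // does a matching element exist under another locator?
  timedOut: boolean;
  consoleErrors: string[];
};

export type Verdict = "fix-selector" | "fix-timing" | "app-bug";

export function triage(evidence: FailureEvidence): Verdict {
  // Locator drift: the element is still there, just no longer matched.
  if (!evidence.selectorMatched && evidence.equivalentElementFound) {
    return "fix-selector";
  }
  // Timing flake: the wait expired, but nothing errored on the page.
  if (evidence.timedOut && evidence.consoleErrors.length === 0) {
    return "fix-timing";
  }
  // Anything else is treated as a real application problem: skip and flag.
  return "app-bug";
}
```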
A quick example
Say your e-commerce site's checkout button CSS class changes from .btn-checkout to .btn-primary-checkout after a frontend refactor. In a traditional setup, every test clicking that button breaks. Someone has to find the affected tests, update selectors, and re-run the suite.
With the Healer, the process looks different. It detects the failure, inspects the page, sees the button text and ARIA role haven't changed, switches to page.getByRole('button', { name: 'Checkout' }), updates the test file, and confirms the test passes. No developer time spent. No Jira ticket. No "can someone look at this flaky test in standup." It just gets fixed.
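In code, the change the Healer applies in this scenario would look roughly like this (illustrative):

```ts
// Before: tied to a CSS class that changed in the refactor
await page.locator('.btn-checkout').click();

// After: tied to the button's role and accessible name, which did not change
await page.getByRole('button', { name: 'Checkout' }).click();
```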
Setting up Playwright test agents
Getting started requires Playwright v1.56 or later and a compatible AI tool. The setup takes about five minutes.
Step 1 - Install the latest Playwright
```bash
npm install -D @playwright/test@latest
npx playwright install chromium
```
Step 2 - Initialize the agents
Run the init command with your preferred AI loop. Playwright supports VS Code (with Copilot), Claude, and OpenCode:
```bash
# For VS Code with Copilot
npx playwright init-agents --loop=vscode

# For Claude Code
npx playwright init-agents --loop=claude

# For OpenCode
npx playwright init-agents --loop=opencode
```
This generates agent definition files and a seed test. The definitions are markdown-based configuration files stored in your .github/ folder. They describe each agent's behavior, instructions, and available tools.
Note: VS Code v1.105 or later is required for the agentic experience to work in VS Code.
Step 3 - Configure your seed test
The seed test (tests/seed.spec.ts) is the starting point for all agent activity. It sets up the base environment - authentication, test data, navigation.
```ts
import { test } from '@playwright/test';

test('seed', async ({ page }) => {
  await page.goto('https://your-app.com');
  // Add login or setup logic here
});
```
The Planner runs this seed test before it starts exploring. If your app needs authentication, add the login flow here. Everything the agents do builds on this starting point.
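For an app behind a login, the seed test might grow into something like the sketch below. The field labels, environment variable names, and post-login route are assumptions for illustration - adapt them to your app:

```ts
import { test } from '@playwright/test';

test('seed', async ({ page }) => {
  await page.goto('https://your-app.com');
  // Hypothetical login flow - adjust labels and credentials to your app
  await page.getByLabel('Email').fill(process.env.TEST_USER_EMAIL ?? '');
  await page.getByLabel('Password').fill(process.env.TEST_USER_PASSWORD ?? '');
  await page.getByRole('button', { name: 'Sign in' }).click();
  await page.waitForURL('**/dashboard'); // assumed post-login route
});
```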
Step 4 - Run the Planner
Open your AI tool's chat, select planner mode, and prompt it:
```
Explore the app and generate a test plan for user
registration and checkout flows. Use seed.spec.ts as base.
```
The Planner navigates your app, discovers UI elements and user flows, and produces a markdown file in the specs/ folder with scenarios, steps, expected results, and edge cases.
Step 5 - Generate tests
Switch to generator mode and point it to the plan:
```
Use the test plan in specs/checkout-flow.md to generate
Playwright tests. Save them in tests/checkout/
```
The Generator reads the spec, opens the live app, verifies selectors, and writes Playwright test scripts mapped to each scenario.
Step 6 - Heal and validate
Run the Healer against your new or existing suite:
```
Run the playwright test healer on the test suite in /tests.
Fix any failing tests and verify your fixes.
```
The Healer executes tests, finds failures, applies fixes, and re-runs until everything passes or it flags genuinely broken functionality.
Project structure after setup
```
repo/
├── .github/                # Agent definitions (planner, generator, healer)
├── specs/                  # Markdown test plans
│   └── checkout-flow.md
├── tests/
│   ├── seed.spec.ts        # Base environment setup
│   └── checkout/           # Generated test files
│       ├── guest-checkout.spec.ts
│       └── registered-checkout.spec.ts
└── playwright.config.ts
```
Regenerate agent definitions whenever you update Playwright. Run npx playwright init-agents again to pick up new tools and instructions.
Using Playwright test agents in CI/CD
The agents themselves are interactive tools, designed for use through VS Code Copilot, Claude Code, or OpenCode. But the tests they produce are standard Playwright tests. Your CI pipeline runs them the same way it runs any other Playwright suite.
```yaml
# .github/workflows/playwright.yml
name: Playwright Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: playwright-report
          path: playwright-report/
```
The if: always() on artifact upload is critical. Without it, failed test reports do not get saved, and those are exactly the reports you need.
Tracking stability of agent-generated tests
Here is the thing most teams overlook. Running agent-generated tests in CI is easy. Knowing whether those tests are actually stable across builds is a different problem.
When you're producing tests with the Generator and repairing them with the Healer, you need answers to questions that raw CI logs cannot provide:
- Which tests were healed and how often do they break again?
- Are healing events increasing or decreasing over time?
- Is a failure a new regression or the same flaky test from last week?
This is where a reporting layer becomes necessary. TestDino tracks test stability patterns across CI runs. It classifies failures into categories - actual bug, flaky test, or UI change - and gives you historical context for every failure. If the Healer fixed a test last Tuesday and it broke again on Thursday, you can see that pattern immediately instead of re-investigating from scratch.
For teams using Playwright test agents at scale, that kind of visibility is the difference between trusting your suite and guessing. Test analytics help you understand trends over time. Without that visibility, you are generating tests faster than you can verify whether they are actually reliable.
The workflow that works
The strongest setup looks like this:
- Planner discovers scenarios and writes specs
- Generator creates test files from specs
- Tests run in CI on every push
- Failures get classified and tracked in a reporting tool
- Healer runs periodically to fix locator drift and unstable tests
- Reporting confirms whether healed tests stay stable or keep breaking
That feedback loop is what turns Playwright test agents from a cool experiment into a reliable part of your pipeline.
Limitations you should know
Playwright test agents are useful. They are not perfect. Being honest about the limits helps you use them well.
Selectors are not always right. The AI picks good locators most of the time, but it can still choose unstable ones. A text locator works great until someone changes button copy. A role-based locator breaks if ARIA roles are missing. Always review generated code before merging. The Playwright Trace Viewer can help you inspect exactly what the agent saw during test execution.
Complex UI changes need a human. If a redesign changes the entire user flow, not just a selector, the Healer cannot redesign the test. It fixes locators. It does not rewrite test logic. That is still your job.
TypeScript and JavaScript only. Playwright test agents currently support the JS/TS test runner. Teams using Playwright for Python do not have official agent support yet, though it is a requested feature on GitHub.
Over-reliance is a real risk. When tests write and fix themselves, there is a temptation to stop reviewing them. Do not do that. AI-generated tests should go through the same code review process as anything else in your codebase.
Agents work best on stable base environments. If your seed test is flaky, if auth breaks intermittently, or if the test environment is unreliable, the Planner produces weak plans and the Generator writes fragile tests. Garbage in, garbage out. If your suite is running slow, that compounds the problem. This is the most common reason teams have a bad first experience with Playwright test agents. Fix the foundation first. Then let the agents do their work.
Best practices for production teams
These are practical patterns that help teams get consistent results from Playwright test agents.
Start with a solid seed test. Your seed test is the foundation. If it does not reliably set up the right environment, nothing the agents produce will be reliable either. Spend time getting authentication, test data, and navigation right before asking agents to explore.
Keep specs in version control. Treat markdown test plans like documentation. Review them in pull requests. Good specs produce good tests. Bad specs produce tests you will rewrite manually anyway.
Add data-testid attributes to critical elements. The Generator prefers test IDs when they exist. Adding them to key buttons, forms, and navigation elements gives the agent better options and produces more stable tests.
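For example, a test ID on a critical button gives the Generator an unambiguous, style-proof locator to prefer (the attribute value here is illustrative):

```ts
// Markup: <button data-testid="checkout-button">Checkout</button>
await page.getByTestId('checkout-button').click();
```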
Run the Healer on a schedule. Do not wait for CI to break. Set up a weekly Healer run against your full suite. Catching small locator drift early is cheaper than debugging a wall of failures after a major release.
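Since the Healer itself runs interactively through your AI tool, one practical pattern is a scheduled CI run of the plain test suite that surfaces locator drift early, so you know when a Healer session is worth kicking off. A minimal sketch (the workflow filename and cron schedule are arbitrary choices):

```yaml
# .github/workflows/weekly-drift-check.yml (hypothetical file)
name: Weekly drift check
on:
  schedule:
    - cron: '0 6 * * 1' # every Monday at 06:00 UTC
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test
```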
Pair agent output with a reporting tool. Agents create and fix tests, but they don't show you the big picture. A tool like TestDino gives you build-level visibility into test stability, flaky test trends, and failure classification. When the Healer fixes a test, your reporting tool confirms whether that fix actually held or just delayed the problem.
Regenerate agent definitions after every Playwright update. When you upgrade Playwright, re-run npx playwright init-agents to get updated tools and instructions. Stale definitions mean your agents miss improvements and may not work correctly with newer Playwright features.
Conclusion
Playwright test agents bring real automation to the parts of testing that have always been manual. The Planner discovers what to test, the Generator writes the code, and the Healer keeps it working. Together, they cut the time teams spend on test creation and maintenance without removing human oversight where it matters.
That said, generating and fixing tests is only half the problem. Knowing whether those tests stay stable across builds is what separates a reliable suite from one that quietly rots. TestDino fills that gap by tracking test stability, classifying failures, and giving you historical context on every flaky or healed test in your pipeline.
If your team spends more time fixing tests than writing features, Playwright test agents paired with a reporting layer like TestDino give you a path out. Start with a solid seed test, let the agents handle the repetitive work, and use the data to stay confident that your suite actually means something.