Cursor with Playwright: Generate, Run & Fix Tests with MCP

Connect Cursor to Playwright via MCP for real browser context during test generation. Full setup: mcp.json config, Playwright Skills, .cursorrules, CLI batch generation, and TestDino reporting.

Cursor with Playwright is where the AI stops guessing. Without Playwright MCP connected, Cursor's agent works from your codebase alone, with no idea what your app actually looks like. Connect MCP and the AI reads the live accessibility tree, sees real DOM state, and generates locators that work.

The difference isn't subtle. AI without browser context writes tests that pass against documentation examples and fail against your app. This guide sets up the full stack (MCP, CLI, Skills, .cursorrules) and skips everything that doesn't directly improve test quality.

TL;DR
  • Connect Playwright MCP via .cursor/mcp.json for live browser access

  • Use Playwright CLI for batch sessions (75% fewer tokens than MCP)

  • Load Skills with npx skills add testdino-hq/playwright-skill. This loads 70+ production Playwright guides

  • Write your own .cursorrules, don't copy someone else's

  • MCP is better for debugging than bulk generation. Use Codegen or CLI for recording flows

  • Run with npx tdpw test for centralized reporting and flaky detection

Prerequisites

terminal
node --version   # 18+
npm --version    # 8+

  • Cursor IDE installed (cursor.com/downloads)

  • A Playwright project with at least 1 passing test

  • Playwright browsers installed: npx playwright install --with-deps
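If you don't yet have a passing test, a minimal smoke spec is enough to verify the toolchain before connecting MCP. This is an illustrative sketch: the file name is a suggestion, and it targets the public Playwright site rather than your app.

```typescript
// tests/smoke.spec.ts -- minimal spec to confirm the project runs
// (hypothetical file name; any passing test satisfies the prerequisite)
import { test, expect } from '@playwright/test';

test('playwright is wired up', async ({ page }) => {
  await page.goto('https://playwright.dev');
  await expect(page).toHaveTitle(/Playwright/);
});
```

Run it with `npx playwright test tests/smoke.spec.ts` before moving on.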

Step 1: Connect Playwright MCP to Cursor

Quick setup: Use the one-click "Add Playwright MCP to Cursor" install link, then restart Cursor. Done.

Manual setup:

terminal
npm install --save-dev @playwright/mcp

Create .cursor/mcp.json:

.cursor/mcp.json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

.cursor/mcp.json file, configuring the Playwright MCP server using npx in Cursor.

Restart Cursor. Go to Settings > Tools & MCP. The playwright server should appear with a green status.

Verify: Ask Cursor: "Open playwright.dev and tell me the first H1." If the browser opens and Cursor returns the heading, you're live.

For the full installation walkthrough with troubleshooting for every error, see the Playwright MCP Cursor installation guide.


Two config options worth knowing

Snapshot vs vision mode. Snapshot (default) reads the accessibility tree. It's semantic and accurate. Vision mode uses coordinates for canvas elements and custom-drawn UI. Most projects never need vision mode.

.cursor/mcp.json (vision mode)
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest", "--caps=vision"]
    }
  }
}

Persistent vs isolated sessions. Default keeps login state between sessions. Add --isolated for a clean slate each run:

.cursor/mcp.json (isolated)
{
  "args": ["@playwright/mcp@latest", "--isolated"]
}

Step 2: Install Playwright CLI for batch generation

MCP streams the full accessibility tree on every response. One session can hit 114,000 tokens. For generating 3+ tests, switch to CLI.

terminal
npm install -g @playwright/cli@latest
playwright-cli install
npx skills add testdino-hq/playwright-skill/playwright-cli

Playwright CLI installation output showing successful setup

| Method | Tokens/session | Use when |
| --- | --- | --- |
| Playwright MCP | ~114,000 | Single test, live DOM inspection, debugging |
| Playwright CLI | ~27,000 | Batch generation, 3+ tests per session |
| Playwright Codegen | 0 | Recording a known flow, selector discovery |

Step 3: Load Playwright Skills

Skills are curated markdown guides that teach the AI production Playwright patterns. Without them, the AI writes from public documentation examples: CSS selectors, no auth handling, no fixture isolation. With Skills loaded, it writes getByRole, proper storageState auth, and fixture isolation from the start.

Playwright Skills repository showing the 5 pack folders

terminal
# All 70+ guides at once
npx skills add testdino-hq/playwright-skill

# Or by pack
npx skills add testdino-hq/playwright-skill/core        # 46 locator/assertion/auth guides
npx skills add testdino-hq/playwright-skill/ci          # 9 GitHub Actions / GitLab CI guides
npx skills add testdino-hq/playwright-skill/playwright-cli  # 11 CLI automation guides

| Pack | Guides | Covers |
| --- | --- | --- |
| core/ | 46 | Locators, assertions, waits, auth, fixtures, POM |
| playwright-cli/ | 11 | CLI browser automation |
| pom/ | 2 | Page Object Model patterns |
| ci/ | 9 | GitHub Actions, GitLab CI, parallel execution |
| migration/ | 2 | Moving from Selenium |

The repo is MIT licensed. Fork it and add your team's patterns: your auth flow, your component library's ARIA structure, your internal helpers. Your fork works the same way as the base Skills.
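In a fork, a team-specific skill is just another markdown guide. A sketch of what one might look like (the path and content below are illustrative, not part of the base repo):

```markdown
<!-- skills/team/auth-flow.md -- hypothetical team-specific skill -->
# Our auth flow

- All tests authenticate via storageState, never through the UI
- The shared login helper lives in tests/helpers/auth.ts
- Staging credentials come from STOREDEMO_EMAIL / STOREDEMO_PASSWORD env vars
```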

Step 4: Write your own .cursorrules

Don't download rule sets from GitHub. Generic rules don't fit your project, and they stop you learning which instructions your AI editor actually responds to. Build yours organically: when Cursor makes a mistake, add a rule. When you repeat the same instruction in 3 prompts, turn it into a rule.

Start here and expand from what breaks:

.cursorrules
# Playwright rules for [your project name]

## Locators
- Use getByRole, getByTestId, or getByLabel
- Never use CSS selectors, XPath, or page.locator() with class strings
- Check the live accessibility tree via MCP before choosing a locator

## Structure
- 1 file per feature or user flow
- Wrap related tests in describe() blocks
- Keep each test under 30 lines
- Name files as feature-name.spec.ts

## Timing
- Never use page.waitForTimeout() or fixed delays
- Use waitForLoadState('networkidle') for page transitions
- Use element.waitFor({ state: 'visible' }) for element state

## Test data
- All credentials and test data via fixtures or env vars
- Tests must pass in isolation, no dependency on prior tests
- Use storageState for auth; never log in through the UI per test

## Assertions
- Assert the final outcome, not intermediate DOM states
- Use toBeVisible, toHaveText, toHaveURL

## Output
- Return diffs, not full files
- Add a 1-line comment at the top explaining what the test covers

.cursorrules file open in Cursor with the rules visible

Commit this to version control. Every developer gets the same AI behavior.

Cursor custom commands. Beyond rules (always loaded), Cursor supports slash commands stored in .cursor/commands/. Create a generate-test command that defines your Arrange-Act-Assert workflow, references existing spec files, and invokes Skills before generating. Type / in Cursor chat and select "Create command."
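A sketch of what such a command file might contain (the file name and wording are illustrative; adapt the workflow to your project):

```markdown
<!-- .cursor/commands/generate-test.md -- hypothetical command file -->
Generate a Playwright test for the feature I describe.

1. Arrange: inspect the live page via Playwright MCP and read an
   existing spec in tests/ for conventions.
2. Act: drive the flow using getByRole / getByLabel locators only.
3. Assert: verify the final outcome, not intermediate DOM state.

Follow .cursorrules. Return a diff, not a full file.
```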

Screenshots as context. When a locator isn't working, paste a screenshot of the page alongside your failing test directly into Cursor chat. The AI identifies the element faster from a visual than from a text description. Open DevTools to the Elements panel before screenshotting , the AI will use the HTML structure too.

Choose the right model

Model selector dropdown in Cursor showing Claude Sonnet, Opus, and Composer options

| Model | Speed | Multi-file accuracy | Cost (in/out per 1M tokens) | Best for |
| --- | --- | --- | --- | --- |
| Cursor Composer 1.5 | Very fast | Good | Included | Single-test iteration, quick edits |
| Claude Sonnet 4.6 | Moderate | High | $3 / $15 | Multi-file generation, complex flows |
| Claude Opus 4.6 | Slower | Highest | $5 / $25 | Large fixture refactors, suite rewrites |
| GPT-5.2 Codex | Fast | Good | $6 / $30 | Scaffolding, CI config, code review |

Sonnet scores 79.6% on SWE-bench at roughly a third of Opus's cost. For most Playwright generation work, it's the right default. Switch to Composer for quick single-spec iteration, Opus only when refactoring across many files.

Expert Insight: Staying on one model long enough to understand how it responds to your prompts is worth more than chasing each new release. Cursor's "auto" mode handles selection reasonably if you'd rather not manage this.

Generate tests

With Playwright MCP: use when you need to see the live page

MCP is genuinely better for debugging and inspection than bulk generation. For the first test in a new feature area, or anything with complex auth, it's the right choice.

Prompt:

Cursor prompt
Generate a Playwright test for the login flow on https://storedemo.testdino.com.
Use Playwright MCP to inspect the page.
- Navigate to the site, open the login page
- Sign in with env vars (STOREDEMO_EMAIL, STOREDEMO_PASSWORD)
- Verify the user is logged in by checking the dashboard heading
Use getByRole or getByLabel. Follow .cursorrules.

Generated test:

tests/auth/login-flow.spec.ts
// tests/auth/login-flow.spec.ts
// Covers: successful login and dashboard access

import { test, expect } from '@playwright/test';

test.describe('Login', () => {
  test('user can sign in with valid credentials', async ({ page }) => {
    const email = process.env.STOREDEMO_EMAIL;
    const password = process.env.STOREDEMO_PASSWORD;

    if (!email || !password) {
      throw new Error('Set STOREDEMO_EMAIL and STOREDEMO_PASSWORD in .env');
    }

    await page.goto('/login');
    await page.getByLabel(/email/i).fill(email);
    await page.getByLabel(/password/i).fill(password);
    await page.getByRole('button', { name: /sign in/i }).click();
    await page.waitForLoadState('networkidle');

    await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
  });
});

Expected output:

terminal
  ✓  tests/auth/login-flow.spec.ts › Login › user can sign in (1.8s)
  1 passed (3.2s)

Cursor terminal showing a passing Playwright test output

Notice: getByLabel for form fields, getByRole('button') for the action, getByRole('heading') for the assertion. No CSS. No IDs. This test survives a full UI redesign as long as the semantic structure holds.

Playwright config to match:

playwright.config.ts
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  use: {
    baseURL: 'https://storedemo.testdino.com',
    trace: 'on-first-retry',
  },
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
});
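The .cursorrules above mandate storageState auth instead of logging in through the UI in every test. One common way to set that up is a dedicated setup spec that signs in once and saves the session; this is a sketch following the standard Playwright pattern, with file paths and project wiring as assumptions:

```typescript
// tests/auth.setup.ts -- signs in once, saves session state for reuse
// (illustrative sketch; file names and the storage path are assumptions)
import { test as setup, expect } from '@playwright/test';

const authFile = 'playwright/.auth/user.json';

setup('authenticate', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel(/email/i).fill(process.env.STOREDEMO_EMAIL!);
  await page.getByLabel(/password/i).fill(process.env.STOREDEMO_PASSWORD!);
  await page.getByRole('button', { name: /sign in/i }).click();

  // Wait for the logged-in state before persisting cookies/localStorage
  await expect(page.getByRole('heading', { name: /dashboard/i })).toBeVisible();
  await page.context().storageState({ path: authFile });
});
```

Tests then reuse the saved state via `storageState: 'playwright/.auth/user.json'` in a project's `use` block, with the setup spec listed as a dependency.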

With Playwright CLI: use for batch sessions (75% fewer tokens)

Cursor prompt
Using Playwright CLI, generate tests for the complete checkout flow on
https://storedemo.testdino.com. 4 steps: add to cart, proceed to checkout,
enter shipping, confirm order. 1 test per step. Follow .cursorrules.
Return diffs only.

Expected output:

terminal
  ✓  tests/checkout/add-to-cart.spec.ts (2.1s)
  ✓  tests/checkout/proceed-to-checkout.spec.ts (1.9s)
  ✓  tests/checkout/shipping-details.spec.ts (3.4s)
  ✓  tests/checkout/confirm-order.spec.ts (2.8s)
  4 passed (6.1s)

With Codegen + AI cleanup (fastest for known flows)

terminal
# Record the flow: zero AI tokens
npx playwright codegen https://storedemo.testdino.com

Then in Cursor: "Clean up this recorded test to follow our .cursorrules. Replace CSS selectors with getByRole or getByTestId. Add fixture isolation."

Speed from recording, quality from AI refinement.
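The cleanup step typically turns brittle recorded selectors into semantic ones. A before/after sketch (the recorded selectors below are illustrative, not actual codegen output for this site):

```typescript
// Before: raw codegen output (selectors illustrative)
await page.locator('#email-input').fill('user@example.com');
await page.locator('.btn.btn-primary').click();

// After: cleaned up to match .cursorrules
await page.getByLabel(/email/i).fill(process.env.STOREDEMO_EMAIL!);
await page.getByRole('button', { name: /sign in/i }).click();
```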

Common errors and fixes

MCP server doesn't start

terminal
node --version                         # Must be 18+
npx playwright install --with-deps     # Install browser binaries
cat .cursor/mcp.json | python3 -m json.tool  # Validate JSON syntax

Then restart Cursor completely, not just "Reload Window." For the full error catalog see the Playwright MCP troubleshooting guide.

Tests still generating CSS selectors despite .cursorrules

The rules file is in the wrong directory, or it's too long and gets truncated. Move Playwright-specific rules to a scoped file:

.cursor/rules/playwright.mdc
---
globs: "**/*.spec.ts"
---

- Use getByRole, getByTestId, or getByLabel
- Never use CSS selectors or XPath

Tests pass locally, fail in CI

Missing browser install and missing env vars are the two causes that account for 90% of these failures.

.github/workflows/playwright.yml
# .github/workflows/playwright.yml
- name: Install Playwright browsers
  run: npx playwright install --with-deps

- name: Run tests
  run: npx playwright test --trace on
  env:
    STOREDEMO_EMAIL: ${{ secrets.STOREDEMO_EMAIL }}
    STOREDEMO_PASSWORD: ${{ secrets.STOREDEMO_PASSWORD }}

- name: Upload report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: playwright-report
    path: playwright-report/

Always run with --trace on in CI. When a test fails, the trace gives you DOM snapshots, network requests, and console logs from the moment of failure without needing to reproduce locally.

Common Mistake: Adding page.waitForTimeout(2000) to fix a CI timing failure. That masks the real problem. Use element.waitFor({ state: 'visible' }): the test waits exactly as long as it needs to and fails fast when the element genuinely doesn't appear.

See the Playwright parallel execution guide for sharding across multiple CI machines.

WebKit passes locally, fails in CI

WebKit has stricter timing for CSS animations. Replace waitForLoadState with element-level waits:

payment-form.spec.ts
// Replace this
await page.waitForLoadState('networkidle');

// With this
await page.locator('[data-testid="payment-form"]').waitFor({ state: 'visible' });

Pre-merge checklist

checklist
[ ] Passes locally: npx playwright test path/to/spec.ts --headed
[ ] Locators are semantic: no CSS classes, no IDs
[ ] No hardcoded credentials: all from .env or fixtures
[ ] No page.waitForTimeout() in the file
[ ] Passes in isolation: no dependency on other tests running first
[ ] Runs in CI with --trace on

Run and report with TestDino

Once your suite grows past a handful of specs, HTML reports stop being useful. You need to know what broke, when it started breaking, and whether it's recurring.

terminal
npm install @testdino/playwright
npx tdpw test --token "your_token_here"

Terminal output:

terminal
Running 24 tests using 4 workers

  ✓  tests/auth/login-flow.spec.ts (1.8s)
  ✗  tests/checkout/payment-form.spec.ts (4.2s)
  ...

  22 passed, 1 failed, 1 flaky (8.4s)
Dashboard: https://app.testdino.com/runs/run-abc123

Results stream in real time. You see failures as they happen, not after the suite finishes. The TestDino dashboard gives you error grouping (50 duplicate failures shown as 1 pattern), the embedded trace viewer (no downloading zip files), and flaky test detection with run-over-run confidence scores.

For CI:

.github/workflows/playwright.yml
- name: Upload to TestDino
  if: always()
  run: npx tdpw upload ./playwright-report --token="${{ secrets.TESTDINO_TOKEN }}" --upload-html

Every Playwright CLI flag works unchanged with tdpw.
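Since tdpw passes Playwright flags through unchanged (per the vendor's claim above), commands like these should work as expected; the flags shown are standard Playwright CLI flags:

```shell
# Standard Playwright flags pass straight through tdpw
npx tdpw test --project=chromium --grep @smoke
npx tdpw test tests/checkout/ --workers=2 --trace on
```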

Fix flaky tests: TestDino MCP + Cursor Healer

Playwright 1.56 introduced the Healer agent, which repairs failing tests. Its blind spot: it only sees the current UI state, not whether a test has been failing intermittently for 3 weeks or only on WebKit. TestDino MCP fills that gap.

Add TestDino MCP to .cursor/mcp.json:

.cursor/mcp.json
{
  "mcpServers": {
    "playwright": { "command": "npx", "args": ["@playwright/mcp@latest"] },
    "TestDino": {
      "command""npx",
      "args": ["-y""testdino-mcp"],
      "env": { "TESTDINO_PAT""your-token-here" }
    }
  }
}

The loop:

flaky-fix-workflow
1. CI reports a failure. TestDino classifies it: "Flaky, 85% confidence"

2. In Cursor:
   "Using TestDino MCP, show me the last 20 runs of payment-form.spec.ts.
    Failure rate, which browsers, and the recurring error."

3. TestDino MCP returns:
   "Fails on WebKit 6/20 runs. Error: element not visible.
    Payment form slide-in animation timing."

4. Feed to Healer:
   "Fix payment-form.spec.ts. Flaky on WebKit: payment form
    animation causes element-not-visible. Use Playwright MCP
    to inspect WebKit and add the correct wait. No waitForTimeout."

5. Healer opens WebKit, finds the animation, adds
   waitFor({ state: 'visible' }), reruns until stable, returns diff.

6. Review. Commit.

Without historical data the Healer guesses. With it, fixes target the actual root cause. More in fixing flaky tests with Playwright MCP.

Cursor vs Claude Code vs Copilot vs Windsurf

| Feature | Cursor | Claude Code | GitHub Copilot | Windsurf |
| --- | --- | --- | --- | --- |
| Playwright MCP | Yes, native | Yes, deep per-agent | Yes, via VS Code | Yes, marketplace |
| Multi-model | Yes | Anthropic only | OpenAI primarily | Limited |
| Rules file | .cursorrules | CLAUDE.md | copilot-instructions.md | Cascade rules |
| Custom commands | .cursor/commands/ | Slash commands | Limited | Limited |
| Tab completion | Yes (Supermaven) | No | Yes | Yes |
| TestDino MCP | Yes | Yes | Via extension | Limited |
| Best for | Interactive IDE, debugging | Terminal, large refactors | Teams already on VS Code | Guided flows |

Claude Code with Playwright is stronger for terminal-driven workflows and large codebases. Copilot works if you're already in VS Code and don't need the MCP browser-control layer.

FAQ

What is Cursor with Playwright?
The Cursor editor connected to Playwright through MCP, giving AI agents real browser control. Tests are generated from live DOM state, not training data guesses.
When to use MCP vs Codegen?
Codegen records your manual actions, costs 0 AI tokens, and is fastest for flows you can click through yourself. MCP is for flows where the AI needs to reason about the page: complex auth, dynamic state, anything you can't easily record by hand.
Should I download .cursorrules from GitHub?
No. Generic rules don't fit your project, and writing your own teaches you how your AI editor actually responds to instructions. Start with the foundation in this guide and add rules from real mistakes.
Why does .cursorrules not conflict with Playwright Skills?
They operate at different layers. .cursorrules is a hard constraint. The AI treats rules as mandatory. Skills are reference knowledge the AI draws from. Rules win if they conflict.
Why does a generated test fail on WebKit but pass on Chromium?
WebKit has stricter timing for CSS animations. Replace waitForLoadState with element.waitFor({ state: 'visible' }). Run with --project=webkit locally to reproduce.
Does Playwright MCP run headless?
Add "--headless" to your args in mcp.json. For debugging sessions, headed mode is better: you can watch what the AI is doing in real time.
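In config form, that change looks like this (same structure as the setup in Step 1):

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest", "--headless"]
    }
  }
}
```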
Dhruv Rai

Product & Growth Engineer

Dhruv Rai is a Product and Growth Engineer at TestDino, focusing on developer automation and product workflows. His work involves building solutions around Playwright, CI/CD, and developer tooling to improve release reliability.

He contributes through technical content and product initiatives that help engineering teams adopt modern testing practices and make informed tooling decisions.
