Playwright Skill Claude Code: 82 E2E Tests for E-commerce
Generate, run, debug, and ship Playwright E2E tests for e-commerce using an open-source Skill in Claude Code. Integrate TestDino reporting for AI insights.
What is a Skill?
A Skill is a structured knowledge base, a curated collection of markdown guides, patterns, best practices, and ready-to-use code examples, packaged in a way that AI coding agents can read and apply.
It is not a library you import into your code. It is not a framework you install. It is teaching material for AI agents. When you give a Skill to an AI agent, you are essentially handing it a handbook that says: "This is how we write tests. These are our patterns. These are the mistakes to avoid. Follow this."
Without a Skill, an AI agent relies on its general training data and produces generic output. With a Skill, it produces output that matches production-grade standards from the very first attempt.
What is Claude Code?
Claude Code is Anthropic's official CLI-based AI coding agent. You open it in your terminal, describe what you want in plain English, and it does the work. It reads files, writes code, runs commands, sees errors, debugs them, and iterates until the task is done.
What makes it relevant here is that Claude Code can read Skill files placed in your project directory. When you ask it to write Playwright tests, and a Playwright Skill is available in the project, Claude Code reads the relevant guides from that Skill and generates code that follows those exact patterns.
You describe the task. The Skill provides the knowledge. Claude Code does the execution. That is the workflow.
The Power Trio: Skill + CLI + Claude Code
Most developers think of AI code generation as a one-shot process: you give it a prompt, it gives you code, you copy-paste it, and you hope it works. That workflow is fragile. It breaks the moment something unexpected happens.
The Skill + CLI + Claude Code workflow is fundamentally different. Here is why.
The Skill provides domain expertise. It contains 70+ guides covering every Playwright testing scenario, each guide explaining when to use a pattern, when not to use it, and providing working code in both TypeScript and JavaScript. This is the "brain" of the operation.
Claude Code (the CLI agent) does the actual work. It reads your project files, reads the Skill, explores the target website, generates test code, runs the tests, reads the error output, and fixes failures. All inside your terminal, all in one session. This is the "hands" of the operation.
The combination creates a feedback loop that solo prompting cannot achieve:
You describe the task
→ Claude Code reads the Skill for the right patterns
→ Generates tests based on those patterns
→ Runs the tests
→ Reads failures and error output
→ Fixes the code
→ Runs again until passing
This is not theoretical. We did exactly this, and the rest of this article walks through every step.
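The loop above can be sketched in a few lines of TypeScript. This is only an illustration of the control flow, not how Claude Code is implemented: `runFeedbackLoop`, `TestRun`, and the toy runner and fixer in the usage example are all hypothetical names invented for this sketch.

```typescript
// Hypothetical shapes: none of these names come from Claude Code or Playwright.
interface TestRun {
  passed: boolean;
  errors: string[];
}

interface LoopResult {
  attempts: number;
  passed: boolean;
  code: string;
}

// Run the tests, feed the errors back into a fixer, and repeat until the
// tests pass or the attempt budget is used up.
function runFeedbackLoop(
  initialCode: string,
  runTests: (code: string) => TestRun,
  fix: (code: string, errors: string[]) => string,
  maxAttempts: number = 5,
): LoopResult {
  let code = initialCode;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const run = runTests(code);
    if (run.passed) {
      return { attempts: attempt, passed: true, code };
    }
    code = fix(code, run.errors);
  }
  return { attempts: maxAttempts, passed: false, code };
}

// Toy usage: the "test run" passes once the code uses getByRole,
// and the "fix" swaps a tag selector for a role-based one.
const demo = runFeedbackLoop(
  "page.locator('header')",
  (code) =>
    code.indexOf('getByRole') !== -1
      ? { passed: true, errors: [] }
      : { passed: false, errors: ['element(s) not found'] },
  (code) => code.replace("locator('header')", "getByRole('banner')"),
);
```

The attempt budget is the one design choice worth noting: without it, a fixer that never converges would loop forever.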
The Playwright Skill by TestDino
The Skill we used is the Playwright Skill, an open-source project maintained by TestDino and available on GitHub:
Repository: github.com/testdino-hq/playwright-skill
It is the most comprehensive collection of Playwright testing guides available today: 70+ guides organized into 5 skill packs, covering everything from basic locators to Electron app testing to CI/CD pipeline configuration.
How to Install the Skill
You can install it with a single command using the skills CLI:
Install the entire Skill (all 5 packs):
npx skills add testdino-hq/playwright-skill
Or install individual packs based on what you need:
npx skills add testdino-hq/playwright-skill/core # 46 guides: locators, assertions, auth, mocking, debugging
npx skills add testdino-hq/playwright-skill/ci # 9 guides: GitHub Actions, GitLab, Docker, sharding
npx skills add testdino-hq/playwright-skill/pom # 2 guides: Page Object Model patterns
npx skills add testdino-hq/playwright-skill/migration # 2 guides: from Cypress, from Selenium
npx skills add testdino-hq/playwright-skill/playwright-cli # 11 guides: CLI browser automation
Once installed, the Skill files live inside your project directory. Claude Code (or any other AI agent that reads project files) will automatically pick them up.
What is Inside the Skill
The repository is organized into 5 packs:
playwright-skill/
├── SKILL.md ← Entry point: 10 Golden Rules + full guide index
├── core/ ← 46 guides: the foundation of Playwright testing
├── ci/ ← 9 guides: CI/CD pipelines for every platform
├── pom/ ← 2 guides: Page Object Model patterns
├── migration/ ← 2 guides: switching from Cypress or Selenium
└── playwright-cli/ ← 11 guides: CLI-based browser automation
The SKILL.md is the main entry point. It contains 10 Golden Rules that every guide follows:
- getByRole() over CSS/XPath, always
- Never use page.waitForTimeout()
- Web-first assertions that auto-retry
- Isolate every test, no shared state
- baseURL in config, zero hardcoded URLs
- Retries: 2 in CI, 0 locally
- Traces: 'on-first-retry'
- Fixtures over globals
- One behavior per test
- Mock external services only, never mock your own app
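Three of these rules (baseURL, retries, traces) map directly onto Playwright configuration. The fragment below is a minimal sketch of how they might look in a playwright.config.ts. It is not copied from the Skill, and the `http://localhost:3000` base URL and `./tests` directory are placeholders.

```typescript
// playwright.config.ts (illustrative sketch, not taken from the Skill)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Placeholder test directory.
  testDir: './tests',

  // "Retries: 2 in CI, 0 locally"
  retries: process.env.CI ? 2 : 0,

  use: {
    // "baseURL in config, zero hardcoded URLs": placeholder URL.
    // Tests can then navigate with relative paths such as page.goto('/').
    baseURL: 'http://localhost:3000',

    // "Traces: 'on-first-retry'": record a trace only when a test is retried.
    trace: 'on-first-retry',
  },
});
```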
Every guide in the Skill follows a consistent structure:
- "When to use" tells you the exact scenario where this pattern applies
- "Avoid when" warns you about anti-patterns
- Quick Reference provides copy-paste code for immediate use
- Patterns walk through full real-world implementations
- Both TypeScript and JavaScript examples are included in every guide
Every Test Type This Skill Can Help You Create
This is the complete list. If a testing scenario exists in Playwright, there is a guide for it here.
Core Testing (46 Guides)
Fundamentals
| Test Type | Guide | What You Will Learn |
|---|---|---|
| Selector-based element finding | core/locators.md + core/locator-strategy.md | The priority order: getByRole, getByLabel, getByText, getByTestId, CSS as last resort |
| Assertions and auto-waiting | core/assertions-and-waiting.md | Web-first assertions that retry vs. snapshot assertions that do not |
| Test suite organization | core/test-organization.md + core/test-architecture.md | File naming, grouping by feature, tagging, when to use E2E vs component vs API tests |
| Playwright configuration | core/configuration.md | Complete playwright.config.ts setup from scratch |
| Custom fixtures and hooks | core/fixtures-and-hooks.md | test.extend(), worker-scoped fixtures, setup/teardown lifecycle |
| Test data strategies | core/test-data-management.md | Factories, database seeding, cleanup between tests |
Authentication and Forms
| Test Type | Guide | What You Will Learn |
|---|---|---|
| Login and session management | core/authentication.md + core/auth-flows.md | Storage state reuse, API-based login, multi-role auth, OAuth, SSO, MFA |
| Form testing and validation | core/forms-and-validation.md | Input types, validation error messages, multi-step form wizards |
API and Network
| Test Type | Guide | What You Will Learn |
|---|---|---|
| REST and GraphQL API testing | core/api-testing.md | APIRequestContext for headless API tests, no browser needed |
| Network mocking and interception | core/network-mocking.md + core/when-to-mock.md | Route interception, HAR replay, when to mock vs. hit real services |
Visual and Accessibility
| Test Type | Guide | What You Will Learn |
|---|---|---|
| Visual regression testing | core/visual-regression.md | Screenshot comparison, pixel thresholds, masking dynamic content |
| Accessibility auditing | core/accessibility.md | axe-core integration, ARIA assertions, WCAG compliance checks |
UI Interactions
| Test Type | Guide | What You Will Learn |
|---|---|---|
| CRUD flow testing | core/crud-testing.md | Create, read, update, delete patterns end-to-end |
| Drag and drop | core/drag-and-drop.md | Native HTML5 drag-and-drop and custom implementations |
| Search, filter, and pagination | core/search-and-filter.md | Testing search bars, filter dropdowns, paginated lists, sort toggles |
| File uploads and downloads | core/file-operations.md + core/file-upload-download.md | File input handling, download verification, file type checks |
| Error states and edge cases | core/error-and-edge-cases.md | Empty states, network failures, timeouts, boundary conditions |
Specialized Browser Features
| Test Type | Guide | What You Will Learn |
|---|---|---|
| iframes and Shadow DOM | core/iframes-and-shadow-dom.md | Cross-origin iframes, piercing Shadow DOM boundaries |
| Multi-tab and popup windows | core/multi-context-and-popups.md | New tab handling, popup windows, multi-context tests |
| WebSockets and real-time UIs | core/websockets-and-realtime.md | WebSocket interception, SSE streams, live-updating dashboards |
| Browser APIs | core/browser-apis.md | Geolocation, clipboard, permissions, notifications |
| Canvas and WebGL | core/canvas-and-webgl.md | Testing canvas-rendered content and WebGL applications |
| Service workers and PWAs | core/service-workers-and-pwa.md | Offline mode, service worker interception, PWA install prompts |
| Clock and time mocking | core/clock-and-time-mocking.md | Fake timers, frozen dates, controlling Date.now() |
| Multi-user collaboration | core/multi-user-and-collaboration.md | Two browsers in one test for real-time collaboration testing |
| Internationalization (i18n) | core/i18n-and-localization.md | Locale switching, RTL layouts, translation verification |
| Mobile and responsive | core/mobile-and-responsive.md | Device emulation, viewport breakpoints, touch interactions |
Security and Performance
| Test Type | Guide | What You Will Learn |
|---|---|---|
| Security testing | core/security-testing.md | XSS injection checks, CSRF tokens, cookie flags, HTTP headers |
| Performance benchmarks | core/performance-testing.md | Core Web Vitals, Lighthouse integration, performance metrics |
Platform Testing
| Test Type | Guide | What You Will Learn |
|---|---|---|
| Electron desktop apps | core/electron-testing.md | Launching and testing Electron applications |
| Browser extensions | core/browser-extensions.md | Extension popup and background page testing |
| Third-party integrations | core/third-party-integrations.md | Stripe, Auth0, Firebase, and other external services |
Framework-Specific Recipes
| Framework | Guide |
|---|---|
| Next.js (App Router + Pages Router) | core/nextjs.md |
| React (CRA, Vite) | core/react.md |
| Vue 3 and Nuxt | core/vue.md |
| Angular | core/angular.md |
Debugging and Fixing
| Problem | Guide |
|---|---|
| Systematic debugging workflow | core/debugging.md (UI Mode, Trace Viewer, Inspector, page.pause()) |
| Error message lookup | core/error-index.md (error message to fix, quick reference) |
| Flaky and intermittent tests | core/flaky-tests.md (root causes and concrete stabilization patterns) |
| Common beginner mistakes | core/common-pitfalls.md (what to avoid and why) |
CI/CD and Infrastructure (9 Guides)
| Topic | Guide |
|---|---|
| GitHub Actions | ci/ci-github-actions.md |
| GitLab CI | ci/ci-gitlab.md |
| CircleCI, Azure DevOps, Jenkins | ci/ci-other.md |
| Parallel execution and sharding | ci/parallel-and-sharding.md |
| Docker and containers | ci/docker-and-containers.md |
| Reports and artifacts | ci/reporting-and-artifacts.md |
| Code coverage | ci/test-coverage.md |
| Global setup and teardown | ci/global-setup-teardown.md |
| Multi-project configuration | ci/projects-and-dependencies.md |
Playwright CLI (11 Guides)
| Topic | Guide |
|---|---|
| Core commands (open, click, fill, keyboard) | playwright-cli/core-commands.md |
| Request mocking from CLI | playwright-cli/request-mocking.md |
| Running custom Playwright code | playwright-cli/running-custom-code.md |
| Session management | playwright-cli/session-management.md |
| Storage and auth state | playwright-cli/storage-and-auth.md |
| Test generation from interactions | playwright-cli/test-generation.md |
| Tracing and debugging | playwright-cli/tracing-and-debugging.md |
| Screenshots and media | playwright-cli/screenshots-and-media.md |
| Device emulation | playwright-cli/device-emulation.md |
| Advanced workflows | playwright-cli/advanced-workflows.md |
Page Object Model (2 Guides)
| Topic | Guide |
|---|---|
| POM implementation patterns | pom/page-object-model.md |
| POM vs fixtures vs helpers decision guide | pom/pom-vs-fixtures-vs-helpers.md |
Migration (2 Guides)
| From | Guide |
|---|---|
| Cypress | migration/from-cypress.md (command-by-command mapping, 5 mindset shifts) |
| Selenium / WebDriver | migration/from-selenium.md (API mapping, eliminates explicit waits) |
That is 70+ guides covering every Playwright testing scenario you will encounter in production.
Practical Example: E2E Testing on a Live Store
Knowing what the Skill contains is one thing. Seeing it in action is another. We took the Skill, loaded it into Claude Code, and ran a complete E2E testing workflow against a live e-commerce website. Here is exactly what happened.
Step 1: Understanding the Target Website
The target was storedemo.testdino.com, a live demo e-commerce store selling electronics and gadgets. It is a JavaScript single-page application with 24 pages:
- Homepage with hero banner, product categories (Audio & Camera, Appliances, Gadgets, PC & Laptops), and a featured product carousel
- Products page with a grid listing of all 14 products
- 14 individual product detail pages for items ranging from $15 (SanDisk USB Drive) to $1,560 (Apple iPad)
- Static pages including About Us, Contact Us, and FAQ
- Policy pages including Shipping, Return, Cancellation, Privacy, and Terms of Service
- Transactional pages including Cart, Checkout, and User Account
We discovered all of this by examining the sitemap at /sitemap.xml and the robots.txt file, which also revealed hidden admin and API routes.
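As a rough illustration of that discovery step, the sketch below pulls the page URLs out of a sitemap document. The `extractSitemapUrls` helper and the sample XML are made up for this example. The sample uses example.com and is not the store's real sitemap.

```typescript
// Extract every <loc> entry from a sitemap XML string.
// A regex is enough for a sketch; a real crawler would use an XML parser.
function extractSitemapUrls(xml: string): string[] {
  const urls: string[] = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Made-up sample input, not the actual sitemap of the demo store.
const sampleSitemap = `
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products</loc></url>
  <url><loc> https://example.com/faq </loc></url>
</urlset>`;

const discoveredUrls = extractSitemapUrls(sampleSitemap);
```

Inside a Playwright test, the XML string could come from the built-in `request` fixture, for example `await (await request.get('/sitemap.xml')).text()`.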
Step 2: Asking Claude Code to Generate Tests
With the Playwright Skill installed in the project, we opened Claude Code and gave it a straightforward instruction:
Create a new folder with E2E tests for storedemo.testdino.com
That was the entire prompt. No additional configuration. No specifying which patterns to use. No listing out test cases manually.
Claude Code then did something important: it read the relevant guides from the Skill before writing a single line of code. Specifically, it pulled patterns from core/visual-regression.md, core/locators.md, core/assertions-and-waiting.md, and core/configuration.md. This is the Skill doing its job. The AI did not guess. It followed documented patterns.
Step 3: What Claude Code Generated
Within about a minute, Claude Code had created a complete project from scratch:
visual-testing/
├── package.json # Dependencies and npm scripts
├── playwright.config.ts # Full configuration with 3 viewport projects
├── tsconfig.json # TypeScript configuration
└── tests/
├── homepage.visual.spec.ts # 6 tests
├── products-listing.visual.spec.ts # 5 tests
├── product-detail.visual.spec.ts # 20 tests (parameterized across 4 products)
├── navigation.visual.spec.ts # 12 tests
├── static-pages.visual.spec.ts # 19 tests
├── responsive.visual.spec.ts # 15 tests
└── cart-and-checkout.visual.spec.ts # 5 tests

82 tests across 7 files, all generated without us specifying individual test cases.
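To make the arithmetic concrete: 20 product-detail tests across 4 products implies 5 checks per product, and the seven per-file counts add up to 82. The sketch below shows that expansion. Only the first two product names appear in this article. The other two names and all five check names are placeholders, and in a real spec each generated title would become one `test()` call inside the same nested loops.

```typescript
// Hypothetical data: 'Product C', 'Product D', and the check names are invented.
const products: string[] = ['SanDisk USB Drive', 'Apple iPad', 'Product C', 'Product D'];
const checks: string[] = ['full page', 'product image', 'price block', 'add-to-cart button', 'description'];

// Expand every (product, check) pair into one test title.
function buildCases(productNames: string[], checkNames: string[]): string[] {
  const titles: string[] = [];
  for (const product of productNames) {
    for (const check of checkNames) {
      titles.push(`${product}: ${check}`);
    }
  }
  return titles;
}

const productDetailCases = buildCases(products, checks);

// Per-file counts as listed in the project tree above.
const testsPerFile: Array<[string, number]> = [
  ['homepage.visual.spec.ts', 6],
  ['products-listing.visual.spec.ts', 5],
  ['product-detail.visual.spec.ts', productDetailCases.length],
  ['navigation.visual.spec.ts', 12],
  ['static-pages.visual.spec.ts', 19],
  ['responsive.visual.spec.ts', 15],
  ['cart-and-checkout.visual.spec.ts', 5],
];

const totalTests = testsPerFile.reduce((sum, entry) => sum + entry[1], 0);
```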
The Skill's influence was visible in every decision Claude Code made:
- It set animations: 'disabled' globally because the Skill's visual regression guidance warns that CSS animations cause flaky screenshot diffs
- It configured maxDiffPixelRatio: 0.01 because the Skill recommends allowing 1% pixel variance for anti-aliasing
- It used element-level screenshots for individual components (header, footer, product cards) instead of relying on full-page screenshots alone
- It added graceful fallback selectors because the Skill's best practices warn about unknown DOM structures on third-party sites
- It created three projects (desktop at 1440x900, mobile iPhone 14, tablet iPad) for cross-device coverage
None of this was in our prompt. All of it came from the Skill.
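For readers who want to picture the result, here is a minimal sketch of a playwright.config.ts with those settings. It is a reconstruction from the description above, not the generated file itself. The `desktop-chrome` project name matches the command used later in this article, while the `mobile` and `tablet` names, the `https` base URL, and the specific iPad descriptor are assumptions.

```typescript
// playwright.config.ts (reconstructed sketch, not the generated file)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',

  use: {
    // Assumed scheme; the article only names the host.
    baseURL: 'https://storedemo.testdino.com',
  },

  expect: {
    toHaveScreenshot: {
      // Freeze CSS animations and transitions before capturing.
      animations: 'disabled',
      // Allow up to 1% of pixels to differ (anti-aliasing noise).
      maxDiffPixelRatio: 0.01,
    },
  },

  projects: [
    {
      name: 'desktop-chrome',
      use: { ...devices['Desktop Chrome'], viewport: { width: 1440, height: 900 } },
    },
    // Hypothetical project names for the two device profiles.
    { name: 'mobile', use: { ...devices['iPhone 14'] } },
    { name: 'tablet', use: { ...devices['iPad (gen 7)'] } },
  ],
});
```

At 1440x900, a 1% ratio tolerates up to 12,960 differing pixels per full-viewport screenshot. Note that Playwright's iPhone and iPad descriptors default to WebKit, so running those two projects needs that browser installed as well.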
Step 4: Running the Tests for the First Time
We started with the homepage tests to validate the approach before running the full suite. The commands:
cd visual-testing
npm install
npx playwright install chromium
npx playwright test tests/homepage.visual.spec.ts --project=desktop-chrome --update-snapshots
The --update-snapshots flag matters on the first run. It tells Playwright to save the current screenshots as baseline images and pass, instead of failing each test because its baseline does not exist yet.
Result: 5 passed, 1 failed.
5 passed
1 failed
✗ header section
locator('header').first() - element(s) not found
Five tests generated their baseline screenshots successfully. One test, the header section, could not find the element it was looking for.
Step 5: Debugging the Failures
This is where the real value of the Skill + Claude Code workflow shows up. Instead of us manually investigating, Claude Code did the debugging itself.
What went wrong?
The generated tests assumed the site used semantic HTML tags like <header>, <footer>, and <nav>. This is standard practice for well-built websites. But this particular React SPA renders everything as generic <div> elements with no semantic tags.
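One way to guard against this kind of mismatch is a fallback helper that tries several candidate selectors in order. The sketch below is an assumption about what such a helper could look like, not the code Claude Code generated. `firstMatchingSelector` and both interfaces are hypothetical, and the interfaces are declared locally so the snippet compiles without Playwright installed (Playwright's `Page` and `Locator` satisfy them structurally).

```typescript
// Minimal structural types: only the two methods the helper needs.
interface CountableLocator {
  count(): Promise<number>;
}

interface LocatorSource {
  locator(selector: string): CountableLocator;
}

// Return the first candidate selector that matches at least one element,
// or null when none of them match.
async function firstMatchingSelector(
  page: LocatorSource,
  candidates: string[],
): Promise<string | null> {
  for (const selector of candidates) {
    const matches = await page.locator(selector).count();
    if (matches > 0) {
      return selector;
    }
  }
  return null;
}
```

With a real Playwright page this might be called as `await firstMatchingSelector(page, ['header', '[role="banner"]'])`, where the candidate list is a guess at the markup. Because `count()` does not wait for elements to appear, the page should be fully loaded before the helper runs.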
How Claude Code found the problem
First, it examined the failure screenshot that Playwright automatically captures when a test fails. The screenshot showed the page had loaded perfectly. The navigation bar, hero banner, product categories, everything was visible and correct.
The page was not broken. The selectors were wrong.
Next, Claude Code read the accessibility tree snapshot that Playwright dumps alongside the failure. This is a structured representation of the actual DOM elements on the page. It revealed the truth: