Best Playwright CI/CD Integrations: GitHub Actions, Jenkins, and GitLab CI
Set up Playwright CI/CD to run tests automatically on every commit with clear PR feedback and artifacts. Here’s how to integrate Playwright with GitHub Actions, Jenkins, and GitLab CI.
Running Playwright tests locally is easy. Running them automatically on every commit, across branches, with proper reporting? That's where things break down.
A solid Playwright CI/CD integration means your tests fire on every push, catch regressions before production, and give your team clear feedback on each pull request. This guide covers how to set up Playwright in the three most popular CI systems: GitHub Actions, Jenkins, and GitLab CI.
You'll walk away with:
- Working YAML and pipeline examples you can copy into your project
- A side-by-side comparison of each CI platform for Playwright
- Practical tips for scaling Playwright test automation in CI
- Strategies for tracking failures across builds with centralized reporting
Whether you're moving tests off your local machine for the first time or fixing a pipeline that keeps breaking, this article walks through what actually works.
What is CI/CD, and why does it matter for Playwright testing?
Definition: CI/CD stands for Continuous Integration and Continuous Delivery. CI automatically builds and tests your code every time someone pushes a change. CD takes it further by automating deployment to staging or production environments.
For Playwright teams, CI/CD solves a specific problem. Without it, someone has to manually run npx playwright test before every merge. That works when you have 10 tests. It falls apart when you have 200.
Here's what a CI/CD pipeline does for your Playwright tests:
- Automatic triggers. Tests run on every push or pull request. No one forgets to run them.
- Consistent environment. CI runners use the same OS, browser versions, and dependencies every time. No more "works on my machine."
- Fast feedback. Developers see pass/fail results in the PR within minutes, not after merging.
- Artifact collection. Screenshots, traces, and videos get saved automatically when tests fail.
- Parallel execution. Split your suite across multiple machines to cut run time from 30 minutes to under 5.
Without CI, test failures get caught late, regressions slip through, and your team loses trust in the test suite. With CI, every change is validated before it reaches the main branch.
Quick overview: GitHub Actions, Jenkins, and GitLab CI
Before jumping into setup, here's a fast look at each CI platform and what makes it different.
| | GitHub Actions | Jenkins | GitLab CI |
|---|---|---|---|
| What it is | GitHub's built-in CI/CD | Open-source automation server | GitLab's built-in CI/CD |
| Config file | `.github/workflows/*.yml` | `Jenkinsfile` (Groovy DSL) | `.gitlab-ci.yml` |
| Hosting | Cloud (GitHub-managed runners) | Self-hosted (you manage it) | Cloud or self-hosted |
| Setup effort | Low | High | Low–Medium |
| Best for | Teams on GitHub | Enterprise / custom infra | Teams on GitLab |
All three run Playwright the same way under the hood: install Node, install browsers, run npx playwright test, collect artifacts. The difference is in how they're configured and managed.
How do you run Playwright tests in GitHub Actions, Jenkins, and GitLab CI?
You run Playwright tests in CI by adding a pipeline config file that installs dependencies, sets up browsers, and executes npx playwright test on every push or pull request.
The core steps are identical across platforms:
- Check out the code
- Install Node.js and project dependencies
- Install Playwright browsers and OS-level libraries
- Run `npx playwright test`
- Upload artifacts (reports, screenshots, traces)
The only difference? Where the config file lives and how each platform handles runners.
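Stripped of platform-specific syntax, those steps boil down to a few commands. A minimal sketch, assuming a standard Node.js project with Playwright already in `package.json`:

```bash
# Install exact dependency versions from the lockfile (faster and
# more reproducible than `npm install` in CI)
npm ci

# Download browser binaries plus the OS libraries they need on Linux
npx playwright install --with-deps

# Run the suite; a non-zero exit code fails the CI job
npx playwright test
```

Every pipeline below is essentially these three commands wrapped in that platform's configuration format.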
GitHub Actions Setup
GitHub Actions is the most common choice for teams already on GitHub. The workflow file lives at .github/workflows/playwright.yml.
```yaml
name: Playwright Tests
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    timeout-minutes: 60
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
      - name: Install dependencies
        run: npm ci
      - name: Install Playwright browsers
        run: npx playwright install --with-deps
      - name: Run Playwright tests
        run: npx playwright test
      - name: Upload test artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 30
```
A few things to know:
- `timeout-minutes: 60` prevents runaway jobs from eating your quota
- `--with-deps` installs OS libraries (like libgtk) that Playwright browsers need on Linux
- `if: always()` on the artifact upload means you get reports even when tests fail
Tip: Always use `if: always()` on the artifact upload step. Without it, failed test reports won't get uploaded, which is exactly when you need them most.
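If you install browsers manually instead of using Docker, caching can shave a minute or two off each run. A hedged sketch using `actions/cache`, assuming a Linux runner (where Playwright stores browsers under `~/.cache/ms-playwright`):

```yaml
# Restore npm and browser caches keyed on the lockfile;
# a changed lockfile invalidates the cache automatically
- uses: actions/cache@v4
  with:
    path: |
      ~/.npm
      ~/.cache/ms-playwright
    key: ${{ runner.os }}-playwright-${{ hashFiles('package-lock.json') }}
- name: Install dependencies
  run: npm ci
- name: Install Playwright browsers
  run: npx playwright install --with-deps
```

On a cache hit, the browser install step completes almost instantly because the binaries are already present.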
Jenkins Setup
Jenkins gives you more control but requires more setup. You define the pipeline in a Jenkinsfile, and Jenkins needs Docker or a Node.js agent with browser dependencies pre-installed.
```groovy
// Jenkinsfile
pipeline {
    agent {
        docker {
            image 'mcr.microsoft.com/playwright:v1.52.0-noble'
        }
    }
    stages {
        stage('Install') {
            steps {
                sh 'npm ci'
            }
        }
        stage('Test') {
            steps {
                sh 'npx playwright test'
            }
        }
    }
    post {
        always {
            archiveArtifacts artifacts: 'playwright-report/**', allowEmptyArchive: true
            publishHTML([reportDir: 'playwright-report', reportFiles: 'index.html', reportName: 'Playwright Report'])
        }
    }
}
```
Tip: Use Playwright's official Docker image (mcr.microsoft.com/playwright) as your Jenkins agent. It ships with all browsers and system dependencies pre-installed.
GitLab CI Setup
GitLab CI uses a .gitlab-ci.yml file at the root of your repo.
```yaml
stages:
  - test

playwright-tests:
  stage: test
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  script:
    - npm ci
    - npx playwright test
  artifacts:
    when: always
    paths:
      - playwright-report/
    expire_in: 1 week
```
GitLab's built-in Docker support makes this clean. The Playwright Docker image handles all browser dependencies, and artifacts are stored in GitLab for direct access from the merge request page.
Tip: Set `when: always` on artifacts in GitLab CI. Failed test reports are more valuable than passing ones.
Which CI/CD platform works best for Playwright automation?
There's no single best platform. The right choice depends on where your code lives, how much infrastructure you want to manage, and what your team already uses.
Full Comparison Table
| Feature | GitHub Actions | Jenkins | GitLab CI |
|---|---|---|---|
| Docker support | Container jobs | Docker agents + plugin | Native Docker runner |
| Parallelism | Matrix strategy | Parallel stages + nodes | parallel keyword |
| Sharding | Matrix + merge job | Manual shard config | Built-in CI_NODE_INDEX |
| Artifact storage | GitHub Artifacts | Jenkins workspace | GitLab Artifacts |
| Free tier | 2,000 min/month (public repos unlimited) | Free (self-hosted) | 400 min/month |
| PR integration | Native check runs | Plugin-based | Native MR integration |
| Learning curve | Low | Medium–High | Low–Medium |
GitHub Actions
Strengths:
- Clean YAML syntax, easy to read and write
- Matrix strategies make parallelism simple
- Massive ecosystem of reusable actions
- Playwright's own docs use GitHub Actions as the primary CI example
Trade-offs:
- Cost adds up at scale with sharded suites across multiple runners
- Locked into GitHub's runner infrastructure unless you set up self-hosted runners
Jenkins
Strengths:
- Full control over infrastructure and configuration
- Plugins for virtually any integration
- Works behind corporate firewalls and in regulated environments
Trade-offs:
- You maintain the server, agents, and browser dependencies
- Jenkinsfile syntax is more verbose than YAML alternatives
- No managed runners; everything is on you
GitLab CI
Strengths:
- Everything under one roof: CI, merge requests, container registry, artifacts
- The `parallel` keyword makes Playwright sharding dead simple
- Native Docker runner support
Trade-offs:
- Free tier is 400 minutes/month, which runs out fast with browser tests
- Self-hosted runners solve the cost problem but add operational overhead
Note: All three platforms support Playwright's official Docker image (mcr.microsoft.com/playwright). Using Docker is the recommended approach regardless of which CI you choose. It guarantees consistent browser versions and eliminates dependency issues.
Configuring Playwright for CI (Step by Step)
Before writing any CI config, your Playwright setup needs to be CI-ready. Local configs often break in CI because of missing dependencies, hardcoded URLs, or tests that assume a running dev server.
Step 1 - Update playwright.config.ts for CI
Your config should detect CI and adjust settings automatically:
```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  reporter: [
    ['html'],
    ['junit', { outputFile: 'test-results/results.xml' }]
  ],
  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure'
  },
});
```
Here's what each CI setting does:
| Setting | CI Value | Why It Matters |
|---|---|---|
| `forbidOnly` | `true` | Blocks `.only()` from sneaking into CI and running just one test |
| `retries` | `2` | Catches flaky failures without manual reruns |
| `workers` | `1` | Prevents resource contention on shared CI runners |
| `trace` | `on-first-retry` | Captures trace data on first failure without bloating storage |
| `screenshot` | `only-on-failure` | Saves disk space while keeping what you need for debugging |
| `video` | `retain-on-failure` | Records video but only stores it when tests fail |
Tip: Start with workers: 1 in CI. Once your suite is stable, bump to 2, then 4. Going straight to high parallelism often introduces flakiness that wastes more time than it saves.
Step 2 - Install Browsers and System Dependencies
Playwright browsers need system libraries (GTK, NSS, ALSA) that aren't on default CI images. Two options:
Option A: Playwright's Docker image (recommended)
```yaml
image: mcr.microsoft.com/playwright:v1.52.0-noble
```
Option B: Manual install
```bash
npx playwright install --with-deps
```
Docker is better for reproducibility. It pins exact browser versions and includes all system libraries. Manual install works but depends on your CI runner's base OS.
Step 3 - Handle Secrets and Environment Variables
Never hardcode URLs or credentials. Use environment variables.
| Platform | Where to Store Secrets |
|---|---|
| GitHub Actions | Settings > Secrets and variables > Actions |
| Jenkins | Credentials plugin |
| GitLab | Settings > CI/CD > Variables |
Reference them in your pipeline:
```yaml
# GitHub Actions example
env:
  BASE_URL: ${{ secrets.STAGING_URL }}
  TEST_USER: ${{ secrets.TEST_USER }}
```
Step 4 - Upload Artifacts for Debugging
When tests fail in CI, you need evidence. Configure your pipeline to always upload:
- HTML report - the full interactive Playwright report
- Screenshots - captured on failure
- Traces - for deep debugging with Playwright Trace Viewer
- Videos - if enabled in your config
Without artifacts, you're stuck re-running tests locally to reproduce failures. That's painful, especially for flaky tests that pass inconsistently.
Running Playwright Tests in Parallel Across CI
Serial execution is fine for small suites. Once you hit 50+ tests, it becomes a bottleneck. Playwright supports two levels of parallelism:
- Workers - run multiple tests in parallel on the same machine
- Sharding - split the suite across multiple CI machines
Workers vs Sharding
| Approach | How It Works | When to Use |
|---|---|---|
| Workers | Parallel processes on one machine | Suite runs under 10 min |
| Sharding | Distributes tests across machines | Suite runs over 10 min |
| Both | Shards + workers per shard | Large suites, 100+ tests |
GitHub Actions: Matrix Sharding
```yaml
jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        shard: [1/4, 2/4, 3/4, 4/4]
    steps:
      # ... setup steps
      - name: Run tests
        run: npx playwright test --shard=${{ matrix.shard }}
```
GitLab CI: Built-in Sharding

```yaml
playwright-tests:
  stage: test
  image: mcr.microsoft.com/playwright:v1.52.0-noble
  parallel: 4
  script:
    - npm ci
    - npx playwright test --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
```
Jenkins: Parallel Stages

```groovy
stage('Parallel Tests') {
    parallel {
        stage('Shard 1') {
            steps { sh 'npx playwright test --shard=1/3' }
        }
        stage('Shard 2') {
            steps { sh 'npx playwright test --shard=2/3' }
        }
        stage('Shard 3') {
            steps { sh 'npx playwright test --shard=3/3' }
        }
    }
}
```
Tip: After sharding, you need to merge reports. Use Playwright's `merge-reports` command to combine blob reports from each shard into one HTML report. Without this, you'll have fragmented results.
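The merge flow, roughly: each shard emits a blob report and uploads it, then a final job downloads them all and merges. A GitHub Actions sketch based on Playwright's blob reporter; the artifact names and `merge-reports` job name here are illustrative:

```yaml
# In each shard job: write a blob report and upload it as an artifact
- run: npx playwright test --shard=${{ matrix.shard }} --reporter=blob
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: blob-report-${{ strategy.job-index }}
    path: blob-report/

# Separate job that waits for every shard, then merges the blobs
merge-reports:
  needs: [test]
  if: always()
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: npm ci
    - uses: actions/download-artifact@v4
      with:
        path: all-blob-reports
        pattern: blob-report-*
        merge-multiple: true
    - run: npx playwright merge-reports --reporter html ./all-blob-reports
```

The merged `playwright-report/` can then be uploaded as a single artifact, just like in the non-sharded workflow.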
Tracking Playwright CI Failures at Scale
Setting up CI is the easy part. The hard part? Figuring out what went wrong when tests fail across hundreds of tests, multiple branches, and different environments.
The Reporting Problem
Default Playwright CI output gives you pass/fail counts and error logs. That's fine for 20 tests. As your suite grows, you hit these walls:
- Noisy logs. The actual failure is buried in hundreds of lines of output.
- Disappearing artifacts. Most CI platforms delete them after a few days. Historical comparison gone.
- Hidden flaky tests. A test failing 10% of the time doesn't look "broken." It just adds confusion.
- No cross-run context. Is this failure new? Did it happen on main too? One CI run can't tell you.
What Good CI Reporting Looks Like
It should answer four questions instantly:
- Which tests failed and why?
- Is this a new failure or a known flaky test?
- Which branch and commit introduced it?
- Is this failure environment-specific?
Teams that ship fast don't dig through CI logs. They have dashboards that surface answers automatically.
For teams running Playwright across multiple CI providers, TestDino centralizes results from GitHub Actions into a single reporting dashboard. The AI classifies each failure as Bug, Flaky, or UI Change with confidence scores, so you know where to start before opening a trace.
It also posts PR comments on GitHub with failure summaries. Reviewers see test health without leaving the pull request.
Tip: If you spend more than 15 minutes per day triaging CI failures, you've outgrown default reporting. Centralized reporting with failure classification pays for itself in engineering hours.
CI Pipeline Best Practices for Playwright
These patterns work consistently across teams scaling Playwright in CI.
Speed
- Cache `node_modules` and Playwright browsers. Browser downloads are 200-400MB. Caching saves 1-2 minutes per run.
- Use Playwright's Docker image. Skips the browser install entirely.
- Run only affected tests on PRs. Use test tags or file-path matching to select the relevant tests.
- Shard when serial time exceeds 10 minutes. Sharding has overhead, so it's not worth it for fast suites.
Stability
- Pin browser versions. Use a specific Docker tag (`v1.52.0-noble`), not `latest`. Prevents surprise browser updates.
- Set `forbidOnly: true` in CI. One `.only()` in the codebase means CI runs exactly one test.
- Keep retries at 2. More than 3 hides real problems.
- Start with low worker counts. Resource contention causes more flakiness than slow tests.
Debugging
- Always upload artifacts. Screenshots, traces, and videos for every failed test.
- Enable traces on first retry. `on-first-retry` captures a full trace only when a test fails. Good balance of data and storage.
- Use JUnit reporters. Many CI platforms parse JUnit XML natively and show results in the UI.
Reporting
- Don't rely on CI logs alone. At scale, you need a tool that tracks trends, groups failures, and shows history.
- Track flaky test rates. A flaky rate above 2% means trust in CI is eroding. Measure it, fix it.
- Connect to your issue tracker. Creating a Jira or Linear ticket from a CI failure should take one click, not five minutes of copy-pasting.
Conclusion
Running Playwright tests in CI isn't complicated. Pick the platform that matches your source control, add a config file, and your tests run on every push.
The real challenge is at scale. Hundreds of tests, multiple branches, different environments. Default CI output stops being useful. Failures stack up, flaky tests hide in the noise, and your team reads logs instead of fixing bugs.
Get the pipeline right first: Docker images, proper artifact collection, smart parallelism. Then invest in reporting that gives your team fast, clear answers.
TestDino is built for Playwright teams that need centralized CI reporting with AI failure analysis. You can try the sandbox or check the docs to see how it fits your workflow.