Playwright Tests in Azure DevOps: Complete Reporting Guide
Tired of downloading zip files just to see why a Playwright test failed? Here’s why Azure DevOps’s native reporting falls short and what to do about it.

Writing test cases and running them in continuous integration is not enough on its own. How visible the results are, how easily failures can be diagnosed, and how well test flakiness is tracked determine how fast your delivery pipeline moves.
When running Playwright in Azure DevOps, engineers often encounter a gap between executing tests and understanding why they failed.
The default setup provides basic test counts, leaving developers to dig through compressed build artifacts to debug failures.
We will review how to configure Playwright in Azure DevOps pipelines, analyze where native reporting falls short, and evaluate the tools available to build a highly visible, automated reporting setup.
Running Playwright tests in an Azure DevOps pipeline
Before configuring reports, we need a baseline pipeline that executes Playwright tests reliably. Azure Pipelines uses YAML files to define build and release workflows, running execution scripts on hosted or self-hosted agents.
To execute Playwright tests in your pipeline, you must configure a Node.js environment, install project dependencies, download the necessary browser binaries, and run the execution command.
Baseline pipeline configuration
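The baseline pipeline configuration handles environment setup and runs your test suite on every push to main. Pull request runs depend on where the repository lives: Azure Repos uses a build validation branch policy, while GitHub-hosted repositories can use a pr trigger.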
Here is a baseline YAML configuration for Azure Pipelines:
trigger:
- main
pool:
  vmImage: 'ubuntu-latest'
steps:
- task: NodeTool@0
  inputs:
    versionSpec: '20.x'
  displayName: 'Install Node.js'
- script: npm ci
  displayName: 'Install Dependencies'
- script: npx playwright install --with-deps
  displayName: 'Install Playwright Browsers'
- script: npx playwright test
  displayName: 'Run Playwright Tests'
  env:
    CI: 'true'
Installing browsers and dependencies
The npx playwright install --with-deps command is required to run tests on hosted agents. The --with-deps flag instructs Playwright to install the operating system dependencies required by Chromium, Firefox, and WebKit.
Without this flag, hosted Linux agents can fail to launch browsers because required shared libraries are missing, so the test run fails at startup. If you run tests inside a Docker container on self-hosted agents, you can skip browser installation by using the official Playwright Docker image as your container environment, with an image tag that matches your installed Playwright version.
How Azure DevOps reports Playwright results out of the box
Once your pipeline runs, Azure DevOps gives you two built-in ways to see what happened. You can publish a JUnit XML report to populate the build's Test summary, and you can upload the full Playwright HTML report as a pipeline artifact.
Configuring the JUnit reporter for the Tests tab
The native Tests tab in Azure DevOps displays pass, fail, and duration metrics. To feed this tab, Playwright must generate a JUnit XML file during execution.
First, add the JUnit reporter to your playwright.config.ts configuration:
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['line'],
    ['junit', { outputFile: 'results/results.xml' }],
  ],
});
Next, add the publish task to your azure-pipelines.yml file, ensuring it runs even if previous steps fail:
- task: PublishTestResults@2
  inputs:
    testResultsFormat: 'JUnit'
    testResultsFiles: '**/results.xml'
    searchFolder: '$(System.DefaultWorkingDirectory)/results'
    mergeTestResults: true
    failTaskOnFailedTests: true
  condition: succeededOrFailed()
  displayName: 'Publish Test Results to Tests Tab'
The condition: succeededOrFailed() setting ensures that Azure DevOps parses and displays test results when tests fail, which is the exact moment you need the reporting tab.
Publishing the Playwright HTML report as a pipeline artifact
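JUnit XML files contain test statuses and stack traces, but they lack rich media like screenshots, console logs, and trace recordings. To preserve these debugging assets, publish the Playwright HTML report folder. The report is only generated when the html reporter is enabled, so add an entry such as ['html', { open: 'never' }] to the reporter array in playwright.config.ts alongside junit. By default it writes to the playwright-report folder.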
Add the artifact upload task to your pipeline YAML file:
- task: PublishPipelineArtifact@1
  inputs:
    targetPath: '$(System.DefaultWorkingDirectory)/playwright-report'
    artifact: 'playwright-report'
    publishLocation: 'pipeline'
  condition: succeededOrFailed()
  displayName: 'Publish Playwright HTML Report'
This task uploads the HTML report folder as an artifact associated with the pipeline run. The Azure DevOps web UI then offers it as a zip download.
The local download loop: Visualizing failures
The combination of JUnit XML and HTML artifacts represents the default approach for most teams. However, this workflow introduces significant friction when a test fails.
Azure DevOps does not render the HTML report or play videos in the browser, so developers must download the artifact zip to their local machine. They then extract the folder and either serve the report locally (for example with npx playwright show-report) or upload the trace file to trace.playwright.dev to inspect the time-travel debug log.
This loop adds minutes of manual context switching to every single test failure, slowing down pull request reviews.
Running Playwright tests in parallel with sharding
As your test suite grows past a few hundred tests, a single pipeline agent will take too long to run. Playwright supports native test sharding, letting you split a suite across parallel agents.
Parallelizing with the matrix strategy
To run sharded tests in Azure DevOps, use the matrix strategy. This splits the execution across a specified number of virtual machines running in parallel.
strategy:
  matrix:
    shard1:
      SHARD_INDEX: 1
      SHARD_TOTAL: 3
    shard2:
      SHARD_INDEX: 2
      SHARD_TOTAL: 3
    shard3:
      SHARD_INDEX: 3
      SHARD_TOTAL: 3
  maxParallel: 3
steps:
- script: npx playwright test --shard=$(SHARD_INDEX)/$(SHARD_TOTAL)
  displayName: 'Run Playwright Shard $(SHARD_INDEX) of $(SHARD_TOTAL)'
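The matrix entries follow a regular pattern: shard indexes are 1-based and every entry repeats the same total. As a minimal sketch of that pattern, the helper below builds the same entries for any shard count. The names buildShardMatrix and shardArg are hypothetical and are not part of Playwright or Azure DevOps; only the --shard=index/total argument format comes from the pipeline above.

```typescript
// Hypothetical helpers for illustration; not a Playwright or Azure DevOps API.
interface ShardVars {
  SHARD_INDEX: number; // 1-based, as used by --shard
  SHARD_TOTAL: number;
}

// Builds one matrix entry per shard: shard1, shard2, ..., shardN.
function buildShardMatrix(total: number): Map<string, ShardVars> {
  if (!Number.isInteger(total) || total < 1) {
    throw new RangeError(`total must be a positive integer, got ${total}`);
  }
  const matrix = new Map<string, ShardVars>();
  for (let index = 1; index <= total; index++) {
    matrix.set(`shard${index}`, { SHARD_INDEX: index, SHARD_TOTAL: total });
  }
  return matrix;
}

// Formats the CLI argument each agent passes to `npx playwright test`.
function shardArg(vars: ShardVars): string {
  return `--shard=${vars.SHARD_INDEX}/${vars.SHARD_TOTAL}`;
}

const shardMatrix = buildShardMatrix(3);
shardMatrix.forEach((vars, name) => {
  console.log(`${name}: npx playwright test ${shardArg(vars)}`);
});
```

Printing the entries this way is a quick check that a hand-written matrix has no gaps or duplicate indexes before you change the shard count.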
Merging reports with the blob reporter
Running sharded tests splits your results. If you publish reports directly from each shard, you will get three separate, incomplete HTML reports.
To create a single consolidated report, configure Playwright's blob reporter in your shards, then add a final pipeline job to merge the blob files.
Configure the blob reporter on the sharded test run:
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [['blob']],
});
Upload the blob files from each shard to pipeline artifacts:
- task: PublishPipelineArtifact@1
  inputs:
    targetPath: '$(System.DefaultWorkingDirectory)/blob-report'
    artifact: 'all-blobs-$(SHARD_INDEX)'
    publishLocation: 'pipeline'
  # Upload the blob even when tests in this shard fail
  condition: succeededOrFailed()
  displayName: 'Publish Blob Shard'
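Add a final pipeline job that runs after the sharded job (named TestJob here) completes. This job downloads all blobs, merges them into a single HTML report, and publishes it:
jobs:
- job: MergeReports
  dependsOn: TestJob
  # Merge even when a shard reported failing tests
  condition: succeededOrFailed()
  pool:
    vmImage: 'ubuntu-latest'
  steps:
  - task: NodeTool@0
    inputs:
      versionSpec: '20.x'
    displayName: 'Install Node.js'
  - script: npm ci
    displayName: 'Install Dependencies'
  - task: DownloadPipelineArtifact@2
    inputs:
      buildType: 'current'
      targetPath: '$(Pipeline.Workspace)/all-blobs'
    displayName: 'Download All Blobs'
  # Each artifact is downloaded into its own subfolder, so copy the
  # blob zips into one directory before merging them into an HTML report.
  - script: |
      mkdir -p merged-blobs
      cp "$(Pipeline.Workspace)"/all-blobs/all-blobs-*/*.zip merged-blobs/
      npx playwright merge-reports --reporter html ./merged-blobs
    displayName: 'Merge Blob Reports'
  - task: PublishPipelineArtifact@1
    inputs:
      targetPath: '$(System.DefaultWorkingDirectory)/playwright-report'
      artifact: 'merged-html-report'
      publishLocation: 'pipeline'
    displayName: 'Publish Merged HTML Report'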
Why native Azure DevOps reporting falls short for Playwright
The pipeline runs, the Tests tab shows status, and the HTML report is stored as an artifact. For a small project, this works. For an engineering team shipping daily, three significant gaps emerge.
No cross-build trend visibility
The native Tests tab shows results for the current build. It does not provide historical context.
If a test fails on the main branch, you cannot easily tell if it started failing today, if it fails only on Linux agents, or if its average duration has been rising over the last 30 runs. Identifying patterns requires engineers to manually inspect past pipeline executions.
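Questions like these become simple to answer once per-test results are kept across builds. As a minimal sketch, assuming you have already collected one test's result from each past build into an array ordered oldest to newest, the two functions below find where the current failure streak began and whether the test is getting slower. The BuildResult shape and both function names are hypothetical, and the sample data is made up for illustration.

```typescript
// Hypothetical record of one test's result in one build.
interface BuildResult {
  buildId: number;
  passed: boolean;
  durationMs: number;
}

// Returns the build where the current failure streak began,
// or undefined when the most recent build passed.
function failingSince(results: BuildResult[]): number | undefined {
  let start: number | undefined;
  for (let i = results.length - 1; i >= 0 && !results[i].passed; i--) {
    start = results[i].buildId;
  }
  return start;
}

// Ratio of the mean duration of the latest `window` builds to the mean
// of the `window` builds before them. A value above 1 means slower.
function durationDrift(results: BuildResult[], window: number): number | undefined {
  if (window < 1 || results.length < window * 2) return undefined;
  const mean = (xs: BuildResult[]): number =>
    xs.reduce((sum, b) => sum + b.durationMs, 0) / xs.length;
  const recent = mean(results.slice(-window));
  const previous = mean(results.slice(-2 * window, -window));
  return previous === 0 ? undefined : recent / previous;
}

// Illustrative data: the test slows down, then starts failing at build 105.
const buildHistory: BuildResult[] = [
  { buildId: 101, passed: true, durationMs: 1000 },
  { buildId: 102, passed: true, durationMs: 1000 },
  { buildId: 103, passed: true, durationMs: 1000 },
  { buildId: 104, passed: true, durationMs: 1500 },
  { buildId: 105, passed: false, durationMs: 1500 },
  { buildId: 106, passed: false, durationMs: 1500 },
];

console.log(failingSince(buildHistory)); // 105
console.log(durationDrift(buildHistory, 3)); // 1.5
```

The same idea extends to the agent question by adding an operating system field to each record and grouping before comparing.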
The flaky test blind spot
Automatic retries are essential to keep CI pipelines passing when network requests or animations lag. However, when a test fails once and passes on retry, Azure DevOps marks the build as green.
The initial failure is logged, but it does not appear in high-level dashboards. These hidden failures slow down execution times and mask underlying code instability until the flakiness worsens and blocks a deployment.
Context switching and debugging friction
The manual steps required to debug a failed test add up quickly. If a developer must triage 10 failures a day, downloading artifacts, extracting archives, and launching local trace viewers wastes hours of productive engineering time every week.
Teams need direct, in-browser access to screenshots, logs, and interactive traces immediately from the build summary.
Detecting and managing flaky Playwright tests in Azure DevOps
A flaky test that passes on retry looks like a healthy test, until it blocks a release. Managing test flakiness is critical to maintain pipeline trust.
Playwright's native retry configuration
To enable automatic retries, configure the retries parameter in your Playwright configuration file.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: process.env.CI ? 2 : 0,
  use: {
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
  },
});
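In this setup, Playwright retries a failed test up to 2 times when the CI environment variable is set, which the baseline pipeline does through its env block. The trace: 'on-first-retry' setting records a trace only on the first retry of a failed test, and screenshot: 'only-on-failure' captures a screenshot only when a test fails. Both settings keep artifact storage small.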
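Playwright's reporter API labels each test's outcome as expected, unexpected, or flaky, where flaky means the test failed and then passed on a retry. The sketch below is a simplified model of that labeling rather than Playwright's own code: the classifyOutcome function is hypothetical, and it ignores skipped tests and tests annotated as expected to fail.

```typescript
type AttemptStatus = 'passed' | 'failed';
type Outcome = 'expected' | 'flaky' | 'unexpected';

// Simplified model: `attempts` lists the result of the first run followed
// by each retry, and `retries` is the configured retry limit.
function classifyOutcome(attempts: AttemptStatus[], retries: number): Outcome {
  if (attempts.length === 0) {
    throw new RangeError('attempts must contain at least one result');
  }
  // Only the first run plus the allowed retries count.
  const allowed = attempts.slice(0, retries + 1);
  const firstPass = allowed.indexOf('passed');
  if (firstPass === 0) return 'expected'; // passed on the first run
  if (firstPass > 0) return 'flaky'; // failed first, passed on a retry
  return 'unexpected'; // never passed within the allowed attempts
}

console.log(classifyOutcome(['passed'], 2)); // expected
console.log(classifyOutcome(['failed', 'passed'], 2)); // flaky
console.log(classifyOutcome(['failed', 'failed', 'failed'], 2)); // unexpected
console.log(classifyOutcome(['failed', 'passed'], 0)); // unexpected (no retries allowed)
```

The second case is the one that matters for reporting: the run ends green, yet the first attempt failed.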
Why retries are a double-edged sword
Retries keep your pipeline green and prevent false alarms. However, they also create a false sense of security.
A test suite containing 10 flaky tests will still pass CI, but with retries set to 2 each of those tests can execute up to three times, adding machine time to every run. Without central visibility, developers will not prioritize fixing these tests because the build is passing.
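How much extra machine time retries consume depends on how many tests are flaky and how many attempts each one needs. The sketch below is a back-of-the-envelope estimate, not a Playwright API. The function name, the 300-test suite size, and the assumption that every flaky test fails its first attempt in a given run are all illustrative.

```typescript
// Rough estimate under stated assumptions; not a Playwright API.
interface RetryCost {
  bestCaseRuns: number; // every flaky test passes on its first retry
  worstCaseRuns: number; // every flaky test needs all allowed retries
}

// Counts test executions in one pipeline run, assuming each flaky test
// fails its first attempt and stable tests run exactly once.
function estimateTestExecutions(
  totalTests: number,
  flakyTests: number,
  retries: number,
): RetryCost {
  if (flakyTests > totalTests) {
    throw new RangeError('flakyTests cannot exceed totalTests');
  }
  const extraBestCase = retries > 0 ? flakyTests : 0;
  return {
    bestCaseRuns: totalTests + extraBestCase,
    worstCaseRuns: totalTests + flakyTests * retries,
  };
}

// Illustrative numbers: 300 tests, 10 of them flaky, retries: 2.
const retryCost = estimateTestExecutions(300, 10, 2);
console.log(retryCost.bestCaseRuns); // 310
console.log(retryCost.worstCaseRuns); // 320
```

With these illustrative numbers the suite performs 10 to 20 extra test executions per run. The real cost is higher when the flaky tests are also the slow ones, since a retry repeats the whole test.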
Building a flakiness management strategy
To prevent flaky tests from degrading build speed, you must track them. You need to measure the flaky rate (the percentage of runs where a test required a retry to pass), rank tests by flakiness frequency, and quarantine unstable tests in separate runs until they are resolved.
Native Azure DevOps dashboard widgets do not track these metrics, making it necessary to use dedicated reporting extensions.
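The flaky rate described above is straightforward to compute once each run records whether a test needed a retry. As a minimal sketch, assuming you collect one record per test per pipeline run, the functions below compute the rate, rank tests by it, and list quarantine candidates. The TestRunRecord shape, the function names, the sample data, and the 0.3 threshold are hypothetical choices for illustration.

```typescript
// Hypothetical record of one test in one pipeline run.
interface TestRunRecord {
  testId: string;
  passed: boolean; // final status after retries
  retriesUsed: number; // 0 when the first attempt decided the result
}

interface FlakyStat {
  testId: string;
  runs: number;
  flakyRuns: number; // runs that passed only after at least one retry
  flakyRate: number; // flakyRuns / runs
}

// Aggregates records per test and sorts by flaky rate, highest first.
function rankByFlakiness(records: TestRunRecord[]): FlakyStat[] {
  const byTest = new Map<string, FlakyStat>();
  for (const record of records) {
    const stat = byTest.get(record.testId) ??
      { testId: record.testId, runs: 0, flakyRuns: 0, flakyRate: 0 };
    stat.runs += 1;
    if (record.passed && record.retriesUsed > 0) stat.flakyRuns += 1;
    stat.flakyRate = stat.flakyRuns / stat.runs;
    byTest.set(record.testId, stat);
  }
  return Array.from(byTest.values()).sort(
    (a, b) => b.flakyRate - a.flakyRate || a.testId.localeCompare(b.testId),
  );
}

// Tests at or above the threshold are candidates for a separate quarantine run.
function quarantineCandidates(stats: FlakyStat[], threshold: number): string[] {
  return stats.filter((s) => s.flakyRate >= threshold).map((s) => s.testId);
}

const rec = (testId: string, passed: boolean, retriesUsed: number): TestRunRecord =>
  ({ testId, passed, retriesUsed });

// Illustrative data covering several runs of three tests.
const runRecords: TestRunRecord[] = [
  rec('login', true, 0), rec('login', true, 1), rec('login', true, 0), rec('login', true, 2),
  rec('checkout', true, 0), rec('checkout', false, 2),
  rec('search', true, 1), rec('search', true, 0), rec('search', true, 0), rec('search', true, 0),
];

const ranked = rankByFlakiness(runRecords);
for (const s of ranked) {
  console.log(`${s.testId}: ${s.flakyRuns}/${s.runs} flaky runs`);
}
console.log(quarantineCandidates(ranked, 0.3)); // [ 'login' ]
```

Note that the checkout test, which failed outright after its retries, has a flaky rate of 0 here. A hard failure is a different signal from flakiness and is already visible as a red build.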
Getting real Playwright test reporting inside Azure DevOps
While the native Tests tab and downloaded HTML reports can be sufficient for teams executing just 1 to 3 runs weekly, this manual approach becomes a bottleneck as your testing scales. To resolve the manual download loop and track trends for a larger suite, you can integrate TestDino directly into your project.
The TestDino extension: Stay where you work
The TestDino Azure DevOps extension adds a dedicated TestDino tab to your left navigation sidebar. Instead of downloading zip files or switching platforms, you can view your test runs directly inside the Azure DevOps user interface.
The dashboard displays recent test runs with pass, fail, skipped, and flaky counts, along with metadata, such as execution duration, commit hash, branch, and test environment. You can filter runs by date range or run type, and refresh the dashboard with a single click.
Connecting TestDino in 4 simple steps
The extension acts as a read-only visualization layer that fetches your test run data securely over HTTPS. Connecting it takes under five minutes:
- Create a TestDino account: If you do not have a TestDino account, head over to TestDino to create one. Your account will securely host the test results that Azure DevOps displays.
- Install the extension: Search for TestDino in the Visual Studio Marketplace, install it, and select your Azure DevOps organization. The TestDino tab will appear in your project's left sidebar.
- Generate a project token: Log in to TestDino, navigate to your Project Settings, go to API Keys (or Integrations), and generate a new read-only Project Access Token.
- Connect the tab: Click the TestDino tab in Azure DevOps, paste your generated token, and click Connect.

Once connected, any test results uploaded from your CI pipeline runs will automatically populate the dashboard.
Comparing Playwright test reporting tools for Azure DevOps
The right reporting setup depends on your team's size, engineering workflow, and data retention requirements.
Here is how the main reporting approaches compare head-to-head:
| Approach | Cost | Setup | Dashboard Location | Flaky Detection | Trace Viewer | Best For |
|---|---|---|---|---|---|---|
| Native JUnit | Free | Low | | | | Small projects |
| HTML Artifact | Free | Low | | | | Basic debugging |
| Allure Extension | Free (OSS) | Medium | | | | Visual dashboards |
| Azure Test Plans | Paid | Medium | | | | Enterprise traceability |
| MS Playwright Testing | Paid | Medium | | | | Large-scale suites |
| TestDino Extension | Free / Paid | Low | | | | Teams needing fast triage |
Wrapping up: Setting up clean Playwright reporting
Running Playwright test suites in Azure DevOps pipelines is straightforward, but verifying and triaging results can quickly become a bottleneck for developers. Native options provide basic pass/fail tracking, but they force developers into a slow cycle of downloading and extracting artifact files to debug failures.
To improve engineering velocity, you should select a reporting setup that surfaces failure details, provides direct trace access, and exposes flaky test trends where you work. To start building a highly visible test reporting workflow, you can review the setup docs or sign up for TestDino's free Community plan to run the marketplace extension on your pipelines.
Vishwas Tiwari
Software Engineer

