Grouping Playwright Tests: test.describe, Tags & Grep Guide

Group Playwright tests using test.describe, tags, and grep to speed up CI, run focused smoke and regression suites, and keep reports clean and actionable for your team.

Playwright gives you a way to bundle tests together under a name, stick labels on them, and then tell the runner "only run the ones with this label."

The grouping shows up in your test reports, so when something fails, you immediately know which part of the app it belongs to.

The trouble is, most projects skip this step early on. Tests pile up in flat files, every CI run triggers every test, and a single failure in a 200-test report becomes a needle in a haystack. That is the exact point where teams start losing trust in their test suite.

This guide walks through how to group Playwright tests using test.describe, tags, and the --grep flag, with examples you can drop into a real project today.

What is test.describe and how does it group tests?

test.describe is Playwright's API for wrapping related tests inside a named block. It takes a title and a callback, and every test declared inside that callback belongs to the group.

When you have hundreds of tests across dozens of files, seeing the group name right next to the test name tells you exactly which part of the app broke, without even opening the file.

The title you give the block becomes a prefix in the test report. So a test called 'should show error' inside test.describe('Login Page', ...) shows up as Login Page › should show error in your results.

But naming is only the beginning. Grouping Playwright tests also unlocks three other capabilities that matter as your project grows:

  • Scoped hooks. You can attach beforeEach, afterEach, beforeAll, and afterAll hooks to a specific group, so setup logic only runs for the tests that actually need it.
  • Shared configuration. Using test.use() inside a describe block lets you apply browser settings, viewport sizes, or authentication state to just that group of tests.
  • Hierarchical structure. Nesting describe blocks inside each other creates a tree that mirrors your application's feature areas, making it obvious where a failure lives.

Basic syntax and usage

login.spec.ts
import { test, expect } from "@playwright/test";

// Relative URLs such as "/login" assume baseURL is set in playwright.config.ts.
test.describe("Login Page", () => {
  test("should display the login form", async ({ page }) => {
    await page.goto("/login");
    await expect(page.locator("#login-form")).toBeVisible();
  });

  test("should reject invalid credentials", async ({ page }) => {
    await page.goto("/login");
    await page.fill("#email", "[email protected]");
    await page.fill("#password", "badpass");
    await page.click("#submit");
    await expect(page.locator(".error-message")).toContainText("Invalid");
  });

  test("should redirect on successful login", async ({ page }) => {
    await page.goto("/login");
    await page.fill("#email", "[email protected]");
    await page.fill("#password", "correctpassword");
    await page.click("#submit");
    await expect(page).toHaveURL("/dashboard");
  });
});

This is the foundation of test suite organization in Playwright. If you are already using Playwright assertions with expect() in your tests, wrapping them inside describe blocks is the natural next step. Every test lives under a clear label, and the report reflects that structure automatically.

Playwright also supports an anonymous form where you skip the title entirely. This is handy when you want to apply a shared option (like dark mode or a specific viewport) to a set of tests without adding a label to the report:

dark-mode.spec.ts
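import { test, expect } from "@playwright/test";

// Anonymous group: no title, so nothing is added to the report prefix.
test.describe(() => {
  // Applies only to the tests declared inside this block.
  test.use({ colorScheme: "dark" });

  test("homepage in dark mode", async ({ page }) => {
    await page.goto("/");
    await expect(page.locator("body")).toHaveCSS(
      "background-color",
      "rgb(0, 0, 0)",
    );
  });
});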

Nested test.describe blocks for complex suites

You can nest test.describe blocks inside each other: an outer block for the feature, and inner blocks for specific user scenarios. Each level gets its own hooks and its own prefix in the report.

checkout.spec.ts
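import { test } from '@playwright/test';
// openHome and screenshotName are project helpers defined elsewhere (not shown here).

test.describe('Checkout Flow', () => {
  test.describe('Path Demo', () => {
    test('should load home page', async ({ page }, testInfo) => {
      await openHome(page);
      await page.screenshot({
        path: testInfo.outputPath(screenshotName(testInfo)),
        fullPage: true,
      });
    });
  });
});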

The report tree looks like: Checkout Flow › Path Demo › should load home page.
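To make the prefixing rule concrete, here is a minimal sketch of how a nested group tree flattens into report titles. The Suite type and reportTitles function are hypothetical names invented for this illustration; they are not Playwright's internal data structures.

```typescript
// Simplified model of a describe tree (hypothetical, for illustration only).
type Suite = {
  title: string;
  tests: string[];
  suites: Suite[];
};

// Walk the tree and build "Outer › Inner › test name" titles.
function reportTitles(suite: Suite, prefix: string[] = []): string[] {
  const path = prefix.concat(suite.title);
  const titles = suite.tests.map((name) => path.concat(name).join(" › "));
  for (const child of suite.suites) {
    titles.push(...reportTitles(child, path));
  }
  return titles;
}

const checkout: Suite = {
  title: "Checkout Flow",
  tests: [],
  suites: [
    { title: "Path Demo", tests: ["should load home page"], suites: [] },
  ],
};

const titles = reportTitles(checkout);
console.log(titles);
```

Every extra nesting level adds one more segment to each title, which is one reason to keep the tree shallow.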

Tip: Stick to a maximum of three nesting levels. Going deeper usually means the file itself should be split into separate spec files.

One common mistake with nested describe blocks is misjudging how far a hook reaches. A beforeEach declared in an outer block also runs for tests in the blocks nested inside it (outer first, then inner), but a beforeEach declared in an inner block never runs for tests in sibling blocks or in the outer block. If a setup step is needed everywhere, put it at the file level.

Applying beforeEach and afterAll hooks inside describe blocks

When you declare beforeEach inside a describe block, it only runs for tests in that block. A beforeEach at the file level runs for every test in the file. This distinction matters because it directly controls what setup each test group gets.

dashboard.spec.ts
import { test, expect } from "@playwright/test";

// File-level: runs before EVERY test in this file
test.beforeEach(async ({ page }) => {
  await page.goto("/");
});

test.describe("Dashboard", () => {
  // Describe-level: runs only before Dashboard tests
  test.beforeEach(async ({ page }) => {
    await page.click("#nav-dashboard");
  });

  test("should show stats", async ({ page }) => {
    await expect(page.locator("#stats")).toBeVisible();
  });

  // Runs once, after the last Dashboard test has finished
  test.afterAll(async () => {
    console.log("Dashboard tests done");
  });
});

The execution order for a Dashboard test goes like this:

  1. File-level beforeEach (navigates to /)
  2. Describe-level beforeEach (clicks dashboard nav)
  3. The test itself
  4. Describe-level afterAll (runs once after all Dashboard tests finish)
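The ordering above can be sketched as a toy model in plain TypeScript. Nothing here is a Playwright API: Group, beforeEachChain, and runGroup are hypothetical names, and the second test ("should show chart") is invented to show that beforeEach repeats per test while afterAll runs once.

```typescript
type Hook = () => void;

// Hypothetical model of a scope: the file itself, or a describe block inside it.
type Group = {
  beforeEach: Hook[];
  afterAll: Hook[];
  tests: Array<{ name: string; body: Hook }>;
  parent?: Group;
};

// Collect beforeEach hooks from the outermost scope down to the test's own group.
function beforeEachChain(group: Group): Hook[] {
  const inherited = group.parent ? beforeEachChain(group.parent) : [];
  return inherited.concat(group.beforeEach);
}

function runGroup(group: Group): void {
  for (const t of group.tests) {
    for (const hook of beforeEachChain(group)) hook();
    t.body();
  }
  // afterAll runs once, after the last test of the group.
  for (const hook of group.afterAll) hook();
}

const log: string[] = [];

const file: Group = {
  beforeEach: [() => log.push("file beforeEach")],
  afterAll: [],
  tests: [],
};

const dashboard: Group = {
  parent: file,
  beforeEach: [() => log.push("describe beforeEach")],
  afterAll: [() => log.push("describe afterAll")],
  tests: [
    { name: "should show stats", body: () => log.push("test: should show stats") },
    { name: "should show chart", body: () => log.push("test: should show chart") },
  ],
};

runGroup(dashboard);
console.log(log.join("\n"));
```

The model leaves out beforeAll, afterEach, and worker boundaries; it only illustrates which hooks wrap which tests.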

Note: You can use test.describe.configure({ mode: 'serial' }) to make the tests inside a block run strictly one after another as a unit: if one fails, the remaining tests in the block are skipped. This is useful for multi-step flows (like onboarding wizards) where each test depends on the previous one completing. Use it sparingly though, because independent tests can run in parallel, which is faster and exposes isolation bugs that serial tests hide.
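As a rough illustration of what serial mode means for a multi-step flow, the sketch below runs steps in order and marks everything after the first failure as skipped. It is a toy model, not Playwright code: runSerial and the onboarding step names are assumptions made up for this example.

```typescript
type Outcome = "passed" | "failed" | "skipped";

// Toy model of serial mode: run in order, skip everything after the first failure.
function runSerial(
  tests: Array<{ name: string; run: () => boolean }>,
): Record<string, Outcome> {
  const results: Record<string, Outcome> = {};
  let failed = false;
  for (const t of tests) {
    if (failed) {
      results[t.name] = "skipped";
      continue;
    }
    const ok = t.run();
    results[t.name] = ok ? "passed" : "failed";
    if (!ok) failed = true;
  }
  return results;
}

const onboarding = runSerial([
  { name: "step 1: create account", run: () => true },
  { name: "step 2: verify email", run: () => false },
  { name: "step 3: choose plan", run: () => true },
]);

console.log(onboarding);
```

The skipped third step is the cost of serial mode: one early failure hides the state of everything that follows it.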

Grouping tests with modifiers: test.describe.only vs test.describe.skip

Sometimes you need to temporarily change which groups run. Playwright gives you two modifiers for this.

test.describe.only() tells Playwright to run only this group and skip everything else. It is the go-to for local debugging when you want to isolate a failing group without touching any other code. Pair it with the Playwright Trace Viewer to get a full visual timeline of what happened during the test run.

code
test.describe.only("Payment Processing", () => {
  test("should process valid card", async ({ page }) => {
    // This WILL run
  });
});

test.describe("User Profile", () => {
  test("should update name", async ({ page }) => {
    // This will NOT run
  });
});

test.describe.skip() does the opposite. It marks the entire group as skipped. Tests show up in the report as "skipped" but never execute. Use this for features that are blocked or under active development.

code
test.describe.skip("Experimental Feature", () => {
  test("should render new widget", async ({ page }) => {
    // Skipped
  });
});

There is also test.describe.fixme() which works like skip but carries a stronger signal: "these are broken and need fixing."

Modifier | What it does | When to use
test.describe.only() | Runs only this group, skips all others | Local debugging, isolating failures
test.describe.skip() | Skips the group, runs everything else | Blocked features, WIP code
test.describe.fixme() | Marks group as broken, does not run | Known failures needing attention
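The table can be read as a small decision rule. The sketch below is a simplified model of that rule, assuming only group-level modifiers and ignoring per-test modifiers; groupsThatRun and the "Legacy Export" group are hypothetical names used for illustration.

```typescript
type Modifier = "none" | "only" | "skip" | "fixme";
type GroupDecl = { title: string; modifier: Modifier };

// Toy model: if any group is focused with "only", just those run;
// otherwise every group runs except the ones marked skip or fixme.
function groupsThatRun(groups: GroupDecl[]): string[] {
  const focused = groups.some((g) => g.modifier === "only");
  return groups
    .filter((g) => (focused ? g.modifier === "only" : g.modifier === "none"))
    .map((g) => g.title);
}

const withOnly = groupsThatRun([
  { title: "Payment Processing", modifier: "only" },
  { title: "User Profile", modifier: "none" },
]);

const withoutOnly = groupsThatRun([
  { title: "User Profile", modifier: "none" },
  { title: "Experimental Feature", modifier: "skip" },
  { title: "Legacy Export", modifier: "fixme" },
]);

console.log(withOnly, withoutOnly);
```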

Tip: Never commit .only() to your main branch. Run npx playwright test --forbid-only in CI so the pipeline fails if a stray .only() slips through. Without it, a focused group silently reduces what CI actually covers.
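If you also want to catch a stray .only() before it reaches CI, a pre-commit check is one option. The sketch below is a hypothetical helper, not part of Playwright and not how --forbid-only is implemented: it simply scans source text with a regular expression, so it will also flag matches inside comments.

```typescript
// Matches test.only(, test.describe.only(, and the serial/parallel describe variants.
const ONLY_CALL = /\btest(?:\.describe(?:\.(?:serial|parallel))?)?\.only\s*\(/;

// Return the 1-based line numbers that contain a focused test or group.
function findOnlyCalls(source: string): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((line, index) => {
    if (ONLY_CALL.test(line)) hits.push(index + 1);
  });
  return hits;
}

const sample = [
  'test.describe.only("Payment Processing", () => {',
  '  test("should process valid card", async () => {});',
  "});",
  'test.only("quick check", async () => {});',
].join("\n");

const focusedLines = findOnlyCalls(sample);
console.log(focusedLines);
```

A check like this complements --forbid-only rather than replacing it, since the CI flag relies on Playwright's own view of the loaded tests.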

Using tags and grep to filter tests

test.describe groups tests inside a file. Tags and grep work across your entire project. They let you label tests by purpose and then run exactly the subset you need from the command line.

This is the mechanism behind any serious smoke and regression testing strategy. It also helps isolate the root causes when Playwright tests fail in CI.

How to add tags to your tests

Tags are string labels prefixed with @ that categorize tests by purpose (like @smoke, @regression, @flaky). Unlike describe blocks, tags create cross-cutting categories across multiple files.

There are two ways to add them. The recommended approach (available since Playwright v1.42) uses the tag option in a details object. The older way embeds the tag directly in the test title.

code
// Method 1: tag option (recommended, available since v1.42)
test(
  "should load homepage",
  {
    tag: "@smoke",
  },
  async ({ page }) => {
    await page.goto("/");
    await expect(page).toHaveTitle(/Home/);
  },
);

// Method 2: inline in the title
test("should load homepage @smoke", async ({ page }) => {
  // works, but the tag is now part of the test name in reports
});

You can also tag a whole describe block. Every test inside inherits the tag:

code
test.describe(
  "Critical Paths",
  {
    tag: ["@smoke", "@critical"],
  },
  () => {
    test("login works", async ({ page }) => {
      /* inherits @smoke, @critical */
    });

    test(
      "checkout works",
      {
        tag: "@e2e",
      },
      async ({ page }) => {
        /* has @smoke, @critical, AND @e2e */
      },
    );
  },
);

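A common gotcha: forgetting the @ prefix. The tag option expects every tag to start with @, and in the title-based form a label without @ is just ordinary title text, so a filter like --grep "@smoke" will not match it. Always start with @.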
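The inheritance rule and the @ prefix convention can be modeled in a few lines. This is a sketch under stated assumptions: effectiveTags and toTagList are hypothetical helpers, and rejecting tags without @ is a choice made in this model to mirror the convention, not a copy of Playwright's validation code.

```typescript
// Normalize the "tag" value, which may be a single string or a list.
function toTagList(tag: string | string[] | undefined): string[] {
  if (tag === undefined) return [];
  return Array.isArray(tag) ? tag : [tag];
}

// Toy model: a test's effective tags are the group's tags plus its own.
function effectiveTags(
  describeTag: string | string[] | undefined,
  testTag: string | string[] | undefined,
): string[] {
  const merged = toTagList(describeTag).concat(toTagList(testTag));
  for (const t of merged) {
    if (!t.startsWith("@")) throw new Error(`Tag "${t}" must start with "@"`);
  }
  // Drop duplicates while keeping first-seen order.
  return merged.filter((t, i) => merged.indexOf(t) === i);
}

const loginTags = effectiveTags(["@smoke", "@critical"], undefined);
const checkoutTags = effectiveTags(["@smoke", "@critical"], "@e2e");

console.log(loginTags, checkoutTags);
```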

Running tests by tag using the --grep flag

Once tests are tagged, --grep filters what runs:

terminal
# Only smoke tests
npx playwright test --grep "@smoke"
# Tests with either tag (OR)
npx playwright test --grep "@smoke|@regression"
# Tests with BOTH tags (AND, using regex lookaheads)
npx playwright test --grep "(?=.*@smoke)(?=.*@critical)"
# Target a specific describe group by name
npx playwright test --grep "Login Page"

That last one is worth noting. --grep matches against the full test title, which includes the describe block name. So you can filter by group name, tag, or even part of a test name.
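To see why one flag can filter by group, tag, or test name, it helps to model what the pattern is matched against. The sketch below is a simplification: it joins the describe titles, the test title, and the tags into one string and applies the regex to it. TaggedTest, searchableTitle, and grep are hypothetical names, the sample tests are invented, and details such as case handling are deliberately left out.

```typescript
type TaggedTest = { path: string[]; tags: string[] };

// Simplified: describe titles + test title + tags, joined into one searchable string.
function searchableTitle(t: TaggedTest): string {
  return t.path.concat(t.tags).join(" ");
}

function grep(tests: TaggedTest[], pattern: string): string[] {
  const re = new RegExp(pattern);
  return tests
    .filter((t) => re.test(searchableTitle(t)))
    .map((t) => t.path.join(" › "));
}

const suite: TaggedTest[] = [
  { path: ["Login Page", "should display the login form"], tags: ["@smoke"] },
  { path: ["Critical Paths", "login works"], tags: ["@smoke", "@critical"] },
  { path: ["Checkout Flow", "should apply coupon"], tags: ["@regression"] },
  { path: ["User Profile", "should update name"], tags: [] },
];

const smoke = grep(suite, "@smoke");
const smokeOrRegression = grep(suite, "@smoke|@regression");
const smokeAndCritical = grep(suite, "(?=.*@smoke)(?=.*@critical)");
const loginGroup = grep(suite, "Login Page");

console.log({ smoke, smokeOrRegression, smokeAndCritical, loginGroup });
```

The lookahead pattern works because each (?=...) must succeed from the same starting position, so both tags have to appear somewhere in the string.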

Using --grep-invert to exclude specific test groups

--grep-invert is the opposite of --grep. It skips everything that matches and runs the rest.

terminal
# Run everything except slow tests
npx playwright test --grep-invert "@slow"
# Combine both for precision
npx playwright test --grep "@smoke" --grep-invert "@flaky"

That last command runs all smoke tests except the flaky ones. In CI, this is how you get fast, reliable feedback without being blocked by known intermittent failures.
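Combining the two flags amounts to "must match the first pattern and must not match the second". Here is a minimal sketch of that rule; select and Candidate are hypothetical names, and the sample tests are invented for the example.

```typescript
type Candidate = { title: string; tags: string[] };

// Toy model: keep a test when it matches grep (if given) and does not match grepInvert (if given).
function select(tests: Candidate[], grep?: RegExp, grepInvert?: RegExp): string[] {
  return tests
    .filter((t) => {
      const text = `${t.title} ${t.tags.join(" ")}`;
      const included = grep ? grep.test(text) : true;
      const excluded = grepInvert ? grepInvert.test(text) : false;
      return included && !excluded;
    })
    .map((t) => t.title);
}

const candidates: Candidate[] = [
  { title: "login works", tags: ["@smoke"] },
  { title: "search autocompletes", tags: ["@smoke", "@flaky"] },
  { title: "exports large report", tags: ["@regression", "@slow"] },
];

const stableSmoke = select(candidates, /@smoke/, /@flaky/);
const everythingButSlow = select(candidates, undefined, /@slow/);

console.log(stableSmoke, everythingButSlow);
```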

You can also bake these filters directly into playwright.config.ts so you do not have to type long CLI flags every time:

playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      name: "smoke",
      grep: /@smoke/,
      retries: 0,
    },
    {
      name: "regression",
      grep: /@regression/,
      retries: 2,
    },
    {
      name: "all-except-flaky",
      grepInvert: /@flaky/,
    },
  ],
});

Now running npx playwright test --project=smoke does exactly what you'd expect. Clean, no extra flags, and the strategy is self-documented in your config.

You can also use testMatch and testIgnore in project config to filter by file path patterns instead of tags. For example, testMatch: /.*smoke\.spec\.ts/ runs only files whose names end in smoke.spec.ts. This pairs well with tag-based grep when you want both file-level and test-level filtering.
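File-level filtering follows the same include/exclude idea, just applied to paths instead of titles. The sketch below is a toy model using regular expressions only (glob strings are not modeled); selectFiles and the *.smoke.spec.ts file names are assumptions invented for the example.

```typescript
// Toy model: keep files that match testMatch and do not match testIgnore.
function selectFiles(files: string[], testMatch: RegExp, testIgnore?: RegExp): string[] {
  return files.filter((f) => testMatch.test(f) && !(testIgnore && testIgnore.test(f)));
}

const files = [
  "tests/auth/login.smoke.spec.ts",
  "tests/auth/signup.spec.ts",
  "tests/checkout/payment.smoke.spec.ts",
  "tests/checkout/cart.spec.ts",
];

// Escaping the dots keeps the pattern from matching unrelated names.
const smokeFiles = selectFiles(files, /.*smoke\.spec\.ts/);
const smokeOutsideCheckout = selectFiles(files, /.*smoke\.spec\.ts/, /checkout\//);

console.log(smokeFiles, smokeOutsideCheckout);
```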

Figure: Test Suite Execution Time by Organization Strategy

Note: The grep property in the config uses regex literals (/@smoke/), not strings. On the CLI, --grep accepts a string that Playwright internally converts to regex.

Best practices for large projects

Once your project grows past a few dozen spec files, conventions become critical. Good Playwright test management starts with folder structure and naming rules that the entire team follows. Here is what consistently works at scale.

Folder structure that mirrors your app:

code
tests/
├── auth/
│   ├── login.spec.ts
│   ├── signup.spec.ts
│   └── password-reset.spec.ts
├── dashboard/
│   ├── widgets.spec.ts
│   └── analytics.spec.ts
├── checkout/
│   ├── cart.spec.ts
│   └── payment.spec.ts
└── playwright.config.ts

This gives you three practical benefits:

  • Parallel execution by default. Playwright parallelizes at the file level, so more files means better worker utilization.
  • Clear ownership. You can assign CODEOWNERS per folder, so the right team gets notified when their tests fail.
  • Selective CI runs. Changed files in auth/ can trigger only the auth tests via path-based filtering.

Naming conventions the team agrees on:

Convention | Example | Why
Describe titles = feature names | 'Checkout Flow' | Maps tests to product areas in reports
Tags = lowercase with @ | @smoke, @regression | Predictable grep patterns
Test names start with "should" | 'should reject expired card' | Makes intent immediately clear
One spec file per feature | payment.spec.ts | Better parallelization and ownership

Tip: Create a TAGS.md or a constants file at the root of your test directory that documents every tag, its meaning, and when to apply it. Without this, teams inevitably end up with duplicate tags like @smoke, @Smoke, and @critical-smoke all meaning the same thing.
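If you go the constants-file route, a sketch might look like the following. The file layout, the Tags object, and the unknownTags helper are all hypothetical; the point is only that tag names live in one place and typos become detectable.

```typescript
// Hypothetical tests/tags.ts: a single source of truth for tag names.
const Tags = {
  smoke: "@smoke", // fast checks that gate every PR
  regression: "@regression", // full coverage, run on merge
  flaky: "@flaky", // known intermittent failures, excluded from gating runs
  slow: "@slow", // long-running flows, run nightly
} as const;

type Tag = (typeof Tags)[keyof typeof Tags];

const KNOWN_TAGS: string[] = (Object.keys(Tags) as Array<keyof typeof Tags>).map(
  (key) => Tags[key],
);

// Report tags that are not in the agreed list (for example, case variants).
function unknownTags(used: string[]): string[] {
  return used.filter((t) => KNOWN_TAGS.indexOf(t) === -1);
}

const gate: Tag = Tags.smoke;
const strays = unknownTags(["@smoke", "@Smoke", "@critical-smoke"]);

console.log(gate, strays);
```

In a spec file you would then pass the constant instead of a string literal, for example { tag: Tags.smoke }, so a misspelled tag fails at compile time rather than silently dropping out of a grep run.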

CI/CD integration

This is where grouping Playwright tests actually pays off. Instead of running everything on every push, you run different groups at different stages. Quick smoke tests on PRs. Full regression on merge. Slow E2E on a nightly schedule.

GitHub Actions example

.github/workflows/playwright.yml
name: Playwright Tests
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
  schedule:
    # Nightly at 02:00 UTC. Neither job below runs on this trigger; add a nightly job to use it.
    - cron: "0 2 * * *"
jobs:
  smoke-tests:
    name: Smoke Tests
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
      - run: npm ci
      - run: npx playwright install --with-deps chromium
      # Assumes a project named "chromium" exists in playwright.config.ts
      - run: npx playwright test --grep "@smoke" --project=chromium
      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: smoke-report
          path: playwright-report/
  regression-tests:
    name: Full Regression
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    strategy:
      matrix:
        shard: [1/3, 2/3, 3/3]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: "npm"
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test --grep "@regression" --shard=${{ matrix.shard }} --forbid-only

A few things to notice:

  • Smoke tests install only Chromium to keep the step fast.
  • Regression uses Playwright sharding to split across 3 parallel jobs.
  • --forbid-only prevents accidental .only() calls from limiting coverage in CI.

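On a suite where the full run takes 15+ minutes, running only smoke tests on PRs can bring feedback down to under 3, though the exact numbers depend on how many tests carry the @smoke tag. You can also add --last-failed in a follow-up step to re-run only the tests that failed, which helps confirm whether a failure is flaky without re-running the entire suite.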
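The idea behind the 1/3, 2/3, 3/3 shards in the workflow is simply to give each job a different slice of the suite. The sketch below shows that idea with contiguous chunks of a file list. It is an illustration only: the shard function is hypothetical, and Playwright's actual sharding operates on its own test groups rather than on a plain array of file names.

```typescript
// Toy model: return the slice of `items` that belongs to shard `current` of `total`.
function shard<T>(items: T[], current: number, total: number): T[] {
  if (current < 1 || current > total) {
    throw new Error(`Invalid shard ${current}/${total}`);
  }
  const size = Math.ceil(items.length / total);
  return items.slice((current - 1) * size, current * size);
}

const specFiles = [
  "auth/login.spec.ts",
  "auth/signup.spec.ts",
  "auth/password-reset.spec.ts",
  "dashboard/widgets.spec.ts",
  "dashboard/analytics.spec.ts",
  "checkout/cart.spec.ts",
  "checkout/payment.spec.ts",
];

const shards = [1, 2, 3].map((n) => shard(specFiles, n, 3));
console.log(shards);
```

Every file lands in exactly one shard, so the three jobs together cover the whole suite without overlap.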

TestDino takes this further by providing a cloud execution environment where grouped and tagged Playwright tests run with parallel workers out of the box. It integrates with common Playwright CI/CD pipelines and provides dashboards that visualize results by tag and group.

Playwright vs Jest vs Cypress

If you are coming from Jest or Cypress, the describe pattern will feel familiar. The real differences are in tagging, filtering, and parallel execution.

Feature | Playwright | Jest | Cypress
Describe syntax | test.describe() | describe() | describe()
Native tagging | Built-in tag option (v1.42+) | Needs plugins | Limited
CLI grep + invert | --grep + --grep-invert | -t (title only) | --spec (file only)
Config-level grep | testConfig.grep | Not available | Not available
Free parallel execution | Workers + sharding | By file | Paid (Cypress Cloud)
Serial mode control | test.describe.configure() | Always sequential in file | Always sequential

Coming from Jest? Replace describe() with test.describe() and it() with test(). The big upgrade is native tagging and --grep-invert.

Coming from Cypress? The biggest wins are free parallel execution and the ability to run "all smoke tests across all files" from the CLI, which Cypress cannot do without workarounds or paid cloud features.

Figure: Overall Test Filtering & Grouping Capabilities

Conclusion

Grouping Playwright tests is the foundation that makes everything else work: fast CI pipelines, clean reports, targeted debugging, and team-wide clarity on what tests cover which features.

Here is the sequence that works:

  1. Start with test.describe to give tests a logical structure and scope your hooks to the groups that need them.
  2. Add tags like @smoke, @regression, and @flaky to create cross-cutting categories that span across files.
  3. Wire --grep and --grep-invert into your CI workflow so PRs get fast smoke feedback and merges trigger full regression coverage.
  4. Set up config-level projects with grep/grepInvert to make the strategy permanent and self-documenting.

Each layer builds on the previous one. Skip any of them and the suite eventually becomes the slow, unreliable bottleneck that nobody trusts.

FAQs

What is test.describe used for in Playwright?
test.describe is an API used to logically group related test cases together. It helps organize your files, lets you apply shared setup hooks (like beforeEach), and makes your test results much easier to read. The group title becomes a prefix in the test report, giving you immediate context on which feature area a test belongs to.

How do you run only a specific group of tests in Playwright?
You can run a specific group directly in your code using test.describe.only(). Alternatively, you can use the command line with the --grep flag and a matching pattern (for example, npx playwright test --grep "@smoke"). This lets you run specific groups without altering your code.

What are Playwright test tags and how do they work?
Tags are custom string labels (like @smoke or @regression) that you add to your test titles or via the tag option (available since v1.42). When combined with the --grep flag, they allow you to filter and run only tests with those specific tags, which is especially useful for CI/CD pipelines.

Can you nest test.describe blocks in Playwright?
Yes, Playwright fully supports nesting them. This lets you create a hierarchy where an outer block represents a feature and inner blocks represent specific scenarios. Each block level can have its own dedicated setup and teardown hooks.

What is the difference between --grep and --grep-invert in Playwright?
The --grep flag runs only the tests that match your provided pattern or tag. The --grep-invert flag does the exact opposite: it skips the tests that match your pattern and runs everything else. Using them together gives you fine-grained control over what runs. For example, npx playwright test --grep "@smoke" --grep-invert "@flaky" runs all smoke tests except the flaky ones.

Dhruv Rai

Product & Growth Engineer

Dhruv Rai is a Product and Growth Engineer at TestDino, focusing on developer automation and product workflows. His work involves building solutions around Playwright, CI/CD, and developer tooling to improve release reliability.

He contributes through technical content and product initiatives that help engineering teams adopt modern testing practices and make informed tooling decisions.
