CLI: Add site editor performance benchmark#3408

Merged
brandonpayton merged 17 commits into trunk from cli-perf-benchmark on Mar 18, 2026

@brandonpayton (Member) commented Mar 17, 2026

Summary

  • Add a Playwright-based site editor performance benchmark for the Playground CLI, adapted from Studio's tools/benchmark-site-editor/
  • Measure 6 metrics: server startup time plus 5 site editor interaction metrics
  • Support both unbuilt-jspi and built CLI modes via --mode flag, with configurable rounds and an optional plugins-loaded variant

How it works

The benchmark spawns a Playground CLI server via Nx targets (npx nx unbuilt-jspi playground-cli or npx nx start playground-cli), then launches headless Chromium to navigate through the WordPress site editor and time each interaction. Results are aggregated using median across rounds and output as both a console table and a JSON artifact.
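
As a rough illustration of the median aggregation described above, the sketch below collapses per-round samples into one value per metric. The `RoundMetrics` shape and the `median` and `aggregate` helpers are hypothetical names, not taken from the benchmark source, and averaging the two middle values for an even number of rounds is an assumption.

```typescript
// Hypothetical shape: metric name -> elapsed milliseconds for one round.
type RoundMetrics = Record<string, number>;

// Median of a non-empty list. For an even count, the two middle values
// are averaged (an assumption; the benchmark may choose differently).
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error('median() requires at least one value');
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Group samples by metric name, then reduce each group to its median.
function aggregate(rounds: RoundMetrics[]): RoundMetrics {
  const samples: Record<string, number[]> = {};
  for (const round of rounds) {
    for (const [metric, ms] of Object.entries(round)) {
      const list = samples[metric] ?? [];
      list.push(ms);
      samples[metric] = list;
    }
  }
  const result: RoundMetrics = {};
  for (const [metric, values] of Object.entries(samples)) {
    result[metric] = median(values);
  }
  return result;
}
```

A median is less sensitive than a mean to a single slow round, which is presumably why it suits a small number of benchmark rounds.
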

Metrics measured:

| Metric | Description |
| --- | --- |
| serverStartup | Time from spawning the CLI process until the server responds to HTTP requests |
| siteEditorLoad | Time from clicking Appearance → Editor until blocks render in the editor canvas |
| templatesViewLoad | Time to open the Templates view and load the template grid |
| templateOpen | Time to open a specific template in the editor |
| blockAdd | Time to insert Paragraph + Heading blocks via the Block Inserter |
| templateSave | Time from saving the template until the "Saved" button appears |

Usage:

npx nx perf playground-cli
npx nx perf playground-cli -- --mode=built --rounds=5 --with-plugins
npx nx perf playground-cli -- --headed --rounds=1

Test plan

  • Run npx nx perf playground-cli -- --rounds=1 and verify it starts the CLI, runs measurements, prints a results table (including serverStartup), and saves a JSON artifact
  • Run with --mode=built and verify it builds first, then benchmarks the built CLI
  • Run with --with-plugins and verify both bare and with-plugins environments are benchmarked
  • Run with --headed and verify Chromium launches visibly for debugging
  • Verify npx nx lint playground-cli passes (no no-console errors from perf files)

🤖 Generated with Claude Code

Add a Playwright-based performance benchmark for the Playground CLI
that measures 5 site editor metrics (siteEditorLoad, templatesViewLoad,
templateOpen, blockAdd, templateSave). Adapted from Automattic/studio's
tools/benchmark-site-editor.

Run via: npx nx perf playground-cli

Supports --mode=unbuilt-jspi (default) and --mode=built to test against
different CLI targets, --with-plugins for a plugins-loaded variant,
and --rounds=N for statistical reliability via median aggregation.

Measure time from spawning the CLI process until the server
responds to HTTP requests. Reported alongside the per-round
site editor metrics.

Remove section separator comments carried over from the Studio
source. Remove --skip-browser from CLI args since the server
command doesn't support it (only the start command opens a browser).

Consume fetch response body in waitForServer to prevent undici's
per-origin connection pool from being exhausted, which caused the
server readiness check to hang even when the server was up.

Kill processes by port in stopServer as a fallback, since the actual
CLI server is a grandchild of npx and may not share the process group.

The Express server accepts TCP connections before WordPress finishes
booting in WASM, causing fetch() to hang indefinitely waiting for
response headers. Add AbortSignal.timeout(10s) to each individual
request so the retry loop can make progress.

Node.js fetch follows redirects by default. The Playground CLI server
redirects / to / for auto-login, creating an infinite redirect loop
that causes fetch to fail. Use redirect: 'manual' to see the 302
directly and treat it as server-ready.
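
The three readiness fixes described in the preceding commit messages (draining the response body, a per-request timeout, and manual redirect handling) could be combined roughly as follows. This is a sketch under assumptions, not the PR's actual `waitForServer`: the helper names `isReadyStatus` and `probeOnce`, the overall deadline, the retry interval, and the rule that any 2xx or 3xx status counts as ready are all illustrative.

```typescript
// Assumption: any success or redirect status means the server is up.
function isReadyStatus(status: number): boolean {
  return status >= 200 && status < 400;
}

// One readiness probe. Returns false instead of throwing so that the
// caller's retry loop can keep going.
async function probeOnce(url: string, timeoutMs = 10_000): Promise<boolean> {
  try {
    const response = await fetch(url, {
      // Do not follow redirects; in Node.js the 3xx response is returned as-is.
      redirect: 'manual',
      // Abort a request that never receives response headers.
      signal: AbortSignal.timeout(timeoutMs),
    });
    // Drain the body so the pooled connection is released for reuse.
    await response.arrayBuffer();
    return isReadyStatus(response.status);
  } catch {
    return false; // connection refused, timed out, or aborted
  }
}

// Retry the probe until it succeeds or the overall deadline passes.
async function waitForServer(
  url: string,
  deadlineMs = 120_000,
  intervalMs = 500
): Promise<void> {
  const deadline = Date.now() + deadlineMs;
  while (Date.now() < deadline) {
    if (await probeOnce(url)) return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Server at ${url} was not ready within ${deadlineMs} ms`);
}
```

Note that the status check relies on Node.js behavior: with `redirect: 'manual'`, browsers return an opaque redirect response whose status is 0, so the same check would not work there.
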

Failed rounds are retried up to 2x the requested round count. If not
enough successful rounds are collected, the script exits with a
non-zero status. Previously, partial failures were silently ignored
and included in the results.
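
One way to read the retry policy above is as a cap on total attempts at twice the requested round count. The sketch below follows that reading, which is an assumption; `collectRounds` and its `runRound` callback are hypothetical names rather than the benchmark's own.

```typescript
// Run rounds until `requested` of them succeed, allowing at most
// 2 * requested attempts in total. Rejects if too few rounds succeed,
// so failed rounds never end up in the results.
async function collectRounds<T>(
  requested: number,
  runRound: (attempt: number) => Promise<T>
): Promise<T[]> {
  const maxAttempts = requested * 2;
  const results: T[] = [];
  for (
    let attempt = 1;
    attempt <= maxAttempts && results.length < requested;
    attempt++
  ) {
    try {
      results.push(await runRound(attempt));
    } catch {
      // A failed round is dropped; the loop moves on to the next attempt.
    }
  }
  if (results.length < requested) {
    throw new Error(
      `Only ${results.length} of ${requested} rounds succeeded after ${maxAttempts} attempts`
    );
  }
  return results;
}
```

A caller could map the rejection to a non-zero exit status, for example by setting `process.exitCode = 1` in a `catch` handler.
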

The iframe element can be visible in the DOM before Playwright
registers it in its internal frame list. Poll page.frame() for up to
30s instead of checking once, which eliminates the intermittent
'Editor canvas frame not found' failures in headed mode.
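
The polling approach described above can be sketched as a small generic helper. `pollUntil` is a hypothetical name, the 100 ms interval is an assumption, and the Playwright call appears only in a comment because it needs a live browser; the frame name `editor-canvas` is likewise an assumption about the site editor's iframe.

```typescript
// Call `probe` repeatedly until it returns a non-null value or the
// timeout elapses. The probe always runs at least once.
async function pollUntil<T>(
  probe: () => T | null | undefined,
  timeoutMs = 30_000,
  intervalMs = 100
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = probe();
    if (value !== null && value !== undefined) return value;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Intended use with Playwright (not executed here):
//   const canvas = await pollUntil(() => page.frame({ name: 'editor-canvas' }));
```

`page.frame()` returns `null` while the frame is not yet registered, which is why a null check is enough as the polling condition.
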

Use the same node command as the unbuilt-jspi Nx target but invoke
it directly with process.execPath. This avoids cross-platform issues
with npx on Windows and removes the npx/nx startup overhead from the
server startup measurement.
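
A minimal sketch of launching a child with the current Node.js binary instead of going through npx is shown below. The actual arguments used by the unbuilt-jspi target are not reproduced here: `scriptArgs` is a placeholder, and `spawnWithCurrentNode` is a hypothetical helper name.

```typescript
import { spawn } from 'node:child_process';
import type { ChildProcess } from 'node:child_process';

// Start a child process with the same Node.js executable that runs this
// script, and record when it was spawned so startup time can be derived.
function spawnWithCurrentNode(scriptArgs: string[]): {
  child: ChildProcess;
  startedAt: number;
} {
  const startedAt = performance.now();
  // process.execPath is the absolute path of the running node binary,
  // so no shell, npx, or PATH lookup is involved.
  const child = spawn(process.execPath, scriptArgs, {
    stdio: ['ignore', 'pipe', 'inherit'],
  });
  return { child, startedAt };
}
```

Because the child is spawned directly, it is also the process that owns the server, which makes stopping it simpler than when it is a grandchild of npx.
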

brandonpayton marked this pull request as ready for review March 18, 2026 00:33
brandonpayton requested review from a team, Copilot and mho22 March 18, 2026 00:33

Copilot AI left a comment

Pull request overview

Adds a Playwright-driven performance benchmark workflow for the Playground CLI that measures site editor interaction timings and persists results as JSON artifacts.

Changes:

  • Introduces a new nx perf playground-cli target and a root npm script to run the benchmark.
  • Adds a Playwright measurement harness for 5 site editor interaction metrics, plus server startup timing.
  • Adds a “with plugins” blueprint and artifacts output ignore rules for repeatable runs.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| packages/playground/cli/project.json | Adds an Nx perf target to run the benchmark entrypoint. |
| packages/playground/cli/perf/benchmark.ts | Implements server spawn/teardown, rounds/median aggregation, and JSON/table reporting. |
| packages/playground/cli/perf/measure-site-editor.ts | Implements Playwright steps for site editor navigation and metric collection. |
| packages/playground/cli/perf/plugins-blueprint.json | Adds a plugin-heavy blueprint variant for benchmarking. |
| packages/playground/cli/perf/artifacts/.gitignore | Ensures generated benchmark artifacts aren’t committed. |
| packages/playground/cli/.eslintrc.json | Disables no-console for perf scripts. |
| package.json | Adds a convenience npm script to run the benchmark. |

brandonpayton and others added 5 commits March 17, 2026 20:45
Instead of specifying a port and polling for server readiness, parse
the server URL from the CLI's 'Ready! WordPress is running on ...'
output. This avoids port conflicts and removes the killProcessesOnPort
cleanup that could kill unrelated processes.
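
Parsing the readiness line from the CLI's output, as the commit message above describes, might look like the sketch below. The exact text after "running on" is not shown in the source, so the pattern assumes an http(s) URL follows and that the line ends with a newline; `parseServerUrl` and `createUrlScanner` are hypothetical names.

```typescript
// Assumption: the ready line contains an http(s) URL right after
// "running on", terminated by whitespace.
const READY_PATTERN = /Ready! WordPress is running on (https?:\/\/\S+)/;

// Extract the server URL from a block of CLI output, or null if absent.
function parseServerUrl(output: string): string | null {
  const match = READY_PATTERN.exec(output);
  return match?.[1] ?? null;
}

// stdout arrives in arbitrary chunks, so buffer it and only scan complete
// lines; otherwise a URL split across two chunks could be cut short.
function createUrlScanner(): (chunk: string) => string | null {
  let buffered = '';
  return (chunk) => {
    buffered += chunk;
    const lastNewline = buffered.lastIndexOf('\n');
    if (lastNewline === -1) return null;
    return parseServerUrl(buffered.slice(0, lastNewline));
  };
}
```

Feeding each stdout `data` event into the scanner yields the URL as soon as the complete line has been received.
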
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

The previous code searched for any button with 'Saved' in its text,
which could match unrelated UI elements or never match at all (the
current WordPress site editor keeps the text as 'Save' and sets
aria-disabled=true when the save completes). Target the exact Save
button and wait for it to become disabled.

@brandonpayton (Member, Author) commented

This is pretty separate from everything else. It works when I run it locally, and all the existing tests pass.

Now we can use this for better evaluation of specific Windows performance improvements. If we want, we could also post performance output for every PR that touches a Playground CLI dependency.

Let's merge once the tests pass again after a small update.

brandonpayton merged commit 4f8b071 into trunk Mar 18, 2026
46 checks passed
brandonpayton deleted the cli-perf-benchmark branch March 18, 2026 02:23

@brandonpayton (Member, Author) commented

note: I meant to commit with a [CLI] commit message prefix, but committed with a CLI: prefix.
