Here is the full workflow that turns Playwright CLI + an AI coding agent into a system that can write UI tests for you. It works with Cursor, Claude Code, Codex, or any agent you have.
────────────────────────────────────────
🟢 𝐓𝐡𝐞 𝐔𝐧𝐢𝐯𝐞𝐫𝐬𝐚𝐥 𝐖𝐨𝐫𝐤𝐟𝐥𝐨𝐰
➤ Step 1. Explore the test case with Playwright CLI
You give the agent a real test case with the steps and validations:
∙ What user flow to cover
∙ What URL to start from
∙ What success looks like
∙ What data or credentials to use (or where to find them in the repo)
The agent uses Playwright CLI to walk through that flow in the browser:
∙ `open` the starting page
∙ `snapshot` to read what is on screen
∙ `click`, `fill`, and navigate step by step
∙ `snapshot` again after each meaningful action
The agent explores the app the same way a human tester would, but faster and with structured output: it writes its findings to a file.
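As a rough sketch of that walk-through, the snippet below builds the command sequence an agent could issue for a login flow. The `playwright-cli` binary name, the staging URL, the sample values, and the element refs (`e12`, `e15`, `e21`) are assumptions for illustration. Real refs come from the `snapshot` output, and the exact argument syntax may differ in your CLI version. Nothing is executed here; the commands are only printed.

```python
import shlex

# Hypothetical starting point: replace with your own URL.
START_URL = "https://staging.example.com/login"


def exploration_commands(url, steps):
    """Build an open -> snapshot -> (action -> snapshot)* command list.

    `steps` is a list of argument lists such as ["click", "e21"].
    Assumes the CLI is invoked as `playwright-cli <command> [args...]`.
    """
    commands = [["playwright-cli", "open", url], ["playwright-cli", "snapshot"]]
    for step in steps:
        commands.append(["playwright-cli", *step])
        # Re-read the page after each action so the next step uses fresh refs.
        commands.append(["playwright-cli", "snapshot"])
    return commands


login_steps = [
    ["fill", "e12", "user@example.com"],     # hypothetical ref: email field
    ["fill", "e15", "not-a-real-password"],  # hypothetical ref: password field
    ["click", "e21"],                        # hypothetical ref: submit button
]

for argv in exploration_commands(START_URL, login_steps):
    print(shlex.join(argv))
```

In practice the agent decides each next command from the previous snapshot, so a fixed list like this is only a way to picture the rhythm: act, snapshot, act, snapshot.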
➤ Step 2. You review the exploration document
When exploration finishes, the agent should produce a short exploration document that you can read and review.
It should include:
∙ Pages visited and the order of steps
∙ Locators or element refs that worked
∙ Form fields, buttons, and links involved
∙ Assertions the agent observed (visible text, URL changes, success messages)
∙ Anything ambiguous or blocked (login wall, captcha, missing test data)
𝐘𝐨𝐮 𝐫𝐞𝐯𝐢𝐞𝐰 𝐭𝐡𝐢𝐬 𝐛𝐞𝐟𝐨𝐫𝐞 𝐚𝐧𝐲 𝐭𝐞𝐬𝐭 𝐜𝐨𝐝𝐞 𝐢𝐬 𝐰𝐫𝐢𝐭𝐭𝐞𝐧.
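One possible shape for that document, sketched as a small data structure that renders to Markdown. The class name, section titles, and sample values are all hypothetical; the point is only that each bullet above maps to a section a reviewer can scan, and that an empty section is shown explicitly instead of being silently dropped.

```python
from dataclasses import dataclass, field


@dataclass
class ExplorationDoc:
    """Hypothetical container for what the agent found during exploration."""
    flow: str
    steps: list = field(default_factory=list)       # pages visited, in order
    locators: list = field(default_factory=list)    # locators or refs that worked
    elements: list = field(default_factory=list)    # fields, buttons, links
    assertions: list = field(default_factory=list)  # what the agent observed
    blockers: list = field(default_factory=list)    # ambiguous or blocked items

    def to_markdown(self):
        lines = [f"# Exploration: {self.flow}"]
        sections = [
            ("Pages and steps", self.steps),
            ("Locators that worked", self.locators),
            ("Elements involved", self.elements),
            ("Observed assertions", self.assertions),
            ("Ambiguous or blocked", self.blockers),
        ]
        for title, items in sections:
            lines.append("")
            lines.append(f"## {title}")
            if items:
                lines.extend(f"- {item}" for item in items)
            else:
                # Keep empty sections visible so the reviewer sees nothing was found.
                lines.append("- None")
        return "\n".join(lines) + "\n"


doc = ExplorationDoc(
    flow="Login",
    steps=["/login", "/dashboard"],
    locators=["getByLabel('Email')", "getByRole('button', name='Sign in')"],
    elements=["Email field", "Password field", "Sign in button"],
    assertions=["URL changes to /dashboard", "Header shows the user's name"],
)
print(doc.to_markdown())
```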
➤ Step 3. The agent generates UI tests
Now the agent writes code. It uses two inputs:
1. The exploration document from Step 2
2. Your existing test framework: folder structure, page objects, fixtures, naming conventions, helper methods
Playwright CLI does not replace your framework. It feeds facts into it.
Same workflow whether you use Playwright Test, Selenium, Cypress, or something else. The exploration layer is shared. The test code layer is yours.
➤ Step 4. The agent runs the tests
The agent runs the new test (or the relevant suite).
If it passes → done.
If it fails → the agent goes back to Playwright CLI:
∙ Open the same page in headed mode
∙ Reproduce the failing step
∙ Snapshot the current state
∙ Compare what the test expected vs what is actually on screen
Then it updates the test and runs again.
This loop repeats until the test passes reliably or until the agent reports a real blocker it cannot resolve alone (missing test data, environment issue, product bug).
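The run, debug, fix cycle can be sketched as a small control loop. This is a minimal illustration, not how any particular agent implements it: `run_test` and `debug_and_fix` are hypothetical callables standing in for "run the test" and "reproduce with Playwright CLI, then update the test". Two assumptions are added here that the workflow itself does not prescribe: "passes reliably" is read as a number of consecutive passes, and there is a cap on fix attempts so the loop always ends.

```python
def run_until_stable(run_test, debug_and_fix, max_fixes=3, required_passes=2):
    """Run a test until it passes reliably or a blocker is reported.

    run_test()      -> True on pass, False on fail
    debug_and_fix() -> None if the test was updated,
                       or a string describing a blocker the agent cannot resolve
    """
    fixes = 0
    passes = 0
    while passes < required_passes:
        if run_test():
            passes += 1
            continue
        passes = 0  # a failure resets the streak of consecutive passes
        if fixes == max_fixes:
            return "blocked: fix budget exhausted"
        blocker = debug_and_fix()
        if blocker is not None:
            return f"blocked: {blocker}"
        fixes += 1
    return "passed"


# Simulated outcomes: the first run fails, then the fixed test passes twice.
outcomes = iter([False, True, True])
print(run_until_stable(lambda: next(outcomes), lambda: None))
```

Returning a blocker message instead of retrying forever mirrors the exit condition above: missing test data, an environment issue, or a product bug should go back to a human.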
────────────────────────────────────────
🔄 𝐇𝐨𝐰 𝐭𝐡𝐞 𝐋𝐨𝐨𝐩 𝐋𝐨𝐨𝐤𝐬 𝐢𝐧 𝐏𝐫𝐚𝐜𝐭𝐢𝐜𝐞
```
Test case (you)
↓
Playwright CLI exploration (agent)
↓
Exploration doc (agent writes → you review)
↓
UI test code in your framework (agent)
↓
Run test (agent)
↓
Pass? → Done
Fail? → Playwright CLI debug → fix test → run again
```
Do not ask the agent to explore, write, run, and debug ten flows in a single prompt.
Break the work up.
One flow at a time.
────────────────────────────────────────
📝 𝐖𝐡𝐚𝐭 𝐘𝐨𝐮 𝐒𝐡𝐨𝐮𝐥𝐝 𝐆𝐢𝐯𝐞 𝐭𝐡𝐞 𝐀𝐠𝐞𝐧𝐭 𝐔𝐩 𝐅𝐫𝐨𝐧𝐭
The quality of the output depends on what you put in at Step 1.
Weak input:
```
write a login test
```
Strong input:
```
Explore the login flow for our staging app.
Use credentials <X> and <Y>
Flow to verify:
1. Valid user can log in
2. After login, user lands on the dashboard
3. Dashboard shows the user's name in the header
Use Playwright CLI to explore the flow first.
Save findings to docs/exploration/login-flow.md before writing any test code.
```
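If you write many of these prompts, a small template can keep them consistent. The sketch below is one way to do it; the `FlowSpec` class and its field names are hypothetical, and the wording simply mirrors the strong input above. It builds one prompt per flow, which also fits the "one flow at a time" rule.

```python
from dataclasses import dataclass


@dataclass
class FlowSpec:
    """Hypothetical description of a single flow to explore."""
    name: str
    environment: str
    credentials: str
    checks: list
    findings_path: str

    def to_prompt(self):
        lines = [
            f"Explore the {self.name} flow for our {self.environment} app.",
            f"Use credentials {self.credentials}",
            "Flow to verify:",
        ]
        # Number the checks so the agent can report on each one.
        lines += [f"{i}. {check}" for i, check in enumerate(self.checks, start=1)]
        lines += [
            "Use Playwright CLI to explore the flow first.",
            f"Save findings to {self.findings_path} before writing any test code.",
        ]
        return "\n".join(lines)


login = FlowSpec(
    name="login",
    environment="staging",
    credentials="<X> and <Y>",
    checks=[
        "Valid user can log in",
        "After login, user lands on the dashboard",
        "Dashboard shows the user's name in the header",
    ],
    findings_path="docs/exploration/login-flow.md",
)
print(login.to_prompt())
```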