
Owned by Matt

4 Keys To Wealth Legacy

130 members • Free

Learn the 4 Keys (Faith, Leadership, Leverage & Financial Literacy) to build lasting wealth, time freedom, and a legacy for your family.

Memberships

AI & QA Accelerator

624 members • Free

Nobility Digital AI Scaling

15 members • $597

Profit Paths

403 members • Free

The Freedom Builders Circle

1.6k members • Free

DG Community Builders

145 members • $97/m

Digital Growth Community

60.7k members • Free

Clarity-To-Cash

32 members • $1,000/m

The Creators Community

4.4k members • Free

Legacy Blueprint NOW

276 members • Free

1 contribution to AI & QA Accelerator
Mar 19 • AI&QA
AI Coding Agents for QA: Part 4 - Why the Same Model Gives Different Test Results
In Part 3 I introduced Cursor and why IDE tools beat CLI for QA automation. But before we go deeper into Cursor features, there is a bigger question worth answering.

────────────────────────────────────────
Two Engineers. Same Model. Different Results.

Engineer A asks GPT-5.4 to write a login test.
Gets back: a clean, structured test. Uses their proper fixtures. Follows their naming convention. Works on first run.

Engineer B does the same thing. Same model. Same task.
Gets back: a generic, broken test. Hardcoded credentials. No page objects. Fails immediately.

────────────────────────────────────────
🚫 Most People Blame the Model

"GPT is bad at tests."
"GPT doesn't understand Playwright."
"I need a better model."

That is the wrong diagnosis. The model is not the problem. All modern models can code really well. Three other things determine quality.

────────────────────────────────────────
⚙️ Layer 1: The Tool

As covered in Part 1, you never talk to the model directly.

► You ► Tool ► Model

The tool decides what to send to the model. What context. What files. What history.

Cursor sends your repo structure, open files, and recent edits. A chat app sends nothing.

Same model. Different tool. Completely different output.

────────────────────────────────────────
Layer 2: Repo Quality

AI agents amplify whatever already exists in your project.

Good framework? The agent writes tests that slot right in.
No page objects, no fixtures, no structure? The agent writes whatever it can. Which is usually a mess.

This is the hard truth: AI cannot rescue a bad codebase. It makes it worse, faster.

The model is only as good as what it can see. If your repo has:
∙ Clear fixture files
∙ Consistent naming
∙ Reusable page objects
∙ Good test examples

The agent pattern-matches against all of that and writes code that fits. If it sees nothing, it invents everything. Pure lottery.

────────────────────────────────────────
Layer 3: The Task Spec

"Write a login test" is not a task spec. It is a hint.
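
To make the Layer 1 point concrete, here is a minimal sketch of how a tool might assemble what the model actually receives. Everything in it is hypothetical: `ToolContext`, `build_prompt`, and the file paths are illustration-only names, and this is not how Cursor or any real product builds its requests. It only shows that the same task string can reach the model with or without repo context.

```python
from dataclasses import dataclass, field


@dataclass
class ToolContext:
    """What a hypothetical tool attaches to the user's request."""
    repo_tree: list[str] = field(default_factory=list)
    open_files: dict[str, str] = field(default_factory=dict)
    recent_edits: list[str] = field(default_factory=list)


def build_prompt(task: str, context: ToolContext) -> str:
    """Assemble the text that is sent to the model (assumed format)."""
    sections = []
    if context.repo_tree:
        sections.append("Repo structure:\n" + "\n".join(context.repo_tree))
    for path, source in context.open_files.items():
        sections.append(f"Open file {path}:\n{source}")
    if context.recent_edits:
        sections.append("Recent edits:\n" + "\n".join(context.recent_edits))
    # The task always comes last, whatever context precedes it.
    sections.append("Task: " + task)
    return "\n\n".join(sections)


task = "Write a login test"

# A plain chat app: the model sees only the task.
chat_prompt = build_prompt(task, ToolContext())

# An IDE-style tool: the model also sees fixtures and page objects.
ide_prompt = build_prompt(
    task,
    ToolContext(
        repo_tree=["tests/fixtures/auth.py", "tests/pages/login_page.py"],
        open_files={"tests/pages/login_page.py": "class LoginPage: ..."},
        recent_edits=["renamed test_signin.py to test_login.py"],
    ),
)

print(chat_prompt)
print("---")
print(ide_prompt)
```

With the empty context, the model has nothing to pattern-match against except the task itself, which is the Engineer B situation. With the populated context, the fixture and page object paths are right there in the prompt.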
1 like • Mar 19
Good stuff my friend. This makes a lot of sense and why it works that way!
Matt Robbins
@matt-robbins-tlyaw
I help dads get out of their jobs so they can be stay-at-home dads without sacrificing their income.

Active 24d ago
Joined Mar 12, 2026
Utah