How to test prototypes with agentic AI

This week we ran a meetup that was secretly a dress rehearsal. Next week we're in Lisbon at Productized, and we wanted to pressure-test the riskiest part of the session with peers in the room first. The riskiest part isn't building. It's the loop that puts a product idea in front of real people and tells you whether it lands. That loop normally eats a week. We run it in an afternoon. Here's how, and where the agent actually does the work.

It starts with what's worth testing. Before anyone builds, you need to know what you're testing and why. We point Claude at a real business (Vinted, in our case) and let it map the business model end to end. It documents the status quo, lays the canvas out on a Miro board so we can see the whole thing, and surfaces the assumptions that would hurt most if they turned out to be wrong. We take the riskiest one as the hypothesis. That order matters: David Bland's point is to decide what you're testing before you worry about how. The agent does the reading and the mapping. We decide what's worth betting on.

The Vinted business model and a Strategyzer Test Card, mapped together on a Miro board

The prototype and the Test Card grow together. Working with an agent, these two stop being separate steps. You kick-start the prototype straight from the hypothesis in Lovable, where the idea takes shape as you prompt it. Alongside it you fill in the Test Card: what you want to learn, and what would tell you you're onto something. That signal is a direction, not a "7 of 10 users" threshold. The one fixed part is how you'll test, which here is always an unmoderated user test; the prototype idea fills in around it. Lovable gets close enough to a customer's brand that testers forget it's a mockup, and one small move makes it testable: a recording snippet pasted into the prototype's index.html streams every click back to our testing app, with nothing to install on the tester's phone.
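If it helps to see the card as data, here is a minimal sketch of how one could be written down so that both the prototype prompt and the test draw from the same source. The class name, field names, and the example hypothesis are our own illustrations, not Strategyzer's template, not PoDojo's format, and not the hypothesis from the session.

```python
from dataclasses import dataclass


# Hypothetical structure: the field names are illustrative assumptions.
@dataclass(frozen=True)
class HypothesisCard:
    hypothesis: str      # the risky assumption, stated as a belief
    learning_goal: str   # what we want to learn
    signal: str          # a direction to follow, not a numeric threshold
    method: str = "unmoderated user test"  # the one fixed part

    def prototype_brief(self) -> str:
        """Turn the card into a starting prompt for a prototyping tool."""
        return (
            f"Build a clickable prototype to test this belief: {self.hypothesis} "
            f"We want to learn: {self.learning_goal} "
            f"It will be evaluated in an {self.method}."
        )


# Invented example values, for illustration only.
card = HypothesisCard(
    hypothesis="Sellers will list more items if a price is suggested for them.",
    learning_goal="Do sellers accept a suggested price or override it?",
    signal="Testers keep the suggestion and move on without hesitating.",
)
print(card.prototype_brief())
```

Keeping the method as a default and the signal as free text mirrors the idea above: the way you test is fixed, while the signal stays a direction rather than a number.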

A Vinted-branded prototype taking shape in Lovable

Designing the test goes one of two ways, and the Test Card guides this step too. The card that shaped the prototype now tells the test what to find out. In the PoDojo app you do it by hand: open your prototype, turn the Test Card into the tester task and the questions, grab a share link and a QR code. Or you hand the Test Card to Claude. You never touch a testing-app UI to design the survey; the agent reads the card and writes the study guide, the screening, and the questions, then launches the test. If you've got the environment set up, the same agent builds the clickable prototype too. The whole loop stays one agentic flow, from business model to running test, without leaving the chat.

The unmoderated test live in the PoDojo app, with a share QR code and 'recording snippet detected'

Running it is the easy part. People open the test on a laptop, or scan the QR and run it on their phone. At the meetup, groups tested each other's prototypes in breakout rooms. The backend records each session and transcribes it on its own: audio, what's on screen, and every interaction lined up on one timeline. No recruiting pipeline, no research-ops team in the middle.
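To make "lined up on one timeline" concrete, here is a rough sketch of the idea, not the backend's actual code. It assumes each stream (speech, screen changes, clicks) arrives already sorted by seconds since session start; the Event shape and the sample values are invented for illustration.

```python
import heapq
from dataclasses import dataclass


# Hypothetical event shape; the real recording format is not shown in this post.
@dataclass(frozen=True)
class Event:
    t: float     # seconds since the session started
    kind: str    # "speech", "screen" or "click"
    detail: str


def merge_timeline(*streams):
    """Merge streams that are each already sorted by time into one timeline."""
    return list(heapq.merge(*streams, key=lambda e: e.t))


# Invented sample data.
speech = [Event(1.2, "speech", "Where do I set the price?"),
          Event(7.5, "speech", "Ah, there it is.")]
screens = [Event(0.0, "screen", "/listing/new"),
           Event(8.1, "screen", "/listing/price")]
clicks = [Event(3.4, "click", "button#photos"),
          Event(7.9, "click", "button#price")]

timeline = merge_timeline(speech, screens, clicks)
for e in timeline:
    print(f"{e.t:5.1f}s  {e.kind:<6}  {e.detail}")
```

Read top to bottom, the merged list shows what the tester said next to what they clicked, which is what makes a transcript "behavior-tracked".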

The test running on a phone: the task brief for the Vinted prototype

The spoken-feedback step of the unmoderated test on mobile

The unmoderated test confirming the session was recorded

Then Claude works through the results. It reads the behavior-tracked transcripts, tells you how many sessions came in and what's in them, and pulls out the few issues that actually showed up. Ask for a report and it writes one, with short video reels cut straight from the sessions. A finding stops being a line on a slide. People believe it more when they hear it from the tester's own mouth.
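The "few issues that actually showed up" part can be pictured as a simple count across sessions. The sketch below is a simplification, not how Claude does it: it assumes each session has already been tagged with short issue labels, and both the labels and the `min_sessions` cut-off are made up for illustration.

```python
from collections import Counter


def recurring_issues(sessions, min_sessions=2):
    """Return (issue, session_count) pairs seen in at least min_sessions sessions."""
    counts = Counter()
    for issues in sessions.values():
        counts.update(set(issues))  # count each issue once per session
    return [(issue, n) for issue, n in counts.most_common() if n >= min_sessions]


# Invented sample data: session id -> issue labels tagged in that session.
sessions = {
    "s1": ["price field hard to find", "photo upload unclear"],
    "s2": ["price field hard to find"],
    "s3": ["price field hard to find", "photo upload unclear", "wanted dark mode"],
}

print(len(sessions), "sessions came in")
for issue, n in recurring_issues(sessions):
    print(f"{n} of {len(sessions)} sessions: {issue}")
```

The one-off remark drops out and the recurring problems rise to the top. The count is only there to sort what to look at first; the clips from the sessions are still what carries the finding.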

That's the loop: frame, build, design the test, run it, synthesize. Claude runs through all of it, not just the read-out at the end. A week of coordination folded into an afternoon, with the agent absorbing the busywork so the judgment stays with us. Plenty broke along the way, which is the point. A dojo is a place to fail safely. The people who showed up didn't watch us demo a finished thing; they built the case with us.

Watch the session

If that sounds like your kind of afternoon, we'd love to have you at the next one.

FAQ

How do you test a prototype with agentic AI?

You run one loop end to end. An agent (we use Claude) maps the business model and surfaces the riskiest assumption. You shape a Test Card and a Lovable prototype around it together, then run an unmoderated user test on desktop or mobile. The same agent designs the test and reads the sessions afterwards.

What is a Test Card?

A Test Card, popularized by David Bland and Strategyzer, turns a risky assumption into a one-page hypothesis: what you believe, how you'll test it, and the signal that you're onto something. It guides both the prototype you build and the test you run. The signal is a direction to follow, not a statistical threshold.

What is an unmoderated user test?

An unmoderated user test lets participants work through a prototype on their own, with no facilitator guiding them. They open a link on a laptop or scan a QR code on their phone and complete the task. The tool records the session and transcribes it, lining up audio, screen, and clicks on one shared timeline.

How does AI synthesize user-testing results?

The agent reads the behavior-tracked transcripts instead of you rewatching every recording. It reports how many sessions came in, pulls out the few issues that actually recurred, and writes a report with short video reels cut straight from the sessions, so each finding is backed by the participant's own words.