Heat testing for new product launches

Why heat testing produces better data than traditional research at every stage of a product launch.


The Prototype Trap: why companies wait too long to validate

A common misconception we see at Spark No. 9 is that you need a fleshed-out product before validating market demand. Validation can begin at the concept stage, before significant development investment has been made. That's a valuable time to do it, and it can actively guide product development.

At the concept stage, you're not testing whether your product is good. You're testing the idea itself — whether it resonates with a real audience, whether your framing lands, and whether the people you have in mind actually respond when something is put in front of them. Those are questions you can answer long before you have an MVP, and the data you get back is decision-grade evidence, not directional opinion.

If you're waiting until you're close to launch to find out whether anyone wants what you've built, you've waited too long.

Validation is not a one-time event

The second mistake is treating validation as a pre-launch checkbox, something you do once and move on from. In categories where trend cycles are short and competition moves fast, a positioning strategy that tested well months before launch can be stale by the time you execute it.

The companies that navigate this well treat validation as an ongoing practice rather than a project with a finish line. They test before launch to establish a foundation. They test after launch to find pockets of demand they haven't reached yet. And they test in between when something isn't working and they're not sure why.

Why survey and focus group data fails to predict real behavior

There is a structural reason traditional research methods underperform: they ask people to predict their own behavior. That prediction happens in an artificial setting (a survey form, a focus group room, a moderated interview), far removed from the conditions under which people actually make decisions.

This is the Say vs. Do Gap: the difference between what people say they'll do in a research setting and what they actually do in the real world. The further your validation environment is from real-world conditions, the wider that gap becomes, and the less reliable your data is as a predictor of launch performance.

Behavioral validation closes that gap by meeting people where they already are. When someone scrolling their LinkedIn or Instagram feed encounters an ad and clicks on it, they're not in a research setting. They're going about their day, making a real choice with no moderator in the room and no awareness that they're part of a test. That's what makes the data reliable.

Test attributes, not just audiences: how to find customers you weren't looking for

The most useful reframe in validation is shifting from "who is my customer" to "what does my product actually offer, and who might value that for reasons I haven't considered." Breaking a product down to its individual attributes (its function, its form, its packaging, its positioning) opens up use cases and audiences that a fixed target persona would never surface.

We tested different packaging colors for a supplement brand: orange, blue, and a red and pink combination. The assumption going in was that orange would be the safest option, resonating broadly with men and women. The data showed that red and pink generated the strongest response among men, driving significantly higher click-through rates and sign-ups from male buyers than any other combination tested.

That's not a decision the client would have made on instinct. The assumption was that red and pink would alienate male buyers, and without data to challenge that, it never would have made it into the launch strategy.

Broad testing produces a structured exploration of your product's dimensions and an honest look at who actually responds when those dimensions are put in front of real people. Your assumed audience is a starting point, not the whole picture, and the rest of the picture tends to show up in places you weren't looking.

A note on heat testing

Heat testing is a structured methodology, developed by Spark No. 9 and published in Harvard Business Review, that runs a matrix of real campaigns across platforms including Meta, LinkedIn, and TikTok, pairing audience segments against messaging variations with equal budget allocated to each pair for a true apples-to-apples comparison.

The resulting heat map shows where demand is strongest, which audiences responded, and what language moved them. Because it puts real messages in front of real people in real environments, it produces behavioral data you can base decisions on — not stated preference collected in a room.
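To make the matrix idea concrete, here is a minimal sketch in Python of the structure described above: audience segments crossed with messaging variations, equal budget per pair, and a "heat map" of response rates per cell. All segment names, messages, and numbers are hypothetical placeholders, not Spark No. 9's actual tooling or data.

```python
from itertools import product

# Hypothetical audience segments and messaging variations (illustrative only).
audiences = ["founders", "ops leaders", "marketers"]
messages = ["save time", "cut costs", "grow revenue"]

# Equal budget per (audience, message) pair keeps the comparison apples-to-apples.
total_budget = 900.0
pairs = list(product(audiences, messages))
budget_per_pair = total_budget / len(pairs)  # 100.0 per cell here

# Hypothetical campaign results: (clicks, impressions) per pair.
results = {
    ("founders", "save time"): (42, 1400),
    ("founders", "cut costs"): (18, 1350),
    ("founders", "grow revenue"): (25, 1500),
    ("ops leaders", "save time"): (30, 1200),
    ("ops leaders", "cut costs"): (55, 1300),
    ("ops leaders", "grow revenue"): (12, 1250),
    ("marketers", "save time"): (20, 1100),
    ("marketers", "cut costs"): (22, 1150),
    ("marketers", "grow revenue"): (60, 1450),
}

# The "heat map": click-through rate for each cell of the matrix.
heat_map = {pair: clicks / impressions for pair, (clicks, impressions) in results.items()}

# The hottest cell shows which audience responded to which message.
best_pair = max(heat_map, key=heat_map.get)
print(best_pair, round(heat_map[best_pair], 3))
```

Because every cell gets the same spend, differences in click-through rate reflect differences in demand rather than differences in budget.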

A full explanation of the methodology is on the Spark No. 9 heat testing framework page.