I need to work out what's actually wrong before I can fix it.
The feeling: stuck.
If that’s where you are right now, this is the Playbook built for exactly that moment.
“Diagnose first” is one of 40+ What’s Next? Playbooks, for leaders facing a specific, real situation. In under fifteen minutes it helps you recognise what’s actually going on, then gives you a clear way through: the Play to choose, the Plan in concrete moves, the Precedents of people who faced it before, and your next move.
Frameworks you’ll see put to work on this exact decision, applied, not taught in the abstract:
- Design Sprints
- Design Thinking
- Cynefin Framework
You’ll also see how it played out in the real world: Stewart Butterfield at Tiny Speck, Vancouver (2013), and Rachel Tipograph at Gap, San Francisco (2012). Real precedents, not platitudes.
It leaves you with one question to carry into your next conversation: “Which problem on your roadmap is your team treating as Complicated (just throw more analysis at it) when it’s actually complex, where you have to probe and learn?”
Part of the Discovery & Understanding collection: Playbooks for when you don’t yet understand the problem, the customer, or what to build.
Transcript — read it in full
What to do when you have to diagnose a problem before you can fix it
Vancouver, late two thousand and thirteen. Stewart Butterfield is two years into running Tiny Speck, a multiplayer gaming company building a browser-based game called Glitch.
By every operational measure that matters to the business, Glitch is failing. The player base is thin. The monetisation is fragile. The retention curve is the wrong shape. The team has tried what good teams try when a game isn't working — feature additions, marketing pushes, retention experiments. None of it is moving the curve.
The diagnostic Butterfield is running, late in two thousand and thirteen, is the "why is this not working?" one. He is doing what experienced operators do at this stage. Looking at the data. Talking to the team. Reading the support tickets. Trying to assemble, from the available evidence, a coherent picture of what the game is failing to deliver to its players.
What he notices, in the process, is something the diagnostic was not aimed at.
The internal communication tool his distributed team had hacked together to coordinate Glitch's development — the thing he had been treating as ambient infrastructure, the back-of-house tooling that makes the game-building possible — is unusually good. It is the thing that is making his own team productive, even as the product they are working on is failing.
The decision that follows is not the obvious one.
Most teams who notice that the back-of-house tool is good would treat that as a happy accident. They would keep building the game. They would maybe spin out the tool as a side project. They would not kill the company they are running and rebuild it around the thing they had been building incidentally.
Butterfield does. The game is shut down. The team rebuilds the company around the internal tool. That tool becomes Slack.
The decision is only possible because Butterfield is trying to diagnose a complex problem — why is this game failing? — and notices that the answer is hiding inside the diagnostic process itself. The tool we are using to diagnose is more valuable than the thing we are diagnosing.
That is not a common move. But the general pattern — the real problem turning out to be something you were treating as context while you focused on something else — is common enough that it is worth watching for every time a complex problem refuses to submit to analysis.
So let's go to the office and work through it.
Start by deciding whether the problem is complicated or complex
"I need to work out what's actually wrong before I can fix it."
The feeling is stuck.
The team has been working on this for weeks. Every analysis has produced a plausible-looking answer that the next analysis has unseated. The problem refuses to resolve. And the longer the work continues, the harder it is to admit that the diagnostic itself may be the wrong shape for the problem.
Two choices. Same stuck team. Different fundamental shape of problem.
When the answer is hidden in the detail
Choice one: the problem is complicated. Cause and effect exist but are obscured by complexity. The answer is in the details if you can see them clearly enough. The right move is structural analysis — careful, thorough, unhurried.
If that's the read, map the failing process on a wall and force every department to physically walk up to it and point at where they think the breakdown is.
Donella Meadows, working on systems thinking, was clear that different parts of the system see different parts of the breakdown. The point isn't consensus; it's that the divergence between the fingerprints on the wall is where the real information lives. People rarely disagree about where a problem is unless they're each looking at a real piece of it, and the map of disagreement is the map of the actual problem.
When the system refuses to be analysed
Choice two: the problem is complex. Cause and effect aren't predictable. Any single diagnostic move is as likely to tell you something misleading as something true. Analysis isn't going to crack it, because the system isn't analysable in the way the team has been trying to analyse it.
If that's the read, run three small, safe-to-fail experiments in genuinely different directions rather than committing to one big analytical effort.
Dave Snowden, who developed the framework distinguishing complex from complicated, is clear that complex systems don't submit to analysis the way complicated ones do. The experiments aren't meant to solve the problem. They're meant to teach you something about how the system responds. Three cheap probes beat one expensive diagnosis when the territory is genuinely unfamiliar.
Complicated, or complex. Same stuck team. Different first move.
How to classify the problem before you act
Three tools. The discipline is to classify the domain before you pick the tools.
Sort the problem into the right kind of domain
The first is the Cynefin Framework.
Cynefin (pronounced kuh-NEV-in, a Welsh word meaning habitat) was developed by Dave Snowden through the early two-thousands while at IBM Global Services and later at the Cognitive Edge consultancy. The framework was formalised in his two thousand and seven Harvard Business Review article, "A Leader's Framework for Decision Making". The deeper roots run through complexity theory, Karl Weick's sense-making writing, and the anthro-complexity tradition Snowden has continued to develop since.
The reason Cynefin matters when you don't understand the problem is that it forces classification before action. Most teams jump straight to action without classifying — and the action they take is right for one kind of problem, wrong for another. Cynefin's discipline is classify, then act.
The unique insight is the five-domain split. Clear — cause and effect are obvious — best practice applies. Complicated — cause and effect require analysis — good practice applies. Complex — cause and effect are emergent — probe-sense-respond. Chaotic — no cause and effect — act-sense-respond. And a fifth — Disorder — meaning the team doesn't yet know which domain they're in. The most common failure in problem diagnosis is treating a complex problem as if it were a complicated one.
What you get when you classify is a different choice of next move. If your team is building bigger and bigger spreadsheets to analyse something that keeps surprising them, you're probably in complex territory using complicated-territory tools, and the framework will tell you to switch approach.
So. How to use it.
Describe. One paragraph. No jargon.
Test. Clear: would a competent stranger know what to do? Complicated: would a competent expert know what to do? Complex: do experiments produce surprises? Chaotic: is action required immediately, before you understand?
Pick. Most teams will land in Complicated or Complex. If you can't tell, you're in Disorder, and the move is to gather enough information to classify before committing further analysis.
Match. Complicated: deeper analysis, expert consultation, structured frameworks. Complex: small safe-to-fail experiments, multiple probes, sense-making rather than answer-finding. Different domains, different toolkits.
Re-classify. Problems move between domains as you learn. A complex problem can become complicated as the underlying structure becomes visible. Reclassify rather than persisting with the original domain.
The framework's main failure mode is using it once and forgetting it. The classification has to be revisited as the team's understanding deepens.
See the problem from where the user sits
The second is Design Thinking.
Design Thinking as a structured discipline was formalised by IDEO and the Stanford d.school through the late nineteen-nineties and two-thousands, with the canonical sequence — empathise, define, ideate, prototype, test — codified by David Kelley, Tim Brown, and the wider IDEO writing. The deeper lineage runs through Herbert Simon's The Sciences of the Artificial in nineteen sixty-nine, product-design tradition at IDEO's predecessor firms, and the participatory-design movement of the nineteen-seventies.
The reason Design Thinking matters when you don't understand the problem is that the empathise and define phases specifically force the team to understand the problem from the user's perspective before proposing solutions. Most stuck teams have skipped these phases and jumped straight to ideation, which is why they keep generating solutions that don't fit.
The unique insight is the user-perspective discipline. The problem has to be understood from where the user sits, not from where the team sits. The translation between the two is harder than teams expect, and the translation is what makes the difference between a defined problem and a misframed one.
What you get when you run the empathise-define discipline before ideation is a problem statement the team can actually act on. Not "the product needs more features", which is the team frame, but "users in segment X are spending two hours a week on workflow Y because our product fails at step Z", which is the user frame. The user frame produces useful solutions; the team frame produces solutions to problems the user doesn't have.
So. How to run it.
Empathise. Customer interviews, observation in their working environment, reading their support tickets in their own words. Not a focus group; not a survey. Direct contact with how they actually do the work.
Define. Write the problem from the user's perspective: "User X is trying to do Y but is blocked by Z." Test the framing against the user: do they recognise themselves in it?
Ideate. With the user-framed problem on the wall, generate three or four candidate solutions per user — not per team meeting. The deliberate over-generation is the discipline.
Prototype. Cheapest version that lets the user respond to the candidate solution. Sketches, wireframes, click-throughs.
Test. Five users, an hour each, watching them try to use the prototype. The behaviour is the data.
The framework is what stops the team from solving its own version of the problem rather than the user's version.
Reach a tested answer inside a single week
The third is the Design Sprint.
The Design Sprint was developed by Jake Knapp at Google Ventures through the early twenty-tens and codified in the two thousand and sixteen book Sprint: How to Solve Big Problems and Test New Ideas in Just Five Days. The framework compresses a normal Design Thinking cycle into a single working week — five days, locked room, key stakeholders, a validated prototype at the end. The discipline is structural rather than novel.
The reason Design Sprints matter when a problem is urgent and ambiguous and cross-functional is that the standard alternative — months of circular meetings — is the failure mode that ambiguous problems generate when left in normal office conditions. The sprint format works specifically because it prevents that failure mode.
The unique insight is the time-box plus presence. Five days is short enough that stakeholders can be locked in, long enough to do meaningful work. Locking the stakeholders in the room means the decisions don't get deferred to later meetings; the decisions get made because later meetings are not available. The format is doing the work the calendar has been failing to do.
What you get is a validated prototype at the end of a working week, against a problem that would otherwise have generated three months of inconclusive discussion.
So. How to run it.
Day one: Map. The team maps the problem space, the customer journey, and the constraints. The map is shared, on the wall, by end of day.
Day two: Sketch. Each person sketches solutions individually — no group brainstorm, no compromise. Multiple sketches per person.
Day three: Decide. The team votes on the most promising sketch. One concept moves to prototype.
Day four: Prototype. Build the cheapest version that lets a user respond. Click-through, paper prototype, wireframe.
Day five: Test. Five users, an hour each. Behaviour is the data. End of week, the team knows whether the concept works.
The framework's main failure mode is running it without proper stakeholder commitment. The sprint requires the decision-makers to be in the room for the full five days. If they're not, the sprint produces work the organisation later refuses to act on, which is worse than having not run it.
That's the toolkit. One more story before we close.
A precedent: when the real problem sits behind the symptoms
The Butterfield story we opened with showed the diagnostic moving sideways — the real problem turned out to be something the team had been treating as context. The story we close with shows the same move at a different scale — a strategy team that had been treating the wrong customer model as background rather than as the assumption the failing strategy was depending on.
San Francisco, two thousand and twelve. Rachel Tipograph joins Gap as the retailer's youngest director of global digital and social media. She is twenty-seven. The problem she is handed at the top level is legible but diffuse.
Gap's relevance to its historical core customer — the eighteen-to-thirty-four-year-old urban woman — is declining in ways the company's own metrics do not cleanly explain. Traffic is off. Stores are converting fewer visitors. The brand is losing the conversations it had once led.
The instinct inside the organisation is to diagnose by symptom. Fix the stores. Fix the marketing. Fix the product. Each fix is a discrete project with a sponsor and a budget and a quarterly review. None of them, individually, address what is actually happening.
Tipograph's first decision, described later in a two thousand and nineteen Gap Inc Newsroom retrospective and in subsequent founder interviews as she builds MikMak, is that the diagnosis has to start further back than the symptoms. Her internal framing is about narrowing down prioritisations — not fixing everything that looks broken, but identifying the small number of customer-behaviour questions whose answers the existing strategy is silently making guesses about.
The reframing she takes back to the leadership team is structural. The problem isn't Gap's execution against its understanding of the customer. The problem is that the understanding itself is out of date. The eighteen-to-thirty-four-year-old urban woman the strategy was built around is not the same person she was when the strategy was set, and continuing to optimise the strategy without revising the underlying customer model is the trap.
When you have hit a problem and you don't yet understand it well enough to solve it, the pressure is to start fixing things. The more useful move is usually to notice that the reason the problem feels insoluble is that you are fixing against the wrong model.
Tipograph's diagnostic wasn't faster than Gap's. It was slower, and it started further back. The question she was willing to spend time on — what does our customer actually do now? — was the question the fixes were all pretending had already been answered.
So. Your Next Move from this playbook.
Which problem on your roadmap is your team treating as Complicated — just throw more analysis at it — when it's actually complex, where you have to probe and learn? And how much time has analysis already wasted?
- Position
The situation in a sentence, and the feeling underneath it. Free to read.
- A choice of two Plays
Two behavioural Plays. Each positions you differently for the next conversation. You choose.
- A Plan of tools
Tools from the Toolbox, in order, each ending in Your Next Move — one concrete instruction.
- Precedents
Leaders who stood here. We show whose play worked, half-worked, and shouldn’t have been attempted.
Sources & further reading
3 Positions, 4 Plays, 3 Plans, and 2 Precedents.
Your Next Move
Questions, answered
How does a Playbook work?
A Playbook names your Position, hands you two Plays to choose between, then turns your choice into a Plan — a sequence of tools, each ending with a single concrete move. It closes on Your Next Move: the one thing to do before the day ends.
How long is a Playbook?
About twelve minutes. Short enough to watch in the gap before the meeting it’s made for.
What’s the difference between this and asking AI?
A chatbot gives you an answer. A Playbook gives you a Position, a chosen Play, a Plan, and Precedent — the structure of a decision, not a paragraph of advice. You open the situation you’re in rather than describing it from scratch.
Do I need to watch them in order?
No. Each Playbook stands alone. You open the one that matches the situation in front of you — there’s no sequence to follow and nothing to complete first.
What is Your Next Move?
The single concrete move you leave with — a question to take back into the room and answer there. Every tool in a Plan ends with one. It’s the answer to the question the brand name asks.