I'm being asked for metrics and I'm not sure which ones matter.

By Will Rowan, Editor · Updated 29 June 2026 · What’s Next

01 Position

“I'm being asked for metrics and I'm not sure which ones matter.”

The feeling: Swamped.

A leadership Playbook film: where you stand, the Play to choose, the tools in sequence, and the leaders who made the same call. Captions available.

If that’s where you are right now, this is the Playbook built for exactly that moment.

“Which metrics?” is one of 40+ What’s Next? Playbooks, for leaders facing a specific, real situation. In under fifteen minutes it helps you recognise what’s actually going on, then gives you a clear way through: the Play to choose, the Plan in concrete moves, the Precedents of people who faced it before, and your next move.

Frameworks you’ll see put to work on this exact decision, applied, not taught in the abstract:

North Star Metric
Innovation Metrics
Pirate Metrics

You’ll also see how it played out in the real world: Bobby Ghoshal at Flud, Bay Area (2013), and Sarah Bryar at Rivet & Sway, Seattle (2014). Real precedents, not platitudes.

It leaves you with one question to carry into your next conversation: “If your top KPI moved 20% next month, would you actually do anything different, or have you been tracking it for so long that the number has become decorative?”

Part of the Measurement & Review collection: Playbooks for when the metrics are unclear, the retro repeats, or the dashboard doesn’t match reality.

Related Playbooks: One-page strategy · Customer not buying · Project slipping

Transcript

What to do when you are asked for metrics and not sure which ones matter

The Bay Area, two thousand and thirteen. Bobby Ghoshal is running Flud — a beautifully-designed iOS social newsreader, launched into a market that is about to be dominated by a handful of much larger players.

The team is talented. The design is widely praised. The press coverage is exceptional. TechCrunch writes about Flud. The Verge writes about Flud. They win design awards. Every external signal that startup teams reach for as evidence of progress is in their column.

In Ghoshal's later post-mortem, he names the specific mistake the team made with their attention.

The press coverage was easy to measure and felt like progress. Every new article was a discrete event the team could celebrate. So they optimised for it. They timed their releases to maximise tech-press hype. They built features that would generate quotable lines. They invested in PR relationships and press-kit design.

The actual product question — what their users actually wanted from a social newsreader, and how those users behaved with the features Flud was shipping — was harder to measure. So they didn't measure it.

By the time they noticed that their users wanted influence features — the ability to be read by people who mattered, not just to read what people who mattered were sharing — Pulse had already shipped the influence features that captured the market, and Flud was a beautifully designed app that nobody used.

The metrics that matter are usually the boring ones. The metrics that get tracked are usually the ones that feel like progress. The trap is that those two sets of metrics often have nothing to do with each other.

Why the boring metric beats the one that feels like progress

When you have been asked for metrics and you are not sure which ones matter, the question is rarely which dashboard to build. It is this: which behaviour, in the user's actual day, would your business need to see for the metric to be honest evidence that the work is working?

So let's go to the office and work through it.

Start by reading what leadership is really asking for

"I'm being asked for metrics and I'm not sure which ones matter."

The feeling is swamped.

You have a dashboard. The dashboard has twenty numbers on it. Leadership wants three numbers. The conversation about which three keeps stalling because nobody can agree which of the twenty are decorative and which are actually telling you whether the work is working.

Two choices. Same metric question. Different underlying intent.

When they want proof of value

Choice one: leadership wants proof of value. The question underneath the question is: are customers actually getting something from what we're doing?

If that's the read, delete every metric on your dashboard except the one that measures real customer behaviour. Not visits. Not sign-ups. Not dwell time. The specific behaviour that defines success for a user who is getting value, and nothing else.

Sean Ellis, formalising the North Star metric discipline, has been clear about the test. The exercise is painful because most teams have twenty metrics and cannot tell you which one matters most when asked. The deletion forces the answer. If you cannot identify the single behavioural measure that counts, you don't have a metrics problem — you have a value problem, and the dashboard has been hiding it.

When they want proof of activity

Choice two: leadership wants proof of activity. The question is whether your exploratory work is producing anything, and they are reaching for revenue and ROI because those are the numbers they know how to read.

If that's the read, refuse to give them those numbers. Give them a failure rate instead. Here's how many experiments we ran. Here's how many of them failed. Here's what we learned from the failures.

Amy Edmondson, on what she calls intelligent failures, names the principle. A team doing genuine exploratory work should have a high failure rate, and the number being high is the evidence that the work is real rather than theatre. Handing leadership a revenue number for exploratory work trains them to ask for it next time, which is exactly the dynamic that kills innovation budgets over time.

Proof of value, or proof of activity. Same metric request. Two different first moves.

How to design metrics from outcomes down

Three tools. The discipline is to build the measurement stack from the top down, not the bottom up.

Pick the one number that tracks real value

The first is the North Star Metric.

The North Star Metric was popularised by Sean Ellis through his growth-marketing writing of the early twenty-tens. Ellis is the founder of GrowthHackers, coined the term growth hacking, and ran the early growth teams at Dropbox and other companies. The deeper lineage runs through the single-most-important-metric tradition in product analytics and the lean-startup writing of Eric Ries on the one metric that matters.

The reason the North Star Metric matters when leadership asks for metrics is that the team can give them one number that genuinely tracks whether the product is creating value, instead of twenty numbers that each tell part of a story.

The unique insight is the singular focus. The North Star isn't the only number tracked. It is the one the whole team aligns to, with everything else being diagnostic rather than definitional. The diagnostic metrics still get watched; they just don't compete for attention with the headline metric.

What you get is a single behavioural number that, if it moves, tells you whether the user is genuinely getting value. Not visits. Not sign-ups. The action that defines success for a user.

So. How to pick it.

Define. What is the moment in the user's day when your product has actually delivered? Spotify: songs played. Airbnb: nights booked. Slack: messages sent in active workspaces.

Test. Does the metric increase only when real value is being delivered? A dashboard metric that goes up when users sign up and never come back is not the North Star. A metric that goes up when users do the thing your product was built to support is.

Single. The North Star is shared across the whole team. Not departmental. Not per-feature. One number, one direction.

Diagnostic. The other metrics on the dashboard explain why the North Star moved. Acquisition, activation, retention, referral, revenue — diagnostic, not definitional.

Re-pick. As the product matures, the North Star may need to evolve. The discipline is reviewing it explicitly rather than letting the original metric drift into irrelevance.
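The Test step above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the event log, the user ids, and the event names `sign_up` and `message_sent` are all invented, and the choice of messages sent as the value action simply borrows the Slack example.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, week, event_name). All values are invented.
EVENTS = [
    ("u1", 1, "sign_up"), ("u1", 1, "message_sent"), ("u1", 2, "message_sent"),
    ("u2", 1, "sign_up"),
    ("u3", 2, "sign_up"), ("u3", 2, "message_sent"),
    ("u4", 2, "sign_up"),
]


def weekly_count(events, event_name):
    """Count how many times one named event occurs in each week."""
    counts = defaultdict(int)
    for _user, week, name in events:
        if name == event_name:
            counts[week] += 1
    return dict(counts)


# Candidate that fails the test: it rises whether or not users come back.
sign_ups = weekly_count(EVENTS, "sign_up")

# Candidate that passes the test: it rises only when the value action happens.
messages = weekly_count(EVENTS, "message_sent")

print("sign-ups per week:", sign_ups)
print("messages per week:", messages)
```

In this invented data, sign-ups are flat week to week while messages sent rise, and two of the four users who signed up never sent a message. That gap is what the Test step is meant to expose.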

See which stage of the funnel is leaking

The second is Pirate Metrics.

We unpacked Pirate Metrics in full at scenario twenty-three — AARRR, the funnel-stage diagnostic from Dave McClure's two thousand and seven startup-metrics presentation. Acquisition, Activation, Retention, Referral, Revenue. Five letters, five sequential stages a user passes through.

The reason Pirate Metrics matter here, after the North Star is set, is that they answer the question: which stage of the funnel is leaking? The North Star tells you whether the value moment is happening. Pirate Metrics tell you where in the user's path that value moment is being lost.

A team with a North Star of messages sent in active workspaces and a Pirate Metrics map showing the leak at activation knows two things: the team is creating the right kind of value, and the path to that value is breaking before users can get there. The two metrics together produce a structural diagnostic the North Star alone could not.

The integration is the move. The North Star says what value. The funnel says where to act. Both are needed for the team to operate against signal rather than against opinion.
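One way to make the leak question concrete is to compute the conversion between each pair of adjacent AARRR stages and look for the weakest step. The sketch below assumes you can count users reaching each stage. The counts are invented, and picking the lowest conversion rate is a simplifying assumption, since a healthy rate differs from stage to stage.

```python
# The five AARRR stages, in the order a user passes through them.
STAGES = ["acquisition", "activation", "retention", "referral", "revenue"]


def stage_conversion(counts):
    """Return (from_stage, to_stage, rate) for each adjacent pair of stages."""
    rates = []
    for earlier, later in zip(STAGES, STAGES[1:]):
        rate = counts[later] / counts[earlier] if counts[earlier] else 0.0
        rates.append((earlier, later, rate))
    return rates


def biggest_leak(counts):
    """Return the stage transition with the lowest conversion rate."""
    return min(stage_conversion(counts), key=lambda step: step[2])


# Hypothetical number of users reaching each stage in one period.
counts = {
    "acquisition": 1000,
    "activation": 150,
    "retention": 90,
    "referral": 30,
    "revenue": 20,
}

for earlier, later, rate in stage_conversion(counts):
    print(f"{earlier} -> {later}: {rate:.0%}")

leak = biggest_leak(counts)
print("weakest step:", leak[0], "->", leak[1])
```

With these made-up numbers the weakest step is acquisition to activation, which matches the example of a team whose path to value breaks before users can reach it.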

Measure the inputs when the work is experimental

The third is Innovation Metrics.

Innovation Metrics as a structured discipline traces through the learning-and-experimentation tradition formalised by Eric Ries — we covered Hypothesis-Driven Development at scenario twenty-nine — and the innovation accounting writing in The Lean Startup, two thousand and eleven. The deeper roots run through the corporate-innovation-management literature of the two-thousands — Geoffrey Moore, Henry Chesbrough — on how to measure exploratory work without forcing it into operational-product accounting.

The reason Innovation Metrics matter when the work being measured is experimental rather than operational is that lagging revenue indicators only make sense for a mature product. For exploratory work, revenue is too late and too noisy to be useful. Inputs and activities are what matter — experiments run, hypotheses tested, assumptions invalidated, learning captured.

The unique insight is the input-output split. Operational metrics measure output — revenue, retention, NPS. Innovation metrics measure input — experiments-per-quarter, hypotheses-tested, time-to-validated-learning. The team running an innovation programme should be tracked on innovation metrics; the team running a mature product should be tracked on output metrics. Forcing the wrong measure on the wrong team produces theatre.

What you get when you measure innovation properly is honest visibility into whether the exploratory work is actually exploratory — running real experiments, generating real learning, killing real ideas — or whether it has quietly become operational under the surface.

So. How to pick them.

Experiments. Across the innovation portfolio, how many experiments have started? An innovation programme without active experiments has stopped innovating.

Falsified. Of the experiments that completed, how many produced "no, the assumption was wrong"? A team that never falsifies its own hypotheses is selecting against challenging tests.

Time. From hypothesis written to result read, how long does the cycle take? Shortening the time-to-validated-learning is what makes innovation programmes scale.

Pivots. At each gate of the innovation funnel (covered in scenario thirty-six), how many ideas pivoted, how many persisted, and how many were killed? The aggregate is the diagnostic for whether the funnel is doing its work.

Discipline. Don't track revenue on early-stage ideas. The temptation is enormous. The cost of giving in to it is that the next stage gets killed before the experiment is complete. Hold the input metrics until the idea has graduated to operational scale.
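The four counts above can be pulled from a simple experiment log. The sketch below is one possible shape, under stated assumptions: the `Experiment` record, its field names, and every experiment listed are hypothetical. It also uses the median cycle time as the time-to-validated-learning figure, which is a choice made here, not one the playbook specifies.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass
class Experiment:
    """One row in a hypothetical experiment log."""
    name: str
    started: date
    finished: Optional[date]    # None while the experiment is still running
    falsified: Optional[bool]   # None until a result has been read
    decision: Optional[str]     # "pivot", "persist", "kill", or None


def innovation_metrics(experiments):
    """Summarise input metrics: volume, falsification, cycle time, decisions."""
    done = [e for e in experiments if e.finished is not None]
    cycle_days = [(e.finished - e.started).days for e in done]
    decisions = {
        d: sum(1 for e in done if e.decision == d)
        for d in ("pivot", "persist", "kill")
    }
    return {
        "started": len(experiments),
        "completed": len(done),
        "falsified_rate": sum(1 for e in done if e.falsified) / len(done) if done else None,
        "median_cycle_days": median(cycle_days) if cycle_days else None,
        "decisions": decisions,
    }


# Invented experiments, for illustration only.
EXPERIMENTS = [
    Experiment("pricing-page", date(2024, 1, 8), date(2024, 1, 22), True, "kill"),
    Experiment("onboarding-email", date(2024, 1, 15), date(2024, 2, 5), False, "persist"),
    Experiment("referral-prompt", date(2024, 2, 1), date(2024, 2, 11), True, "pivot"),
    Experiment("annual-plan", date(2024, 2, 20), None, None, None),
]

print(innovation_metrics(EXPERIMENTS))
```

Note what the summary leaves out: there is no revenue field, in line with the Discipline point that early-stage ideas are held to input metrics until they graduate.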

That's the toolkit. One more story before we close.

The Flud story we opened with showed the trap of measuring what is easy to measure — press, design awards, the metrics that feel like progress — while the metrics that determined whether the company would survive went unmeasured. The story we close with is closer to Flud in scale and decade: a startup CEO who knew the survival metric, watched it fail to move, and named that failure rather than burning the rest of the runway hoping for a different read.

A precedent: the dashboard that measured the wrong layer

Seattle, June two thousand and fourteen. Sarah Bryar took over as chief executive of Rivet and Sway — a Seattle-based online eyewear startup serving women — in mid two thousand and thirteen, recruited from a senior role at Amazon. By two thousand and fourteen the dashboard is healthy on every customer-experience metric the team tracks. Net Promoter Scores run consistently between ninety and ninety-five. Seventy-five per cent of website visits come from organic search, content marketing, direct return, and referrals. The reviews are good. The press is good. Customer relationships are warm.

There is one number Bryar cannot get to move, and it is the number the unit economics depend on.

Rivet and Sway's signature product is the home try-on programme — five frames shipped to the customer, four returned, one purchased, the entire round-trip absorbed by the company. The maths is unforgiving. To clear unit economics on the home-try-on operation, conversion needs to run north of forty per cent. Even at forty per cent, the cost per converted order sits at thirty-seven dollars and fifty cents before any customer-acquisition spending is added on top. The actual conversion rate runs well below that. Across the eyewear category structurally, ninety-six per cent of frames are still bought in physical stores. The channel itself is an uphill push regardless of how good the product is.
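The arithmetic behind that paragraph can be sketched as follows. The transcript gives only the thirty-seven dollars and fifty cents figure at forty per cent conversion. The round-trip cost of roughly fifteen dollars per try-on box is back-calculated from those two numbers on the assumption that cost per converted order equals round-trip cost divided by conversion rate. It is not a figure from the source, and the lower conversion rates are hypothetical.

```python
def cost_per_converted_order(round_trip_cost, conversion_rate):
    """Spread the cost of every try-on box across the boxes that convert."""
    if not 0 < conversion_rate <= 1:
        raise ValueError("conversion_rate must be in (0, 1]")
    return round_trip_cost / conversion_rate


# Figure quoted in the transcript: $37.50 per converted order at 40% conversion.
QUOTED_COST_AT_40 = 37.50

# Assumption: back-calculated round-trip cost per box (about $15), not stated in the source.
implied_round_trip = QUOTED_COST_AT_40 * 0.40

# Hypothetical conversion rates, to show how quickly the cost climbs as conversion falls.
for rate in (0.40, 0.30, 0.20):
    cost = cost_per_converted_order(implied_round_trip, rate)
    print(f"{rate:.0%} conversion: ${cost:.2f} per converted order")
```

Under that assumption, halving conversion from forty to twenty per cent doubles the cost per converted order to about seventy-five dollars, before any customer-acquisition spending.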

In June two thousand and fourteen Bryar shuts the company down. Her framing, in the GeekWire post-mortem, is specific about which dashboard had been the wrong dashboard: "We made the hard decision to close the company on a high note."

The metrics that looked good were genuinely good — at the customer-experience layer the dashboard measured. They were just not measuring the layer the business needed to survive against, and the gap between the two was the entire ledger.

When you have been asked for metrics and you are not sure which ones matter, the harder version of the question is not "which metrics are vanity?" The harder version, and the one Bryar's case carries, is "which metrics, even if they were genuine, would not save the business if they were the only thing you could see?" Flud's press was genuine but irrelevant; Bryar's NPS and retention were genuine and excellent, and equally irrelevant once the home-try-on shipping economics did not move. The metric that matters is rarely the one that looks worst on the dashboard. It is usually the one whose absence the dashboard does not flag.

So. Your Next Move from this playbook.

If your top KPI moved twenty per cent next month, would you actually do anything different — or have you been tracking it for so long that the number has become decorative?

What’s inside All 40 Playbooks

Position
The situation in a sentence, and the feeling underneath it. Free to read.
A choice of two Plays
Two behavioural Plays. Each positions you differently for the next conversation. You choose.
A Plan of tools
Tools from the Toolbox, in order, each ending in Your Next Move — one concrete instruction.
Precedents
Leaders who stood here. We show whose play worked, half-worked, and shouldn’t have been attempted.

“The list was never the hard part. Standing behind the cut, in the next three conversations, is.”

The close

Sources & further reading 3 Positions, 4 Plays, 3 Plans, and 2 Precedents.


Questions, answered

How does a Playbook work?

A Playbook names your Position, hands you two Plays to choose between, then turns your choice into a Plan — a sequence of tools, each ending with a single concrete move. It closes on Your Next Move: the one thing to do before the day ends.

How long is a Playbook?

About twelve minutes. Short enough to watch in the gap before the meeting it’s made for.

What’s the difference between this and asking AI?

A chatbot gives you an answer. A Playbook gives you a Position, a chosen Play, a Plan, and Precedent — the structure of a decision, not a paragraph of advice. You open the situation you’re in rather than describing it from scratch.

Do I need to watch them in order?

No. Each Playbook stands alone. You open the one that matches the situation in front of you — there’s no sequence to follow and nothing to complete first.

What is Your Next Move?

The single concrete move you leave with — a question to take back into the room and answer there. Every tool in a Plan ends with one. It’s the answer to the question the brand name asks.

Next on the shelf

“I need to explain why we missed the target.”

“The dashboard says we're fine but I don't believe it.”

“The retrospective keeps producing the same three actions and nothing changes.”