Quocknipucks, or, why story points make sense. Part II.

Last time, I set the stage for why Quocknipucks (OK, I mean story points), despite being the target of recent severe Agile backlash, actually do provide a sensible and workable solution to the two most difficult aspects of software team sprint and capacity planning. I elaborated on the ways that Quocknipucks story points solve these two problems, in that they:

  • Enable us to gauge the team’s overall capacity to take on work, by basing it on something other than pure gut and/or table-pounding; and
  • Enable us to fill that team capacity suitably, despite having items of different size, and, again, basing our choices on something other than pure gut.

But there’s lots more to cover. I have more observations about the role of story points, and I want to provide some caveats and recommendations for their use. And it’s also worthwhile to list the various objections that people routinely make to story points, and provide some common-sense reasons for rejecting those objections.

Provisos, bold observations, and recommendations

  • Yes, of course story points are addable — because that’s really their main use/purpose: to be collectively weighed as a total against a notion of total team capacity. You can’t do that effectively with a disparate bunch of t-shirt-sized items. I’m mystified about why we see such monumental outrage about “you can’t do math with story points”. Here’s why you can:
    • They’re a rough approximation of magnitude.
    • They’re intentionally a numerical approximation, correlated to size/effort, even though imperfect. When I’m weighing a slate of stories against an assessed total team capacity for a given time frame, of course I sum up the story points for the individual items: that’s the whole point, as long as one keeps in mind their nature as a rough approximation.
    • The very reason for choosing a numeric representation for story points is to have that kind of arithmetic make sense; otherwise we could have used colors or animals or funny icons (or whatever) instead. And it’s related to why we typically use a Fibonacci scale and don’t use a straight arithmetic progression such as 1, 2, 3, 4, 5, 6, etc. The idea is to make the assessment of different items correlate clearly (even though approximately) to discernible size jumps, not to argue incessantly over whether a given item is a 5 vs. a 6.
  • Don’t think in terms of time needed when assigning story points to an individual item. That defeats the purpose of abstracting time away. People are notoriously inaccurate when estimating in time, and when they attempt to do so, their subsequent near-automatic step is to translate time mentally to money. Using story points provides one more layer of abstraction that helps stop that. Focus on the relative size of each item, not a direct translation to time. The goal is to provide the by-item detail that will collectively help us fill a sprint to (near) capacity, not to hit any specific time target.
  • Now, this whole approach works only if story point assessments are made in good faith, which is why the oft-tweeted NFC and TFB cards (rather obnoxiously and extremely counter to the spirit of collaboration) miss the point. Somehow, the same people who insist that we need to trust developers to self-manage and always work on the right thing also argue that those developers will tend to “game” any story point assessments so that the discussion will be over more quickly or so that they’ll have less work assigned.
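To make the "yes, you can add them" point concrete, here is a minimal sketch of the arithmetic in Python. The story names, the point values, the capacity figure of 20, and the helper names `total_points` and `fits_capacity` are all hypothetical, invented purely for illustration; the only idea taken from the discussion above is that points on a Fibonacci-style scale get summed and weighed against a rough team capacity.

```python
# Hypothetical agreed scale: Fibonacci-style values, so sizes differ by
# discernible jumps rather than by hair-splitting (no 5-vs-6 arguments).
FIBONACCI_SCALE = {1, 2, 3, 5, 8, 13}


def total_points(stories):
    """Sum the story points, rejecting any value that is off the agreed scale."""
    for name, points in stories:
        if points not in FIBONACCI_SCALE:
            raise ValueError(f"{name}: {points} is not on the agreed scale")
    return sum(points for _, points in stories)


def fits_capacity(stories, capacity):
    """Rough check: does the summed slate fit within the assessed capacity?"""
    return total_points(stories) <= capacity


# Hypothetical slate of (story name, story points).
candidates = [
    ("login form", 3),
    ("export to CSV", 5),
    ("audit log", 8),
    ("fix typo", 1),
]

print(total_points(candidates))                # 17
print(fits_capacity(candidates, capacity=20))  # True
```

The answer is only as good as the approximations that went into it, which is exactly the point: it is a rough check against capacity, not a measurement.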

Attitude matters

  • In short, the team has to take it seriously. No one should expect useful estimates from a bad attitude. The thing is, it’s actually in everyone’s interest to take it seriously. People should intuitively see the benefit (for everyone, including the team itself) in right-sizing the workload for the team, and in making sure that expectations are aligned with what is feasible. In fact, my biggest objection to planning poker as an Agile ritual is that the metaphor itself is that of a game, which unfortunately evokes a win-lose mentality. Especially if poorly facilitated, planning poker “gamifies” the necessary analysis so much as to reduce the level of seriousness with which the team regards the endeavor.
  • Equally, management has to take it seriously, by understanding the notion that a capacity limit actually exists. In fact, one little-touted advantage of story points to the dev team is that they are the team’s best concrete defense against being overladen with work simply because a manager pushes hard for more, more, more. If you consider that benefit, the dev team should be among the most avid advocates of using story points to aid capacity planning.
  • Don’t overstuff the sprint to the full anticipated capacity; leave some margin. Over time, the team will get a good sense of what their capacity (in total story points) likely is, and will know how much buffer it makes sense to leave vacant. And they’ll have a good idea of how to adjust the target amount if there’s an absence (planned or unplanned) on the team during a sprint.
  • As with anything, expect things to be off a bit. Adjust constantly: inspect and adapt. Jettison a story from the sprint at the last minute if you have to, but recognize that having to do that is a sign that something went awry in your assessments that you can and should likely address in a retrospective.
  • Finally, don’t try to equate or map or compare story points across teams. They’re an approximation, specific to that team and that particular point in time. In fact, expect them to drift over time (a “2” last year may not be the same as a “2” right now, for example). They’re simply a tool, a rough assessment, not a measurement. In fact, much of the confusion and agita surrounding story points can be tied back to the false understanding that they are somehow fixed and deterministic.
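The advice above about leaving some margin and adjusting for absences can be sketched as a small calculation. Everything specific here is an assumption made for illustration: the function name `sprint_target`, the 15% buffer, the five-person team, the ten-day sprint, and the simple proportional scaling for absences. A real team would settle on its own numbers through experience.

```python
import math


def sprint_target(typical_capacity, buffer_fraction=0.15,
                  team_size=5, sprint_days=10, person_days_absent=0):
    """Rough story point target for one sprint.

    Scales the team's typical capacity by the share of person-days actually
    available, then holds back a buffer. Rounds down, since this is a rough
    planning aid and erring low leaves margin.
    """
    total_person_days = team_size * sprint_days
    availability = (total_person_days - person_days_absent) / total_person_days
    return math.floor(typical_capacity * availability * (1 - buffer_fraction))


# Hypothetical team that typically completes about 30 points per sprint.
print(sprint_target(30))                        # 25
print(sprint_target(30, person_days_absent=5))  # 22
```

The proportional scaling is a deliberate simplification. Losing the one person who knows a critical subsystem can cost far more than that person's share of the days, which is one more reason to inspect and adapt rather than trust the formula.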

Answering the objections to story points

What are the arguments that get raised against story points? Pretty much the same as the shots taken against estimates in general, but lately even more virulent:

  • Story points are viewed as “arbitrary”
  • They’re hard to explain to end users
  • Users want to translate them into time
  • They’re too hard to determine for each story
  • Sometimes they turn out to be wrong
  • People determining them sometimes just spout out a number, just to be done with it, and don’t really take the time to think through the story’s nuances and difficulties

Common sense answers to the objections

Let’s talk about some answers to those objections, one by one.

  • “Arbitrary”: well, “arbitrary” is a word that many people unfortunately like to use when what they really mean is “not precisely measurable; based on judgment.” But that’s not what the word “arbitrary” actually means. It’s an alternative and utterly binary definition, colored by a value assessment: precise=good, imprecise=arbitrary. But the world isn’t binary, and the mere fact that something is judgment-based isn’t a disqualifier for its usefulness. For example, court rulings are also based on judgment and can be called “arbitrary” in that same (wrong) sense, but they are anything but random, since they’re based on careful reading of the law, applied to the given situation by experienced professionals. So too with story points: that they lack absolute precision or that they require judgment does not mean they lack usefulness. They’re certainly not arbitrary in the sense of “randomly chosen without due thought”, or, if they do get randomly chosen in a specific instance, that’s an issue with the attitude or training of the people determining them, not with the concept itself.
  • “Hard to explain”: yes, story points can be hard to explain to some folks. But if you explain story points in the context of team capacity, most business people innately understand the basic concept. Anyone who has observed a workgroup in just about any domain (say, staffing a restaurant for the varying crowds during a day) will not have trouble seeing that using a quantitative model can be useful despite being imprecise and despite not being guaranteed. I’ve never found an end user who couldn’t quickly grasp the benefits of approximated capacity planning for workloads that aren’t precisely predictable.
  • “Users want to translate story points into time”: That kind of behavior tends to happen mostly when things have gone awry. Almost always, users tend to focus on finding out when they’ll get their desired capability, as a whole. If it gets to the point where they themselves are taking story points and then calculating the likely time for individual stories, they’re far too into the weeds of the “how”, and the reason they got there is probably that they weren’t getting direct answers to their very reasonable questions about when they could expect completion for their capability as a whole. So, users trying to translate story points into time is really a symptom of poor team and/or product owner communication with those users, not an inherent issue with story points themselves.
  • “Story points are too hard to determine for each story”: this objection often falls away if there is appropriate facilitation of a planning/estimating session, designed to identify the various damaging factors such as wrong-headed striving for precision, participant fatigue, encroaching biases, occasional HiPPO influence, the “alpha wolf developer” element, etc. Pay no attention to intentionally incendiary anecdotes such as “oh, we argued for an hour about whether the story was a 3 or a 5”: those tales don’t reflect a true problem with story points, but rather are a sign of a poor process, and most definitely poor facilitation/leadership of that process.
  • “People just spout out a number without thinking”: Well, the sensible advice is don’t do that, and don’t let the team do that. Again, adequate facilitation addresses this problem. When it occurs during a planning session, it’s generally very obvious that it’s happening. Time to take a break, and/or to counsel people in participating meaningfully instead of by rote.

Don’t confuse sizing and scheduling

  • “Sometimes the estimates turn out to be wrong”: indeed they do, sometimes. Imagine how crippling that possibility of “I might be wrong” would be for a weather reporter, or a surgeon, or a baseball pitcher, cowed by the very prospect of error. As I’ve written before, if you’re paralyzed by the notion of occasionally being wrong, you really shouldn’t be making business decisions at all.
    • Even worse along the lines of “the estimates might be wrong”: #NoEstimates advocates will assert that story point assessments are completely useless. They “prove” that point by showing scatter plots of item size versus actual end-to-end item delivery time, with the intention being to demonstrate that the total time for actual completion (i.e., measured from conception to delivery) of an item isn’t correlated to the story points assigned to that item. Their triumphant conclusion is then that “size matters little”.

But thinking “size matters little” is wrong-headed, and here’s why. It reflects a misunderstanding that story points somehow represent an estimate of when something will be delivered; but they don’t. No one ever said that “the smaller the item, the sooner you get it,” and that of course doesn’t conform to common sense in software or any other domain. There are dozens of valid, common reasons why a specific large item might be delivered far sooner than a smaller item; that’s a scheduling issue, not a sizing issue, and it is influenced by multiple factors such as dependencies, priorities, external delays, interruptions, changes in staffing, etc. Yet that does not mean that size itself is irrelevant when juggling what gets put into the schedule in the first place.

It shouldn’t be hard to see the intuitive and universal sense behind the overall concept of “if you take on bigger items, you will generally get fewer of them done in the same time period than if you choose smaller items.”

Again, the key is thinking in terms of filling up capacity appropriately. Story points help you do that. But as with any kind of estimate, they should not themselves be mistaken for a schedule, a guarantee, a commitment, or a promise. Appropriately understood and used, they can help you arrive at those things, but they’re not those things in and of themselves.
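Here is a minimal sketch of that capacity-filling idea, which also shows the "bigger items, fewer of them" effect. The backlogs, the capacity of 17, and the function name `fill_sprint` are hypothetical, and the simple "walk the backlog in priority order and take what still fits" rule is just one plausible approach, not a prescribed method. Note that it says nothing about when any individual item gets delivered; it only sizes the slate.

```python
def fill_sprint(backlog, capacity):
    """Walk the backlog in priority order, taking each item that still fits.

    Returns the names of the selected items and the capacity left unused.
    """
    selected = []
    remaining = capacity
    for name, points in backlog:
        if points <= remaining:
            selected.append(name)
            remaining -= points
    return selected, remaining


# Two hypothetical backlogs, each listed in priority order.
small_items = [("a", 2), ("b", 3), ("c", 2), ("d", 3), ("e", 5), ("f", 3)]
large_items = [("x", 8), ("y", 13), ("z", 8)]

print(fill_sprint(small_items, capacity=17))  # (['a', 'b', 'c', 'd', 'e'], 2)
print(fill_sprint(large_items, capacity=17))  # (['x', 'z'], 1)
```

With the same capacity, the slate of small items yields five completed stories and the slate of large items yields two. Which slate is the better choice is a judgment call about priority and benefit; the sizing simply makes the trade-off visible.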

The core puzzle

The core puzzle: whether you do or don’t favor story points as the specific approach, why would anyone want to purposely exclude size of the various items as one important factor when determining what will be a manageable slate of work for a given time period? Can such folks really be that uncomfortable with the fairly obvious notion that planning always has to incorporate some element of judgment call about what items to include or exclude, after making and weighing approximations of approach, impact, benefit?

If you exclude taking item size into account, how then can you possibly align the chosen work to team capacity, much less gauge what that capacity really is? In short, how does ignoring size make any sense at all? What are the alternatives, besides the simplistic notion of just engaging in serial work: slogging through one item at a time, chosen without considering its size (or worse, picking just one slice of the overall item), and hoping for the best? How is that a plan?

My answer is that no one has made a plausible case (other than through frequent and adamant assertion) for any viable alternative. Some form of sizing always needs to enter into the discussion when slating a body of work, in software or any other domain. If you don’t consider sizing explicitly, it happens implicitly somewhere along the line. As I always say, I prefer doing things on purpose and with appropriate transparency (and debate) among all participants. Transparency, of course, carries with it the possibility of being wrong, and having to justify one’s decisions. Accountability, in short.

As discussed above, story points are actually one of the most reasonable and practical solutions to various issues/problems surrounding capacity planning and filling. And that may be why they are now attracting particular animosity, from people who appear to be perhaps by nature opposed to estimation of any ilk. Workable solutions, especially in the estimating arena, are often lightning rods for criticism from people who staunchly fight the very concept of estimating at all.

Stand down, story point bashers

Bottom line: let’s stop bashing story points. Stop claiming they’re “widely discredited.” Because the idea is sound, and if a well-facilitated team makes a bona fide effort to use story points appropriately, they work well as a way to roughly align workload to team capacity. Which is a highly desirable outcome for all.

And finally, although pride of ownership makes this hard to admit, “story points” is a far better term for the concept than “quocknipucks”.
