“Definitions of #NoEstimates”? An enumerated list of counterpoints, Part I.

A week or two ago, we saw the first interesting new blog post on the bizarre and rancorous #NoEstimates movement in quite some time. Although that post is titled “definitions of #NoEstimates”, it’s not really “definitions” per se; it seems instead to be more of a mixed list of NE approaches (sometimes contradictory, as the author himself notes) and miscellaneous arguments that have been frequently made in favor of the movement. To the best of my knowledge, no such overall compilation has ever been made by a #NoEstimates proponent; as such, I applaud Jay Bazuzi for putting it together.

Of course, each of the described approaches/arguments has been outlined (and countered) individually many times before. But as far as I know, none of the major NE advocates has ever actually addressed any of the counterpoints to them, choosing instead just to block and insult the people making those counterpoints, often boasting proudly that they do so to “filter out the noise”.

In any case, let’s centralize those counterpoints now: here’s an item-by-item recap, springboarding off of Jay’s enumerated list of #NoEstimates approaches. For reasons of space and manageability, I’m splitting this rundown of counterpoints into two separate posts. Here goes:

1. Size everything about the same and count them, it’ll give you just as good results as estimating, without the cost of estimating.

It’s easy to SAY “size everything about the same”, but that’s devilishly hard to do in software, either in large chunks or small chunks. This is especially true if you try to adhere to the tenet that a completed story needs to provide real (and independent) user value. (And let’s not even dwell on the irony that “size everything about the same” itself requires, wait for it, estimating.)

Let’s walk through a fictitious example. Say you need to add a new column to an on-screen grid, and populate it with specific information based on business rules that might differ from row to row. One “chunk” (sliced story) might be just adding the physical column to the grid. Value added, right? Not really. An empty column does the user no good at all. However, it’s probably relatively quick to do, compared to (say) actually populating the column according to the necessary business rules. Now think about the business rules: maybe some of them need new data amalgamations of some kind (client side for some, server side for others, perhaps) and will thus require significantly more development and testing time than others. And some of the rules may turn out to be almost trivial to implement.

OK, so maybe in this example you can “slice” the story so that each business rule is a separate story. But does that really provide independent user value in each slice? Some NE advocates will argue it does (some results being better than none), but in my experience, users want to see their grid populated correctly for all rows, not just a subset. No one would seriously argue that a car with just a few of its gauges and controls working would be acceptable to the normal driver. So slice away, but I submit that you’re going to have great difficulty slicing the needed full capability (a revised grid with the new column populated) into meaningful, truly-user-valued deliverable chunks. Don’t get me wrong: slicing is a smart way to divide up the actual work, but recognize that you’re probably going to have to compromise (often) on the “has independent user value” aspect for each slice.

But here’s the point: as you should be able to tell from that example, it’s not uncommon for sliced, supposedly “small” stories to vary widely in size, from taking 10 minutes to taking (say) 2-3 days to complete. That’s roughly two orders of magnitude in size difference (a reminder to my NE friends: one order of magnitude is 10x), which completely destroys any argument that “counting will give you just as good results as estimating.”
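For anyone who wants to check that arithmetic, here is a minimal sketch. It assumes an 8-hour working day, which is my assumption for illustration; the function name is likewise just made up for this example.

```python
import math

# Assumption for illustration: one working day is 8 hours of effort.
MINUTES_PER_WORKDAY = 8 * 60


def orders_of_magnitude(small_minutes: float, large_minutes: float) -> float:
    """Return log10 of the size ratio between a small and a large story."""
    return math.log10(large_minutes / small_minutes)


smallest = 10                             # the 10-minute story
largest_low = 2 * MINUTES_PER_WORKDAY     # a 2-day story, in minutes
largest_high = 3 * MINUTES_PER_WORKDAY    # a 3-day story, in minutes

low = orders_of_magnitude(smallest, largest_low)
high = orders_of_magnitude(smallest, largest_high)
print(f"{low:.2f} to {high:.2f} orders of magnitude")
```

Under that assumption the ratio works out to between 96x and 144x, so “two orders of magnitude” is a fair round figure. A different working-day length would shift the numbers a little, not the conclusion.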

2. Split stories as small as you can and then count them, it’ll give you just as good results as estimating, without the cost of estimating.

Same basic counterpoint as above: there are potential orders of magnitude of difference in slice size. And what’s this vague hand wave of “split stories as small as you can”? Again, if sliced stories are meant to have independent user value, that user value and sliceability are determined/bounded by the nature of the envisioned capability. Some stories will not prove very sliceable at all – my canonical example is when Twitter expanded the length of tweets from 140 to 280 characters. There were of course a lot of related tasks and subsystems to change across the Twitter ecosystem, but there was no conceivable user-visible middle ground of partial delivery. Maybe they decided to defer work on some of those subsystems, but they still would have had to put in the necessary work to carefully confirm that such deferral was technically possible without dire impacts.

Moreover, if you’re going to base your “results” (by which the notion of forecasting is meant here) on story counts, then when do you slice your story? There are only three alternatives (up front, just-in-time, and trying to dodge the dilemma by using a “splitting factor”), and they all have notable issues:

    • If you slice stories up front for all envisioned capabilities, that’s BDUF (Big Design Up Front) by definition. You might never implement some of those stories, or their definition/approach might change enormously by the time you reach them, so including their count in your forecast will throw it off.
    • If you slice stories just-in-time as you tackle/slice each original story, then what can you “count” when forecasting the completion of the whole set of stories that you have yet to take on? You don’t yet know the end count of stories. Your forecast contains apples (original stories) and oranges (split stories).
    • If you forecast your so-far-unsliced stories by boosting their story count with one or more “splitting factors” (e.g., determining that original stories have historically turned into an average of 1.78 stories once sliced), now you’re just playing with math and coarse-grained conversion factors, while studiously ignoring that there may be glaringly non-average stories that remain to be tackled. Again, original stories (especially epics) can vary in size by orders of magnitude. Spreadsheet calculations often feel nicely plausible and comforting, but they aren’t always the answer, and discarding the human judgment factor by outsourcing it to mechanical calculation has great risks. Reflect, for example, on the many times you’ve been frustrated by the Windows progress bar.
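
To make the “splitting factor” problem concrete, here is a rough sketch. Only the 1.78 figure comes from the example above; the backlog, the per-story slice counts, and the weekly throughput are all hypothetical numbers I invented for illustration. The sketch even grants, for the sake of argument, the contested premise that every slice is the same size.

```python
# The only figure taken from the text: the historical average splitting factor.
SPLITTING_FACTOR = 1.78

# Hypothetical: the team completes 4 sliced stories per week.
SLICES_PER_WEEK = 4

# Hypothetical backlog of 8 unsliced stories. Each number is how many slices
# the story turns into once someone actually slices it. Seven are ordinary;
# the last is an epic nobody looked at closely.
actual_slices_per_story = [1, 2, 2, 1, 2, 2, 2, 25]

# Count-based forecast: multiply the story count by the average factor.
predicted_slices = len(actual_slices_per_story) * SPLITTING_FACTOR
predicted_weeks = predicted_slices / SLICES_PER_WEEK

# What happens once the stories are really sliced.
actual_slices = sum(actual_slices_per_story)
actual_weeks = actual_slices / SLICES_PER_WEEK

print(f"forecast: {predicted_slices:.2f} slices, {predicted_weeks:.2f} weeks")
print(f"actual:   {actual_slices} slices, {actual_weeks:.2f} weeks")
```

With these made-up numbers, the spreadsheet says about three and a half weeks, and reality says more than nine. One person glancing at that last story and saying “that one is huge” would have caught it, which is exactly the human judgment the mechanical calculation throws away.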

Summary: counting stories to use in forecasting, without taking story size into account, is a chimera of a real alternative to estimating. Using data is great as one way to triangulate on the “when will we have it” answer you need, and it’s recommended for your toolset, but don’t depend on it exclusively. Especially, don’t have that approach extend to accepting the bizarre implicit NoEstimates “taboo” on taking story size into account when planning or even discussing.

3. It’s useful to estimate whole projects so you can decide whether to fund them, but don’t waste time estimating day-to-day activities.

Apart from using what is widely acknowledged to be the most dangerous and ill-advised estimating technique (i.e., raw gut: “oh, that’s a four month effort”, often spouted off in the hallway with little or no thought), no one can viably estimate whole projects without doing some form of analysis/decomposition, whether that’s back-of-napkin or something more formalized. And once you do that, it’s natural to size those individual chunks and roll them up to a whole, informing your overall estimate. That’s an advisable and more granular approach than “whole projects”, but it isn’t estimating day-to-day activities.

But think about the straw man being implicitly evoked here: estimating the likely cost/impact of anything is natural and useful, whether that’s done holistically or for an individual task. In other words, the question of “how long will this likely take” is really never out of bounds. Why would it be? Asking “do you think you can be done with that by Friday, or will it take longer?” is not, as NE tends to depict it, an indisputable example of Evil Oppression By The Man. Equally, asking “why did our actual effort on this story turn out to be three times what we had collectively anticipated” is an utterly reasonable question, fully worthy of coming out in a retrospective, with perhaps important lessons to be garnered about what the team might have missed and what it could now take into account in the future.

Again: why would there be a taboo?

4. Estimate day-to-day activities because they are simple enough to understand, but don’t try to estimate whole projects which are full of unknowables.

“We can’t predict the future” is the constant lament of the #NoEstimates advocates. But predicting the future is exactly what we are all called upon to do in business in general: to take our best shot at anticipating factors such as demand, supply, sales, overhead, seasonality, and more. We don’t get to choose NOT to, in fact. How many widgets should we manufacture to meet the likely holiday demand? How many customers do we need to sign up at a particular price point in order to break even? What’s the likely impact of dropping our price by 20% to undercut the competition? All of these require coarse-grained judgment calls. Even though they’re “full of unknowables”. We’d better figure out ways to make such calls, if we want to stay in business.

5. Estimates are unreliable, so we shouldn’t use them to make decisions. That would be irresponsible.

See the above counterpoint to #4. “Unreliable” seems to be used, in this #NoEstimates argument, as a synonym for “not guaranteed to be absolutely correct”. But in the world of business (or life), there are never any guarantees. We seek the best information available in order to make the necessary decisions, in the hope of tilting the odds of success in our favor; making estimates of size/effort/impact/benefit is simply unavoidable. If you have to make a decision about anything in the future, and you do, you are in fact estimating in some form, even if you steadfastly refuse to admit it.

6. Estimate value, not cost. Value varies by orders of magnitude. High-value work will exceed the cost by so much that cost won’t matter.

“Cost won’t matter” can be true (sometimes) for a few rare “big bang” products (think iPhone, etc.). Also note that those sorts of projects tend to be huge corporate gambles as to their probable success: compare the Amazon Fire Phone experience to the iPhone.

But “cost won’t matter” is simply not true for the vast majority of software projects, which tend to be incremental add-ons, internal capability enhancers, enablers of integrations with other products, etc. For most things we consider taking on, cost does matter, particularly opportunity cost (i.e., what else we’re NOT working on as we tackle a particular project). Weighing the whole picture in advance (the likely cost and the likely resulting value) is a far wiser approach than just “going for the shiny”, being seduced by the hope of big gains. “Going for the shiny” with no regard for its cost or time frame, thinking optimistically that those aspects “won’t matter” because you’re sure your reaped value will be so huge, is a great way for firms to go bankrupt.

So there you have it for the first half of the collected “#NoEstimates Definitions” that Jay listed: item by item, approach by approach, the specific counterpoints to each. What’s most interesting about the debate at this point is that, aside from hurling insults, name-calling, and blocking, NE proponents cannot and do not ever respond to these counterpoints with actual points to argue their invalidity. Which is the most telling behavior of all.

Stay tuned for Part II, where I’ll cover the remaining 5 bullets from Jay’s original list, plus a couple of common NE arguments that he omitted.
