An Estimate is a guess in a suit; you can do better than that

New project, new team, new opportunities, a fresh start for everyone, a room full of hope and optimism. Then someone senior comes in and asks for an estimate for the full scope of work – and things start to spiral downwards.


There are three answers to this question: the right one, the wrong one and a refusal to answer. The right answer is impossible. Even if you do get it right, it will soon be wrong, because the scope, and hence the question, will soon change. That leads us to the second answer, a wrong one, and in the majority of circumstances this is what is provided.

The problem with an estimate lies both in where it comes from and in what is done with it. If it is a heavily caveated range that is used to inform medium-term planning with an awareness of the risk, then that is great. If it is a guess that is treated as a commitment, well, we all know the trouble that causes, but people still ask for them.


So firstly, why are we so poor at estimating within the software industry? There are other industries out there that appear to be able to get things done against simple, predictable plans. Yes, a few things slip, but houses get built, gas pipes are laid and aircraft get assembled. The important difference is one of experience. Software can be copied at very little cost and effort (CTRL C, CTRL V), which means that the majority of large software projects are new: by definition, they have never been attempted before. Therefore the solid experience that drives confident project planning in the industrial sectors is absent in the software industry. Software is now largely a creative, knowledge-based activity, like graphic design or management consulting. (It is important to note that other engineering disciplines have these issues too when attempting something unique, for example automotive design or large bespoke construction projects.)


So what is the solution? It isn’t realistic to respond to every request for an estimate of when something will be ready with a wise frown and, “it will be done when it is done”.

Existing and well-discussed techniques, such as assigning Story Points and then breaking user stories down into Tasks and estimating those Tasks in hours, are a valid approach, but one suitable only for the current planning horizon. They are unfeasible for an entire backlog: attempting that would require such extensive analysis and design that you are pretty much back at waterfall, and the scope will likely change before you finish.


Understanding the issues with estimation is more a psychology challenge than a technology one. Typically we estimate by drawing parallels between the work in question and prior experience, but humans are naturally self-centred and optimistic. We exaggerate the parallels between this work and previous work, undervalue the substantial differences and have a rose-tinted memory of how it went last time.

The scope of work to be estimated can be considered in the context of the KNOWNS. These are the same “Knowns” that Donald Rumsfeld referred to when discussing US defence policy, although he only mentioned three of the four.


These are:

  • Known Knowns. These are deliverables that are well understood, and they are the reason short-term user story estimates are still valuable.
  • Known Unknowns. These are deliverables that we know will have issues and problems, and those problems are not yet solved. They could be easy or they could be difficult, but fundamentally they require investigation to understand. These requirements are the reason we apply contingency or a margin of error to the Known Known estimate, but without any possible logic as to what that margin should be. The advantage here is that the team will have a fair idea of how to improve their understanding of the total work and which activities will help them become more precise.
  • Unknown Unknowns, the black swan events. These refer to issues that are not understood and not even known to exist. Problems that nobody has even considered would occur and usually lie well outside the realm of contingency planning. These issues may have minor impact, or could completely
  • The last and the most pernicious of the four, and for that reason the one that Mr Rumsfeld wasn’t brave enough to mention, are the Unknown Knowns. These are things we know but choose not to accept or allow for, because recognition is so disruptive to our social construct that it is more comforting to create an illusion where they do not exist. In wider society, things like state oppression of minorities are often raised as examples. In the less dramatic world of software delivery, issues like stakeholder politics would be a better example. When estimating deliverables it is important to try to surface these as much as possible and be honest about their influence, because they typically suppress estimates.


A solid appreciation of Complexity Theory and an awareness of the “Knowns” should enable us to look at the work to be estimated from an informed perspective, and should give us good, communicable reasoning as to why a firm estimate of complex software deliverables beyond our planning horizon is so difficult as to be fruitless. However, good examples of estimation techniques used on complex (unpredictable) systems do exist, and the best of these is weather forecasting. The weather later this week is projected, not by a group of experienced meteorologists given today’s information, but by adding that information to all previous information and passing it through a very complex model that is continually evolving. This approach can be applied to our software delivery to give long-term projections, but now we can appreciate the difference: what we would be providing is no longer an estimate (a guess in a suit) but a forecast. A FORECAST is a statistical likelihood of something happening given historical information and a set of input data. The Monte Carlo simulation approach is a well-documented version of this. It can be simplified dramatically so that it is easy to employ and still gives very helpful and, importantly, fast forecasts for software delivery.

All forecasting tools rely on data, so before any forecast can be delivered the team need to make a start on the delivery and record their performance. Once the team have delivered 10 items, and as long as those items were not chosen based on expected size, the probability of the next item taking longer to deliver than any previous item is roughly 1 in 11, or about 9%: if the items are comparable, each of the 11 is equally likely to be the slowest. Assessing the full list of deliverables in this light and taking the 50% mark would enable a team to rapidly give a most likely forecast and bound it by X% either way. Then, after each additional item is delivered, the model is improved and the remaining work is reforecast, refining the given result.
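
To make the simplified Monte Carlo idea a little more concrete, here is a minimal sketch in Python. It is only an illustration under some loud assumptions: the function name forecast_days, the sample history and the choice of the 15% and 85% marks as the bounds are all mine, and it assumes items are delivered one at a time so that durations simply add up. Each trial builds one possible future by drawing a past delivery time at random for every remaining item; the spread of the totals is the forecast.

```python
import random


def forecast_days(cycle_times, remaining_items, trials=10_000, seed=None):
    """Forecast total days for the remaining items by resampling history.

    Assumes (for illustration only) that items are delivered one at a time,
    so the total duration is the sum of the individual cycle times.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        # One simulated future: reuse a random past cycle time per item.
        totals.append(sum(rng.choice(cycle_times) for _ in range(remaining_items)))
    totals.sort()

    def percentile(p):
        # Value below which roughly p% of the simulated totals fall.
        index = min(len(totals) - 1, int(p / 100 * len(totals)))
        return totals[index]

    # The 15% and 85% marks are arbitrary stand-ins for "X% either way".
    return {"p15": percentile(15), "p50": percentile(50), "p85": percentile(85)}


# Hypothetical history: days taken by the first 10 delivered items.
history = [3, 5, 2, 8, 4, 6, 3, 7, 5, 4]
print(forecast_days(history, remaining_items=30, seed=42))
```

Re-running this after every delivered item, with the new cycle time appended to the history, is one simple way to keep refining the forecast as the team learns.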


So when someone asks your team for an estimate, the first thing to do is have a discussion to see if this work could be described as a KNOWN KNOWN, and deliverable within your planning horizon. If so, then proceed with a breakdown by User Stories and Story Points, compare to velocity and give a duration with heavy margins of error.
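
As a rough illustration of "compare to velocity and give a duration with heavy margins of error", the arithmetic might look like the sketch below. The function name sprints_needed and the sample numbers are hypothetical, and using the team's fastest and slowest past sprints as the margins is my assumption, not a rule.

```python
import math


def sprints_needed(total_points, velocities):
    """Return (best, likely, worst) sprint counts for a Story Point total.

    best uses the fastest past sprint, likely uses the mean velocity and
    worst uses the slowest past sprint (an assumed way to set the margins).
    """
    best = math.ceil(total_points / max(velocities))
    likely = math.ceil(total_points / (sum(velocities) / len(velocities)))
    worst = math.ceil(total_points / min(velocities))
    return best, likely, worst


# Hypothetical example: 80 points of well-understood work, five past sprints.
best, likely, worst = sprints_needed(80, [18, 22, 20, 25, 15])
print(f"Most likely {likely} sprints, somewhere between {best} and {worst}")
```

Quoting the whole range, rather than just the middle figure, keeps the margin of error visible to whoever asked.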

If the work is less well defined or substantially larger, then divide it up and compare it against your historical delivery through a statistical model. If you have no model, because you are a new team or the work is completely different to anything previously undertaken, then you have to have the awkward but honest discussion explaining that you can’t give an estimate until after you have started: “so give us a month to deliver something of use and we’ll then be able to start to understand enough to give future projections”. Now if that isn’t acceptable then I suppose you could guess what you think they want to hear, and revise that figure after a month or so with something more credible – good luck with that.


Keep in touch on #philagiledesign