Effort Estimation Techniques
My starting point for agile estimation was the book “Aufwandschätzung bei Softwareprojekten” (Steve McConnell, Microsoft Press, 2006; the German edition of McConnell’s “Software Estimation”). In the end, however, I found the classification by Boehm more useful (Barry W. Boehm, Software Engineering Economics, Englewood Cliffs, NJ: Prentice-Hall, 1981).
In this blog post I’ll follow Boehm’s classification of estimation methods:
- Algorithmic cost modeling
  - Parametric models (e.g. COCOMO)
  - Function Points / Lines of Code
  - Proxy-based
  - Process simulation
- Estimation by analogy
  - Story Points
  - T-Shirt Sizes
- Expert judgment
  - Single experts
  - Groups of experts (Wideband Delphi, Planning Poker)
- Variants and subsidiary techniques
  - Top-down / bottom-up estimation
  - Combinations of techniques
  - PERT / fuzzy estimation
  - Parkinson’s Law
Let’s start with some definitions:
An estimation is an approximation based on input data that may be incomplete or uncertain (dictionary definition; cf. Wikipedia).
If the estimation is given as a single value (“one-point estimation”), it is assumed that there is a 50% probability that the real value is higher and a 50% probability that it is lower than estimated.
It is more accurate to provide an estimation range with a minimum, a maximum, and a confidence level. However, this requires more mathematics (see the PERT sketch further below).
Note: An estimate is different from a project plan: the project plan is designed to hit a target, which is a statement of a desirable business objective. A commitment is a promise to hit the target.
Analysis is the act of breaking something into parts to get a better understanding of it.
Law of Large Numbers (LLN): The average of the results obtained from a large number of trials is close to the expected value. Therefore you gain more accuracy if you involve a group of experts rather than a single expert.
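As a toy illustration of this (the “true” effort of 100 person-days and the spread of ±30 are invented values), the following sketch averages simulated, independent expert estimates – the larger the group, the closer the average tends to land at the true value:

```python
import random

random.seed(42)

TRUE_EFFORT = 100.0  # assumed "real" effort in person-days (invented)
SPREAD = 30.0        # assumed standard deviation of one expert's estimate

def expert_estimate():
    """One expert's unbiased but noisy estimate."""
    return random.gauss(TRUE_EFFORT, SPREAD)

for group_size in (1, 5, 25, 100):
    estimates = [expert_estimate() for _ in range(group_size)]
    average = sum(estimates) / group_size
    print(f"{group_size:3d} expert(s): average estimate = {average:6.1f}")
```

This only works if the experts estimate independently and share no common bias – which is exactly why Wideband Delphi collects the estimates anonymously before the discussion.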
Economy of scale: Refers to the cost advantages that an enterprise obtains due to expansion.
Diseconomy of scale: Estimates for smaller projects/demands do not scale up to bigger projects/demands, due to e.g. communication/management costs and duplication of effort.
The closer a project gets to its end, the smaller the uncertainty becomes. This can be visualized as a “cone of uncertainty” – see e.g. http://construx.com/Page.aspx?cid=1648
Now I’ll present an overview of the estimation methods:
One big class of estimation techniques are algorithmic methods: they use mathematical relations/formulas for the estimation. The formulas are based on research and historical data and use inputs obtained from analysis, such as Lines of Code (LOC), the number of functions to perform, defects, etc. The advantage of these methods is that they are objective, repeatable, and easy to apply once calibrated. The limiting factor is the input data: it might not be available, or only in poor quality, and the formulas cannot handle exceptional conditions.
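To make this concrete, here is a minimal sketch of Basic COCOMO, the parametric model mentioned in the overview. The coefficients are Boehm’s published 1981 values for the three project classes; the 50 KLOC input is just an invented example:

```python
# Basic COCOMO (Boehm 1981): effort in person-months from size in KLOC.
# Per project class: (a, b) for effort = a * KLOC**b,
# and (c, d) for development time = c * effort**d (in months).
COCOMO_PARAMS = {
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc, project_class="organic"):
    a, b, c, d = COCOMO_PARAMS[project_class]
    effort = a * kloc ** b      # person-months
    duration = c * effort ** d  # calendar months
    return effort, duration

effort, duration = basic_cocomo(50, "organic")  # example: a 50 KLOC project
print(f"Effort: {effort:.1f} person-months, duration: {duration:.1f} months")
```

Note that the exponent b is greater than 1 in all three classes, i.e. the model already encodes the diseconomy of scale defined above.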
(Dynamic) process simulations are also algorithmic methods: they use a dynamic model with assumptions about the project and the organization, e.g. velocities and error rates (which need to be calibrated with real data).
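As a toy illustration (not a real simulation model), here is a sketch where an assumed error rate feeds rework back into a backlog that is burned down sprint by sprint; backlog size, velocity, and error rate are invented values that would have to be calibrated with real data:

```python
def simulate_project(backlog=200.0, velocity=20.0, error_rate=0.15):
    """Simulate sprints until less than one point of backlog remains.

    velocity and error_rate are assumptions that would need to be
    calibrated with real project data.
    """
    sprints = 0
    while backlog >= 1.0:
        done = min(velocity, backlog)
        backlog -= done
        backlog += done * error_rate  # rework flows back into the backlog
        sprints += 1
    return sprints

print(f"Estimated duration: {simulate_project()} sprints")
```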
Another class of estimation techniques is estimation by analogy: new tasks/projects are compared with tasks/projects already known, to derive the estimated effort from historical data. One needs to find areas that can be counted (e.g. the number of tables, screens, use cases, etc.). The best-known techniques are Story Points and T-Shirt Sizes:
Story Points
- Assign numbers to categories that relate to the complexity
- Typical category scales are
  - Powers of 2: 1, 2, 4, 8, 16, …
  - Fibonacci: 1, 2, 3, 5, 8, 13, …
- Story points are relative to a defined anchor story that serves for comparison
T-Shirt Sizes (S, M, L, XL, …)
- Generalize the story point categories (maybe “8” story points do not relate to exactly twice the effort of “4”, e.g. due to diseconomies of scale)
- The average size of a category is determined by historical data (see the sketch below)
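As a small sketch of that last bullet: each category is mapped to the average effort historically measured for items of that size. The hours below are invented placeholders; real values would come from your own history:

```python
# Hypothetical averages from historical data: hours actually spent per size.
AVERAGE_HOURS = {"S": 4.0, "M": 10.0, "L": 24.0, "XL": 56.0}

def estimate_backlog(sized_items):
    """Sum the historical average effort over a list of sized items."""
    return sum(AVERAGE_HOURS[size] for size in sized_items)

backlog = ["S", "M", "M", "L", "XL", "S"]
print(f"Estimated effort: {estimate_backlog(backlog):.0f} hours")
```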
The comparison is usually done by expert judgment. The judgment can rely on individual experts: developers, architects, etc. are asked about the expected effort as a one-point estimation, as ranges (min/max), or as clusters. It can also rely on a group of experts – e.g. in Planning Poker (Scrum Poker) or a poker party.
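For the group variant, a minimal sketch of a single Planning Poker round; the deck and the four sample votes are invented, and the discussion loop of real Planning Poker is only hinted at in the comments:

```python
CARDS = [1, 2, 3, 5, 8, 13, 20, 40]  # a typical Planning Poker deck

def reveal(estimates):
    """One Planning Poker round: all cards are revealed simultaneously.

    If the cards diverge, the lowest and highest estimators explain
    their reasoning and the team votes again (not modeled here).
    """
    low, high = min(estimates), max(estimates)
    return low, high, low == high

# Example round with four developers:
low, high, consensus = reveal([3, 5, 5, 13])
if consensus:
    print(f"Consensus: {low} points")
else:
    print(f"Spread {low}..{high}: discuss the outliers, then re-vote.")
```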
Last but not least there is the not-so-seriously-meant estimation technique, Parkinson’s Law: “Work expands so as to fill the time available for its completion” (Cyril Northcote Parkinson, published in The Economist in 1955). Here the cost is determined by the available resources rather than by an objective assessment. The estimated effort depends on the customer’s budget and not on the software functionality – e.g. if the software has to be delivered in 12 months and 5 people are available, the effort is estimated to be 60 person-months.
For more variants of Parkinson’s Law see Wikipedia: http://de.wikipedia.org/wiki/Parkinsonsche_Gesetze
Finally, I’ll present some variants and subsidiary techniques:
Top-down approach
Split requirements (epics) into smaller elements (stories) and assign some relative measure like story points or percentages. Split some (at least one) element further until it is small enough for a good absolute estimate, then scale the rest from it (see the sketch below).
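A small sketch of this approach with invented story names and numbers: an epic is split into stories with relative points, one story is estimated absolutely, and the total is scaled from it:

```python
# Relative sizes (story points) from splitting one epic.
stories = {"login": 3, "search": 8, "checkout": 13, "admin": 5}

# One well-understood story is estimated absolutely: 'login' ~ 12 hours.
reference_story, reference_hours = "login", 12.0
hours_per_point = reference_hours / stories[reference_story]

total_points = sum(stories.values())
print(f"Epic estimate: {total_points * hours_per_point:.0f} hours")
```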
Bottom-up judgment
Break the work down into tasks and ask experts (developers, architects, etc.) about the expected effort.
You get the best results if tasks are smaller than 2 days (otherwise details will be overlooked).
Sum up the task estimates to get the total effort. This tends to yield highly accurate estimates due to the Law of Large Numbers.
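A quick Monte Carlo sketch of this effect (the 20% per-task error is an assumed value): independent task-level errors partially cancel in the sum, so the relative error of the total shrinks roughly like 1/sqrt(n):

```python
import random
import statistics

random.seed(1)

def relative_total_error(n_tasks, task_effort=1.0, rel_error=0.2, runs=2000):
    """Relative standard deviation of the summed estimate over many runs."""
    totals = [
        sum(random.gauss(task_effort, task_effort * rel_error)
            for _ in range(n_tasks))
        for _ in range(runs)
    ]
    return statistics.stdev(totals) / (n_tasks * task_effort)

for n in (1, 4, 16, 64):
    print(f"{n:3d} tasks: relative error of the total ~ {relative_total_error(n):.1%}")
```

Note that this cancellation only applies to independent, unbiased errors – a systematic bias (e.g. always forgetting integration effort) does not average out.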
Fuzzy Effort Estimation
Fuzzy numbers represent the physical world more realistically than single-valued numbers (“Optimization Criteria for Effort Estimation using Fuzzy Technique”, Harish Mittal / Pradeep Bhatia, 2007, CLEI Electronic Journal, http://www.clei.cl/cleiej/papers/v10i1p2.pdf).
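Tying this back to the “PERT / fuzzy estimation” entry in the overview and to the range estimates from the definitions: a triangular three-point estimate can be condensed with the classic PERT weighting E = (min + 4·mode + max) / 6 and standard deviation (max − min) / 6. A minimal sketch with invented inputs:

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """PERT (beta) approximation of a three-point / triangular estimate."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Example: a task estimated as 4 days best case, 6 likely, 14 worst case.
expected, std_dev = pert_estimate(4, 6, 14)
print(f"Expected: {expected:.1f} days, std dev: {std_dev:.1f} days")
```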
For further reading I suggest “The Comparison of the Software Cost Estimating Methods” by Liming Wu (http://www.compapp.dcu.ie/~renaat/ca421/LWu1.html).
Comment (Boeffi):
Story Points
“Assign numbers to the categories that directly relate to the expected effort”
/// effort /// or complexity ?
effort = f( complexity ) ?
cu

Reply: @Boeffi Thank you for the comment – I’ve updated the section.
Comment: Good compilation, Mirko! One question: regardless of the estimation technique, how do you use the estimated numbers? Do you compare them against real numbers, or do you use them as categories and measure how long something in that category really takes on average? I wouldn’t recommend the former but rather the latter.
Reply: We use the estimation techniques
a) during the demand process, to give our customers cost input for their business case calculation;
b) to prepare planning sessions (effectively limiting the input for the Scrum teams embedded in an outer waterfall);
c) in combination with real data to measure performance (e.g. velocities, CFD).