The PM Soap Box

Robust Project Design - All the things that you were never taught about modeling projects.

Saturday, October 30, 2004

[16] Tolerance Design




At this writing, three methods for calculating tolerances enjoy some measure of support. They are: the Cut & Paste method, the Control Chart method, and the Root Square Error method. We discuss these now.

The cut and paste method of determining component tolerances is popular among the followers of Dr. E. M. Goldratt. According to this method, a project manager gathers the inflated estimates of task duration typically provided by developers and cuts these in half. The model of the project, then, is based upon the reduced estimates of duration. The component tolerances are estimated as a percentage (usually 50%) of the deterministic estimates of duration of their respective sequences of tasks. The project tolerance is estimated as a percentage of the deterministic estimate of the duration of the longest sequence of tasks. Goldratt's followers refer to this longest sequence as the critical chain.

What’s good about the cut and paste method? It is overwhelmingly simple to use, as it requires only grade-school arithmetic. This simplicity seems to be of paramount importance to Goldratt and many of his followers, since the method can be taught even to the highly uneducated among us.

What’s not so good about the cut and paste method? The cut and paste method provides a linear model of variation. Unfortunately, so far as sequences of tasks are concerned, variation does not add linearly – only variance adds linearly. Variation increases with the square root of the number of tasks in a sequence. Thus, the linear model provided by the cut and paste method is inconsistent with sound mathematics.
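The point about variance can be stated compactly. As a sketch, assume a sequence of n tasks whose durations X_i are statistically independent (an assumption that the text does not state explicitly), each with standard deviation sigma_i:

```latex
\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \sigma_i^{2},
\qquad
\sigma_{\text{sequence}} = \sqrt{\sum_{i=1}^{n} \sigma_i^{2}}
\;\overset{\sigma_i = \sigma}{=}\; \sigma\sqrt{n}
```

When every task has the same standard deviation, quadrupling the number of tasks only doubles the standard deviation of the sequence, whereas a tolerance fixed at a percentage of duration would quadruple.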

Worst of all, the cut and paste method appears to codify the very practice that for decades has plagued the developers of virtually every product development organization: managers reduce the estimates of duration provided by the developers. Thus, it destroys trust between developers and managers, rather than building trust. It alienates the very individuals whose behavior has a direct and significant impact on the logistical performance of a product development enterprise. This alone makes the cut and paste method undesirable.

A second approach is to use a control chart. By this approach, we simply graph normalized values of project duration on a control chart. The planned (baseline) duration estimates of the projects are used as the normalizing values. For example, a project that had a planned duration of 100 business days and an actual duration of 140 business days would be represented in the control chart with a normalized duration of 1.4. The difference between the control limit and the mean of the normalized duration values serves as the basis for calculating subsequent project tolerances.
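The arithmetic of the control chart method can be sketched in a few lines. The project history below is hypothetical, and two details are assumptions rather than anything stated above: the upper control limit is placed three sample standard deviations above the mean, and the resulting difference is scaled by the planned duration of the next project.

```python
import statistics

# Hypothetical project history: (planned, actual) durations in business days.
history = [(100, 140), (60, 75), (80, 132), (120, 150), (90, 99)]

# Normalize each actual duration by its planned (baseline) duration.
normalized = [actual / planned for planned, actual in history]

mean_ratio = statistics.mean(normalized)
# Assumption: upper control limit sits 3 sample standard deviations above the mean.
upper_limit = mean_ratio + 3 * statistics.stdev(normalized)

# Difference between the control limit and the mean, as a fraction of planned duration.
tolerance_fraction = upper_limit - mean_ratio


def control_chart_tolerance(planned_days):
    """Tolerance for a new project, scaled by its planned duration (an assumption)."""
    return planned_days * tolerance_fraction


print("normalized durations:", [round(r, 2) for r in normalized])
print(f"mean = {mean_ratio:.2f}, upper control limit = {upper_limit:.2f}")
print(f"tolerance for a 100-day plan: {control_chart_tolerance(100):.0f} business days")
```

Even with this mild, made-up history, the tolerance comes out as a large fraction of the planned duration.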

What’s good about the control chart method? It captures all the variation in project duration exhibited by an organization. Thus, the method gives us an accurate estimate of the required tolerance value.

What is unacceptable about the control chart method? Today, the resulting tolerance calculations would be impractically large and entirely unacceptable for all concerned. At this writing, finding even one product development organization that can be considered in a state of statistical control would be a very daunting task. The degree of variation in project duration exhibited by virtually every product development organization is unpredictable and astronomically large. Therefore, the control chart method, while sound and reliable, is simply impractical at this time. Perhaps it will be in use by the time that the two of you become interested in the subject of this book. I can only hope that the state of project management improves before then.

The third method for calculating tolerance values is called the Root-Square-Error (RSE) method. The RSE method is the same mathematically valid method that has been used by engineers for many decades, for the tolerance design of physical products. It is directly adaptable to the tolerance design of projects.

The RSE method is illustrated in the next figure. In support of the RSE method, each developer provides two estimates of duration per task. First the developer provides an estimate that corresponds to a high level of confidence. We call this the “safe” estimate. Then, the developer provides an estimate of the mean process time. We call this the “average” estimate. The difference, D, between safe and average estimates for each task gives us a measure of the expected variation for the task. The component tolerance is calculated as the square root of the sum of the squares of the differences, for the tasks in each component sequence. The same calculation also provides an estimate of the project tolerance, with the difference values being those that correspond to the tasks of the primary sequence in the project.



A sample calculation is provided in the next figure. For that example, the sum of the squared differences equals 758 (in units of business days squared). The corresponding project tolerance is the square root of that sum, or approximately 28 business days. Notice that the 28-day tolerance value provides a commitment duration that corresponds to a comfortably high confidence level for the entire project.
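The RSE calculation itself is short enough to sketch. The figure's task data is not reproduced here, so the task estimates below are hypothetical; only the sum of 758 is taken from the text.

```python
import math


def rse_tolerance(estimates):
    """Root-square-error tolerance for a sequence of tasks.

    estimates: list of (safe, average) duration estimates, one pair per task.
    """
    return math.sqrt(sum((safe - average) ** 2 for safe, average in estimates))


# Hypothetical (safe, average) estimates in business days for a four-task sequence.
tasks = [(20, 12), (15, 9), (30, 18), (10, 6)]

differences = [safe - average for safe, average in tasks]
print("differences D:", differences)
print("linear sum of D:", sum(differences))
print(f"RSE tolerance: {rse_tolerance(tasks):.1f} business days")

# The sum of squared differences quoted in the text, converted to a tolerance.
print(f"sqrt(758) = {math.sqrt(758):.1f} business days")
```

The linear sum of the differences is noticeably larger than the RSE tolerance, which is the practical difference between adding variation and adding variance.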



Project tolerances calculated with the RSE method should be considered the absolute minimum values, since the RSE method takes into account only task-level variation.

Further, since the amounts of variation in project duration experienced today by virtually all product development organizations are tremendous, the tolerance values calculated with the RSE method are as inappropriately small as the values calculated by any other practical method. Given this observation, it remains for us to choose the one method that gives us the greatest benefit with the least amount of harm. For me, this is the RSE method, because it gets developers involved in the process of constructing the models of our projects. Developers contribute the two estimates of duration for each task; their contributions enhance trust, rather than destroying trust.

[So that others also might subscribe to The Project Management Soap Box, please share the following link with friends and colleagues. Subscribe!]

Thursday, October 21, 2004

[15] Diamond-Shaped Networks




Frequently enough we encounter a logistical network with a diamond structure to it. This happens when a task in the primary sequence provides inputs to two successors, one of which is in the primary sequence and the second is in a component sequence. This is illustrated in the next figure, which shows the primary sequence in red and the component sequence in blue.



The diamond structure doesn’t appear to be a problem, until we calculate the component tolerance and insert it into our model of the network. When we do this, we end up with a significant gap in the primary sequence of the network. This is illustrated in the next figure.

The gap is created by two factors. First, the size of the component tolerance, which is based on the variation associated with the entire component sequence, is large. Second, the precedence dependency between the last task in the component sequence and its predecessor in the primary sequence prevents the component sequence from moving to the left. Consequently, when we include the component tolerances in our model of the project, we end up with what appears to be a most discomforting gap in the primary sequence.



The knee-jerk response of most project managers today is to force the gap to vanish. Unfortunately, the resulting model ignores completely the strong interaction between variation and the parallel structure. As such, the resulting model is overwhelmingly wrong; it grossly underestimates the duration of the project; and it misleads project managers and decision-makers into making commitments that cannot be met by the resources of the enterprise. But, the resulting picture is strikingly comforting for those who lack any understanding of variation, despite the fact that accuracy relative to duration is destroyed.



There is a solution, of course. Rather than eliminating the gap arbitrarily, we can move the earlier part of the component sequence to the left. Specifically, we move early the portion of the component sequence that precedes the problem dependency. By doing so, we uncouple most of the component sequence from the diamond-shaped feature of the network, and we diminish the magnitude of the interaction effect. This is shown in the next figure.



However, this tactic does not allow us to eliminate the gap entirely. If we did so, we too would be ignoring the interaction between variation and the parallel structure. Instead, our robust project design tactic lets us reduce the magnitude of the interaction, which we model with a smaller but finite gap. The smaller gap (shown in the next figure) provides a correction factor that at this time we can only estimate.



How should we estimate the magnitude of the correction factor? At this writing, the most practical way to estimate it is simply by calculating a component tolerance for the parallel segments that are involved directly in the diamond-shaped structure. This gives us smaller component tolerances, which in turn create a smaller gap. But we can do this only in cases where we can uncouple the earlier segment of the longer component sequence.


Tuesday, October 19, 2004

[14] Component Tolerances




It is clear from the previous chapter [Variation & The Parallel Structure] that the interaction between variation and the parallel structure of our project models is quite strong. The prediction error caused by ignoring this interaction is significant. We have only two courses of action available to us, if we want useful predictive models. First, we can simply estimate the magnitude of the interaction, thereby including its effect in our predictive models. Second, we can take measures to diminish the magnitude of the interaction effect.

To diminish the magnitude of the interaction effect, component sequences can and should be started earlier. Doing so greatly increases our level of confidence that the outputs of the component sequences will be available for the subsequent assembly task, rather than risking a delay of the assembly task. The degree to which we move early each component sequence is called the component tolerance. This approach is illustrated with our small project, in the next figure.



The first task in each path of the project is an entry task. These, necessarily, are scheduled. Component tolerances tell us how to schedule the start of each entry task. However, this method of scheduling entry tasks is useful if and only if we are working exclusively with a single-project system of resources, i.e. a team of resources that is fully dedicated to a single project and whose team members have no significant commitments to other projects. In a multiproject environment, this method is problematic, as it causes us to schedule the starts of entry tasks as late as possible (ALAP), taking into account the component tolerance. For reasons that are beyond the scope of this writing, the ALAP approach creates problems within a multiproject environment.

Finally, there are other situations where we cannot apply this simple technique directly, even when we are working with a single-project model. We discuss one such situation in the next chapter.


Sunday, October 17, 2004

[13] Variation & The Parallel Structure




We begin with a simple example of a project model with a parallel structure. This is shown in the next figure. For the sake of simplicity, every task in the model has an expected duration of 10 days and a standard deviation of 5 days. The primary (longest) sequence consists of 8 tasks, the last of which is an assembly task. The assembly task is shown in light yellow. The model also includes two component sequences, the outputs of which are required at the start of the assembly task. While this model is indeed very simple, it is sufficient to demonstrate the strength of the interaction between variation and concurrent sequences, which create the parallel structure.



The next figure shows the partial numerical results of a Monte Carlo analysis. Section (a) of the figure shows the results of the first seven tasks of the primary sequence alone. Notice that the mean duration is 70 days. The variation about that 70-day mean is significant. Sections (b) and (c) of the figure show the partial results for the two component sequences. Notice that each of the component sequences is modeled as starting on day 30 of the Monte Carlo simulation. Each of the component sequences has an average duration of 40 days and a degree of variation that rivals that of the longer sequence. The mean value for the duration, from the start of the project to the completion of the two component sequences, is also 70 days.

Now let’s ask the difficult question. The partial numerical results show that the mean duration from the start of the project to the completion of the 7 task sequence is 70 days. The partial results also show that the mean duration from the start of the project to the completion of the two component sequences is also 70 days. What might most people expect as the mean duration from the start of the project to the start of task no. 8, the assembly task?



Given the current, extremely widespread practice of selecting a commitment duration that matches what looks like the last scheduled day of work, we have to conclude that most executives, as well as the project managers whose models they rely on, would expect day 70 to coincide with the mean start time of the assembly task. Most executives and their project managers would be wrong.

To understand just how wrong, consider an even simpler project, which consists of just two tasks in parallel. The project is shown in the next figure. Each task is modeled with a Log-Normal distribution, with a mean duration of 30 days and a standard deviation of 14 days. The mean duration is indicated by the thick blue lines over the histograms. These coincide with the ends of the task bars.



For this even smaller project, the mean duration is certainly not 30 days, as the current widespread practice suggests. Since the two 30 day tasks are in parallel, the project isn’t finished until both tasks are finished. Consequently, the longer of the two tasks always determines the duration of the little project, and the parallel structure acts as a highest-only-pass filter. The result is a two-factor interaction between variation and the parallel structure of the model. The strength of this two-factor interaction is apparent in the next figure, which shows the histogram of project duration, in yellow.



The yellow histogram is the statistical equivalent of the two tasks in parallel. The mean duration indicated by the statistically equivalent representation is 38 days. This corresponds to the mean start time of the subsequent assembly task.
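Readers who want to reproduce the effect can do so with a small Monte Carlo simulation. The sketch below uses the parameters given above (log-normal tasks, mean of 30 days, standard deviation of 14 days). The trial count, the seed, the function names, and the assumption that the two tasks are statistically independent are mine.

```python
import math
import random
import statistics


def lognormal_params(mean, sd):
    """Return (mu, sigma) of the underlying normal distribution for a
    log-normal distribution with the given mean and standard deviation."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma_sq / 2.0, math.sqrt(sigma_sq)


def simulate_two_parallel(mean=30.0, sd=14.0, trials=100_000, seed=1):
    """Simulate a project of two independent tasks running in parallel."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    durations = []
    for _ in range(trials):
        task_a = rng.lognormvariate(mu, sigma)
        task_b = rng.lognormvariate(mu, sigma)
        # The project is finished only when the longer task is finished.
        durations.append(max(task_a, task_b))
    return durations


durations = simulate_two_parallel()
mean_duration = statistics.mean(durations)
share_within_30 = sum(d <= 30.0 for d in durations) / len(durations)

print(f"mean project duration: {mean_duration:.1f} days")
print(f"share of trials finished within 30 days: {share_within_30:.0%}")
```

The simulated mean lands several days above the 30-day mean of either task. The exact figure depends on the distribution and on the independence assumption.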

The interaction between variation and the parallel task structure appears to add 8 days to the mean duration of the little project. The mean start of the subsequent assembly task appears to be delayed by 8 days. Of course, in reality there is no delay. It only appears that there is a delay to us, because our expectations have been shaped by an incorrect model, a model that ignores the interaction effect entirely. The expectations of decision-makers are continually shaped by similarly incorrect models of their projects.

Now let’s look at a slightly more realistic model. This time we use a model that has 10 parallel tasks, all with a mean duration of 30 days, just like the tasks in the previous case. The results for the 10 task model are shown below. The upper histogram shows the mean duration and the degree of variation associated with each of the ten tasks. The yellow histogram shows the statistically equivalent representation of the entire project.

The mean duration for the entire project is 56 days, nearly twice the 30 day commitment duration that the current, widespread practice causes decision-makers to specify. In fact, the probability of having the entire project completed within a 30 day interval is less than 1%. Further, the mean duration for the entire project comes with a confidence level of less than 60%, entirely too low for any customer whose millions of dollars may be at risk. To achieve a comfortably high confidence level of, say, 95%, we would need to select a commitment duration of 83 days, nearly three times longer than the duration suggested by the current practice.



As a last step, take a look at how the strength of the interaction between variation and the parallel structure increases, as the number of parallel tasks increases. The next figure shows a parametric curve of the strength of the interaction as a function of the number of tasks in parallel. Notice that the greatest contribution to the mean duration takes place when going from a single task to two tasks in parallel. The increase in the mean duration is 8 days in this case. Although each additional parallel task contributes less than that initial step, the contributions are cumulative. They all add to the mean duration of the project.
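A curve of this kind can be approximated with a short simulation. This is a sketch under the same assumptions as the examples above (independent log-normal tasks, mean of 30 days, standard deviation of 14 days). The function names, trial count, and seed are illustrative choices, not part of the original analysis.

```python
import math
import random


def lognormal_params(mean, sd):
    """Return (mu, sigma) of the underlying normal distribution for a
    log-normal distribution with the given mean and standard deviation."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma_sq / 2.0, math.sqrt(sigma_sq)


def parallel_mean_curve(max_parallel=10, mean=30.0, sd=14.0, trials=20_000, seed=2):
    """Mean project duration for 1, 2, ..., max_parallel tasks in parallel."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    totals = [0.0] * max_parallel
    for _ in range(trials):
        longest = 0.0
        for k in range(max_parallel):
            # Adding one more parallel task can only lengthen the project.
            longest = max(longest, rng.lognormvariate(mu, sigma))
            totals[k] += longest  # duration with k + 1 tasks in parallel
    return [total / trials for total in totals]


curve = parallel_mean_curve()
for n_parallel, mean_duration in enumerate(curve, start=1):
    print(f"{n_parallel:2d} parallel tasks: mean duration {mean_duration:5.1f} days")
```

Each trial draws ten task durations and records the running maximum, so the printed means rise with every added task, with the largest single step occurring between one task and two.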



Unfortunately, even when the project managers of today might want to correct the models of their projects, by taking the significant effects of variation into account, they find it exceedingly difficult, because the tools available to project managers make absolutely no provision for this. At this writing, the most widely distributed project management tools provide no means of calculating and including a project tolerance in the models of projects. Consequently, today’s project managers and the executives to whom they report would see only the ten tasks in parallel. They would select a commitment duration of 30 days, given the boneheaded misrepresentation provided by today’s tools and today's widespread practice.

Now, let’s return to our original example. Recall that we have a primary sequence of 8 tasks and two component sequences in parallel with the primary sequence. We are striving to estimate the mean duration, from the start of the project to the start of the assembly task. Of course, we’re interested in the mean duration to the project’s completion as well.

The first histogram in the figure shows the true value of the mean duration to the start of the assembly task. Notice that the true mean is 80 days, not the 70 day interval that the deterministic models of today would have us believe. The mean duration to the project’s completion is 90 days.
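The original example can be simulated in the same way. The sketch below follows the description given earlier: seven primary tasks ahead of the assembly task, two component sequences that start on day 30 and average 40 days each, and every task with a mean of 10 days and a standard deviation of 5 days. Three things are assumptions on my part: that each component sequence consists of four tasks, that task durations are log-normal, and that they are statistically independent.

```python
import math
import random
import statistics


def lognormal_params(mean, sd):
    """Return (mu, sigma) of the underlying normal distribution for a
    log-normal distribution with the given mean and standard deviation."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma_sq / 2.0, math.sqrt(sigma_sq)


def simulate_project(trials=40_000, seed=7):
    """Return (assembly start times, project finish times) for each trial."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(10.0, 5.0)

    def sequence(n_tasks):
        return sum(rng.lognormvariate(mu, sigma) for _ in range(n_tasks))

    starts, finishes = [], []
    for _ in range(trials):
        primary = sequence(7)              # first seven tasks of the primary sequence
        component_a = 30.0 + sequence(4)   # assumed 4 tasks, starting on day 30
        component_b = 30.0 + sequence(4)
        # The assembly task cannot start until all three inputs are available.
        assembly_start = max(primary, component_a, component_b)
        starts.append(assembly_start)
        finishes.append(assembly_start + sequence(1))  # the assembly task itself
    return starts, finishes


starts, finishes = simulate_project()
mean_start = statistics.mean(starts)
mean_finish = statistics.mean(finishes)
within_80 = sum(f <= 80.0 for f in finishes) / len(finishes)

print(f"mean start of assembly task: {mean_start:.1f} days")
print(f"mean project completion:     {mean_finish:.1f} days")
print(f"share of trials completed within 80 days: {within_80:.0%}")
```

Under these assumptions the simulated means should fall close to the 80-day and 90-day values quoted above, though the exact numbers vary with the distribution chosen.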



Now let’s explore the implications of what we’ve just discussed. Specifically, let’s see the degree to which your organizations and your careers are exposed to risk, by your own deterministic models of projects. The next figure compares the commitment duration based on the deterministic model of our little project with the distribution of durations that the project can actually exhibit. Virtually all your peers would propose a commitment duration of only 80 days for this project. By doing so, they would expose their superiors, their customers, and their careers to an inordinate level of risk, as the lower portion of the figure shows clearly.

The 80 day commitment based on the deterministic model comes with an extremely low confidence level. The probability of completing this little project within an 80 day interval is less than 20%. The corresponding risk to customers and to careers is greater than 80%. In fact, the deterministic 80 day duration equals only the mean duration to the start of the assembly task. Further, the mean duration of the project comes with an unacceptably low confidence level of approximately 50%. By contrast, a committed duration that brings with it a comfortably high confidence level (from the perspective of the customer of the project) extends beyond 110 days.



Next, I would like you to consider the following. An error of this magnitude is created by a deterministic model of a very small project. This simple project has but two component sequences in parallel with the primary sequence of tasks. A real project doesn’t have just two or three parallel sequences. It has a dozen or more. Therefore, when we deal with real projects, the effects of variation are far more pronounced than this simple illustration suggests. The deterministic models of real projects are overwhelmingly wrong, all because they ignore the effects of variation. The risks to which project managers, executives, their businesses, and their customers are exposed are correspondingly large.

Clearly, the effects of variation must not be ignored. Instead, variation must be understood and managed, so that its adverse effects might be diminished. In the subsequent chapters I’ll show you techniques for limiting the adverse effects of variation.

In the meantime, please do me a favor. No, two favors! First, let me know that you're reading this. Since I started this blog I've received almost no feedback, despite the number of subscribers. Second, please help me to recruit more subscribers. One day I hope to publish this as a book. Feedback from interested readers would be most useful. Thank you!


Wednesday, October 13, 2004

[12] Project Tolerance

The mean duration of a sequence of tasks, often called the expected value of duration, is a valid and useful concept. However, the probability of observing exactly the mean of any continuous distribution is exactly zero. Therefore, we must be careful not to interpret the mean in this deterministic manner. Instead, it is more useful to interpret the mean value merely as a positional parameter for the corresponding distribution, an anchor point about which we can expect considerable variation.

Further, the confidence level associated with the mean value of duration is rarely acceptable in business settings. That confidence level can be as low as 50%. A more useful confidence level might be, say, 95%. The duration that corresponds to the greater confidence level becomes the promised or committed duration. Therefore, we need a new concept, one that helps us bridge the difference between the mean value of duration and the committed duration. I call this the project tolerance.
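As an illustration of the definition, the sketch below estimates a project tolerance for the 8-task sequence used in the earlier chapters (log-normal tasks, mean of 10 days, standard deviation of 5 days). The simulation approach, the function names, and the 95% target are illustrative assumptions. The tolerance is simply the committed duration minus the mean duration.

```python
import math
import random
import statistics


def lognormal_params(mean, sd):
    """Return (mu, sigma) of the underlying normal distribution for a
    log-normal distribution with the given mean and standard deviation."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma_sq / 2.0, math.sqrt(sigma_sq)


def simulate_sequence(n_tasks=8, mean=10.0, sd=5.0, trials=40_000, seed=3):
    """Return sorted simulated durations of a sequence of independent tasks."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    return sorted(
        sum(rng.lognormvariate(mu, sigma) for _ in range(n_tasks))
        for _ in range(trials)
    )


def duration_at_confidence(ordered, confidence):
    """Duration that the given fraction of simulated outcomes does not exceed."""
    index = min(len(ordered) - 1, int(confidence * len(ordered)))
    return ordered[index]


ordered = simulate_sequence()
mean_duration = statistics.mean(ordered)
committed_duration = duration_at_confidence(ordered, 0.95)
project_tolerance = committed_duration - mean_duration

print(f"mean duration:      {mean_duration:.1f} days")
print(f"committed duration: {committed_duration:.1f} days (95% confidence)")
print(f"project tolerance:  {project_tolerance:.1f} days")
```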



The project tolerance is a necessary component of any project’s design. It indicates our estimate of the variation in duration, to which the project is exposed. Some refer to this design feature as the project buffer. However, the word buffer brings with it a number of unfortunate connotations, particularly among high-level managers and executives. To these busy people, the word buffer smacks of padding and sandbagging. Frequently their instantaneous response, upon hearing the term buffer, is to mandate that such buffers be removed immediately from the designs of their projects. The inevitable result is that yet another grand lie is fabricated, since the resulting committed duration corresponds to the mean value and brings with it unacceptably high risk.

The term tolerance, however, is a technical term. For example, mechanical engineers use the concept of dimensional tolerancing when specifying the dimensions for components and products. Indeed, even engineers who have embraced the practice of robust product design continue to use the concept of dimensional tolerancing. They do so in an environment where variation in component dimensions at times is imperceptibly small. In the world of projects, where the degree of variation that we encounter is three to four orders of magnitude greater than that encountered in product design, the concept of project tolerance makes even more sense. I would go so far as to say that the project tolerance is an indispensable design component of every robust project model.

With the model of a simple sequence of tasks we have addressed the subject of variation, at least to the extent that variation affects such sequences. We have even taken a first important step toward managing that variation, by defining the concept of project tolerance. However, no real project ever consists of a single sequence of tasks. Real projects always include multiple parallel sequences of tasks. With the next chapter we explore how variation and the parallel structure of project models interact. The strength of their interaction will surprise you.

Monday, October 11, 2004

[11] A Truer Representation

One may wonder why so many executives make project commitments that are overwhelmingly optimistic and associated with exceptionally high risk. Consider the next figure, which shows only the 8-task sequence and the accompanying histogram. The histogram suggests that the common practice of making a commitment for what appears to be the last scheduled day of work, time T1, leads to unacceptably high risk. The probability of completing the 8-task sequence by time T1 is only 50%. Yet, this is precisely what nearly all project managers end up with today. Why does virtually every executive require this obviously wrong approach to estimating project duration?



The reason is provided by a brief scrutiny of our project management software tools. Except for the rarest of cases, every project plan that is created with today’s project management software tools is displayed as a stack of task bars, much like the task bars shown in the above figure, and the histogram never accompanies the display of task bars. Decision-makers see only the task bars with clearly defined starts and ends. Worse, our software tools go so far as to display dates, which appear to coincide unquestionably with the starts and ends of the task bars. Inevitably, this totally deterministic display of erroneous information fools decision-makers into thinking that they and their project managers are capable of causing events to take place on specific dates. Our models of projects, in other words, are consistently and significantly misleading the decision-makers. Hence, the widespread practice today is for decision-makers, project managers, and at times even customers, to participate in what is nothing short of a grand self-deception.

How should information be displayed instead? Consider the next figure. If any event related to a project can be specified at all, that event is the start of the project. This indeed is a deterministic event, as are the starts of all the various sequences of tasks that one finds in a real project. However, once a sequence of tasks is underway, all subsequent task-start events within the sequence and all subsequent task-finish events within the sequence are entirely unpredictable. An appropriate display for a sequence of tasks should not pretend to predict precise times for these events.

A far truer representation is provided by the next figure, which shows a clearly defined edge only at the left-hand end of the sequence. All the other task-start events and task-finish events associated with the sequence are omitted from the representation. Further, the color of the task bars is shown steadily and smoothly fading, from left to right, to indicate the increasing degree of uncertainty in our estimates of the downstream events.



The histogram, too, is depicted in a more appropriate manner. Rather than being displayed in solid colors, varying shades of red indicate the increasing risk associated with increasingly optimistic estimates of duration. The right-hand end of the histogram is shown with vanishingly small levels of red, indicating that this portion of the histogram and the corresponding estimates of duration can be associated with correspondingly low levels of risk.

If decision-makers were provided with this sort of representation of their projects, their tendency to favor optimistic, high-risk estimates of duration would be greatly diminished. After all, how many ambitious executives would expose their careers to unnecessary risk?

We see, therefore, that the sadly deficient state of project models today is driven by the grossly deficient tools in widespread use. These tools have been designed and developed by people who lack even a rudimentary understanding of variation. The grossly deficient tools are used regularly by project managers whose understanding of variation is equally lacking. Further, the reports created with the tools consistently mislead decision-makers. The erroneous reports regularly lull decision-makers into a false sense of optimism, which exposes them, their businesses, and their customers to inordinate levels of risk.

Friday, October 01, 2004

[10] Variation & Sequences of Tasks

To understand how variation changes as the number of tasks in a sequence increases, we begin with a computer model of a single task (section a of the figure). The one task is our entire project, initially. We model it as a log-normal distribution, with a mean duration of 10 days and a standard deviation of 5 days, values that are quite realistic for product development organizations today. The figure below shows how variation is affected by the number of tasks in a sequence.

Notice that as the number of tasks in the sequence increases from 1 to 2, 4, and 8, the degree of variation in the duration of the entire sequence increases dramatically. In fact, the degree of variation increases linearly with the square root of the number of tasks. Clearly, to be of any use, our models of projects must take into account variation. Were we to ignore the effects of variation, we would be omitting vastly important pieces of information from our predictive models. Yet, today the most popular project management tools virtually discourage project managers from even attempting to represent variation.
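The square-root relationship is easy to check numerically. The sketch below simulates sequences of 1, 2, 4, and 8 tasks with the parameters given above (log-normal, mean of 10 days, standard deviation of 5 days) and compares the simulated standard deviation with 5 times the square root of the number of tasks. Independence between tasks, the trial count, and the seed are my assumptions.

```python
import math
import random
import statistics


def lognormal_params(mean, sd):
    """Return (mu, sigma) of the underlying normal distribution for a
    log-normal distribution with the given mean and standard deviation."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma_sq / 2.0, math.sqrt(sigma_sq)


def sequence_stdev(n_tasks, mean=10.0, sd=5.0, trials=40_000, seed=10):
    """Standard deviation of the total duration of n_tasks tasks in sequence."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    totals = [
        sum(rng.lognormvariate(mu, sigma) for _ in range(n_tasks))
        for _ in range(trials)
    ]
    return statistics.stdev(totals)


results = {n: sequence_stdev(n) for n in (1, 2, 4, 8)}
for n_tasks, simulated in results.items():
    predicted = 5.0 * math.sqrt(n_tasks)
    print(f"{n_tasks} task(s): simulated sd {simulated:5.2f} days, "
          f"sqrt(n) prediction {predicted:5.2f} days")
```

The simulated values track the prediction closely: eight tasks carry roughly 2.8 times the standard deviation of one task, not 8 times.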

Now, let’s discuss interpretation. How should we interpret the sort of model shown in, say, Section (d) of the figure? Let’s begin by outlining the current, extremely widespread practice. Today, project managers and executives alike glance at the completely deterministic representations of their projects; they identify the so-called last scheduled day of work; and they make a commitment for that deterministic, wrong, and even boneheaded estimate of project duration. By allowing this practice, a project manager pretends that there is only a single value in that magical duration bucket, when, in fact, there is an infinity of values.

The histogram above the representation of eight tasks indicates that the actual duration of the sequence is entirely unpredictable over a very wide range of values. The operative word is unpredictable. This is the effect of variation. It makes it impossible for us to specify precise durations and to make precise commitments, without offering up bold-faced lies to executives and customers alike. At any time before a project is completed, we cannot possibly know the final duration of the project, no matter how emphatically we pretend that we can. So how should we interpret the 8-task model?

We should interpret the model, the histogram, and any desired or target value of duration in terms of our confidence that the sequence might be completed within the desired value of duration. For example, the histogram tells us that the probability of completing the entire sequence in 50 days or less is nearly zero. We know this, because nearly all of the histogram lies to the right of the 50-day mark. Consequently, the expectation of completing the sequence of tasks in 50 days or less comes with a near-zero level of confidence. Conversely, the expectation that the 8-task sequence can be completed within a duration of 110 days comes with an extremely high level of confidence. We know this, because nearly all of the histogram lies to the left of the 110-day mark.
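Reading a confidence level off a histogram amounts to counting the share of outcomes at or below the target duration. A minimal sketch, assuming the same 8-task model (independent log-normal tasks, mean of 10 days, standard deviation of 5 days) and using illustrative function names:

```python
import math
import random


def lognormal_params(mean, sd):
    """Return (mu, sigma) of the underlying normal distribution for a
    log-normal distribution with the given mean and standard deviation."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma_sq / 2.0, math.sqrt(sigma_sq)


def simulate_sequence(n_tasks=8, mean=10.0, sd=5.0, trials=40_000, seed=11):
    """Simulated total durations of a sequence of independent tasks."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    return [
        sum(rng.lognormvariate(mu, sigma) for _ in range(n_tasks))
        for _ in range(trials)
    ]


def confidence_level(samples, target_days):
    """Fraction of simulated outcomes completed within target_days."""
    return sum(duration <= target_days for duration in samples) / len(samples)


samples = simulate_sequence()
for target in (50, 80, 110):
    print(f"confidence of finishing within {target:3d} days: "
          f"{confidence_level(samples, target):.1%}")
```

The 50-day target comes out with a confidence close to zero and the 110-day target with a confidence close to one, which matches the reading of the histogram described above.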

We see, therefore, that we really cannot specify just one value of duration for the sequence of tasks. If it were possible for us to specify a single value of duration, we could display not a histogram but a vertical line at the corresponding value. However, within our very real universe, where variation abounds, this is simply not true to reality. Instead of specifying just a value of duration, which implies the complete absence of variation, we and our customers are far better served if we identify the level of confidence that we prefer to maintain. Then, we can identify and communicate the corresponding estimate of duration, which supports our confidence level. In other words, either we specify a desired duration value and a corresponding level of confidence, or we are lying to ourselves and to our customers.

Unfortunately, this is not the widespread practice at this time. Today everyone simply looks at what appears to be the last scheduled day of work, for the project of interest, and makes a commitment for that date. This completely deterministic and equally boneheaded approach ignores completely all forms of variation in task duration and in project duration. Further, the deterministic estimate is itself extremely optimistic in virtually all cases, since all the effects of variation are excluded from the model of the project. Thus, commitments consistently correspond with exceptionally low confidence levels. The risk to customers and to the enterprise is extraordinarily high.