Wednesday, November 25, 2009

Failure Is Not An Option

A recent article in SD Times(1) highlighted some alarming software project failures. One example among the 68% of "failures" was a laser-guided missile whose software was not designed for battery hot-swapping, with the interesting result that the coordinates were reset to the origin. Oops! This failure would give new meaning to your boss going ballistic on discovering the error! The Standish Group's 2009 "Chaos" study also counted among the 68% those projects that were completed late, completed over budget or simply never completed. Completion doesn't always mean success, of course. Only after spending $100 million did the FBI discover that the Virtual Case File system was "... not something that we want." How can this happen? Better to ask, how does this continue to happen? If you had Microsoft's money and experience, how could you fail to produce perfect software? Well, they did fail fantastically with Vista, which Scott Rosenberg explains was due to "conflicting ambitions and too few resource constraints" leading to an "organizational breakdown". Well, that is enormously helpful in explaining my near breakdown trying to tame User Account Control (UAC), which repeatedly insisted I was not to be trusted using my own computer.

Frustration as an end product is certainly more desirable than loss of life under an X-Ray machine or crashing an Airbus because the computer simply 'wouldn't let go of the controls'. I imagine the software developers felt they could stop the pilot and co-pilot from fighting over who did what if they just let the computer decide. Well, the computer couldn’t rise above the occasion and crashed the trees into the plane – according to its flighty calculations.

So far to go. "There are two ways to write error-free programs; only the third one works," said one expert(2). But we will continue to rise to the challenge of taming complex problems into elegant software solutions. The amazing examples David Worthington gives us should caution us to work with care as we combine our wide sweep of imaginative powers with mathematical precision to produce the best software we can.

(1) David Worthington, Software Development Times, November 1, 2009. www.sdtimes.com
(2) Alan J. Perlis (Yale), "Epigrams in Programming", ACM SIGPLAN Notices, September 1982.

4 comments:

Unknown said...

Robin, you make excellent points!

Huge budgets and long time schedules (e.g. Vista) don't guarantee success on a software project. Having life and death on the line (e.g. Airbus) doesn't either. Despite this, we will continue to rise to the occasion....

But I look forward to the day when we view software engineering more like we view commercial aircraft engineering.

If Boeing or Airbus created new aircraft the way most software projects are done, no one would fly--out of sheer terror! The plane would be on the runway being gassed up, before even defining all the bolts and screws, let alone ensuring they were all installed. We can catch that in Test, right?

And it is not a failure of the programmers. There is enormous pressure for coding to start before adequate definition can possibly be done. Little wonder then, that dismal 68% failure rate. Only a healthy dose of genius is keeping it from being worse still.

Robin Humberstone said...

Dan,

Appreciated your comments, especially as someone who has conscientiously worked in the field for a number of years now.

As you say, there is pressure to produce results fast. In software, the design is the product, whereas for other products the design constitutes only the first or an early stage. This, of course, makes the design phase so much more important in software development. Prototypes and mockups get us some way toward presenting a preview of a finished product, but an important challenge remains.

Unknown said...

This is a very nice article Robin. I look forward to the next one.

It seems to me that as the layers of complexity and abstraction increase while requirements and feature demands also increase, a void develops. This space is directly related to the opportunity for failure.

Technology that is used in multiple avenues may serve them all, but may not do so at an effective or efficient level. To avoid failure, it is often necessary to take a ground-up approach in which tools and features are developed to meet the needs of that specific situation, as un-"one-size-fits-all" as that may be...

Unknown said...

Robin,

Your mention of prototypes and mockups raises an important point.

My aircraft metaphor works (sometimes) because we have built aircraft before and know a great deal about what I'll call the "design-space." In those areas where the design-space is well known, I believe that software engineering can be greatly improved by adhering to standard engineering disciplines, as I alluded to.

However, often in software design the problem is unique...an area never explored before. Then, you simply can't sit down and write out a requirements document followed by a crisp specification document, then code to them. You need to use prototypes, you need to do mockups.

The traditional-engineering parallel would be that of the first airplane. You make a little wind-tunnel and do some tests. You make a model. And eventually you have to do like the Wright brothers did and make an airplane to see if you can make it fly!

No team on earth could at that time have created a specification for a Boeing 777.

We learn from what we've successfully done in the past.