Tuesday, November 12, 2013

One more IT horror story: HealthCare.gov


It’s hard to watch the train wreck going on at the HealthCare.gov web site and not be flabbergasted by how such a big, important project could have gone so wrong. Whatever our politics regarding the Affordable Care Act (and they’re not relevant to this discussion), we all have to wonder what happened, and what lessons the IT profession can draw from the spectacular and highly visible shortcomings of this effort.

Despite all the coverage devoted to the issues, we know much more about the symptoms of the problem than we do about the root cause. The heavy load of users that hit the site when it opened on October 1 apparently overwhelmed the system. The web site has been slow to the point of uselessness (we all know what that feels like), and users have been met with error messages or lockups even when just trying to register, let alone when going through the whole process of buying a health care policy.

From this we could speculate that capacity was not well estimated or built for. That is disconcerting, since it was hardly impossible to estimate the number of Americans who would need to use the site. Given that going without coverage would result in a Federal tax penalty (an assessment of up to 1% of your taxable income next year), the incentive to sign up was pretty clear. Businesses do a better job of estimating customer response with a lot less to go on than that.

HealthCare.gov also has software flaws, although their number and severity have not been disclosed. Users get error messages while using the site, and the insurance companies that connect to the system have said that the data they receive from the exchange arrives misformatted and useless.

It’s also easy to assume (and in this case justifiably so) that testing on several levels was woefully lacking. Volume testing and user testing could not have been done adequately.

This had to have been one of the most publicized and anticipated web site rollouts ever. Everyone was watching. In the highly charged political environment surrounding Obamacare, everyone knew that the opposition would pounce on the slightest glitch. The political fallout was sure to be major; when the president of the United States has to apologize to the nation for the failure of an IT project, the ramifications are off the charts. The team assigned to this project undoubtedly wanted to get it right and felt plenty of pressure to do so. Yet it all went wrong.

This is not the first IT project to go off the rails in spectacular fashion, of course. All IT pros should know some of the most famous failures in our profession, and should be wary of being a party to another. The Denver airport baggage handling fiasco was one such: a roughly ten-year effort and hundreds of millions of dollars down the drain before the automated system was scrapped. The root causes were an impossible design and a never-ending flow of scope creep. Or the Therac-25 story, grimly considered one of the worst technology disasters ever, in which software flaws, cryptic error messages, the removal of hardware safety interlocks, and the manufacturer's refusal to accept accountability resulted in a radiation therapy machine that delivered massive overdoses to at least six patients in the mid-1980s, killing or seriously injuring them.

I doubt we’ll ever uncover a single smoking gun in the HealthCare.gov case, though some may be tempted to search for one. To the many missteps that must have occurred, both technically and in terms of project management, I would add the inflexible nature of the deadline. With so many watching, no one in the administration was going to postpone the October 1 startup date, which left no room for extra testing or for rework as a result of test findings. This is a clear danger signal in any IT project.

So whether you’re feeling the pain of your IT colleagues as they race to get HealthCare.gov back on track, or you’re just savoring the schadenfreude, we in IT have another horror story to remember, and another lesson to learn from, if we can.
