November 29, 2009

ClimateGate Software. Garbage In, Garbage Out, and a Garbage Disposal in the Middle

Illustration: Shannon Love

Shannon Love at Chicago Boyz has written a superlative examination of just how bad "the most important computer programs in the world" at Hadley CRU actually are in "Scientists Are Not Software Engineers." While the human elements of deception, deletion, and good old-fashioned "dumbth" are on display in the hacked emails, Love points out:

No, the real shocking revelation lies in the computer code and data that were dumped along with the emails. Arguably, these are the most important computer programs in the world. These programs generate the data that is used to create the climate models which purport to show an inevitable catastrophic warming caused by human activity. It is on the basis of these programs that we are supposed to massively reengineer the entire planetary economy and technology base.

The dumped files revealed that those critical programs are complete and utter train wrecks.

Love does not believe that these programs are bad out of malice but out of sheer and escalating incompetence when it comes to software design, along with a factor that creates a mindset NASA has termed "the normalization of deviancy/risk":
CRU isn't that unusual. Few scientific teams use any kind of formal software-project management. Why? Well, most people doing scientific programming are not educated as programmers. They're usually just scientists who taught themselves programming. Moreover, most custom-written scientific software, no matter how large, doesn't begin as a big, planned project. Instead the software evolves from some small, simple programs written to handle relatively trivial tasks. After a small program proves useful, the scientist finds another related processing task so instead of rewriting everything from scratch, he bolts that new task onto an existing program. Then he does it again and again and again....

Most people who use spreadsheets a lot have seen this process firsthand. You start with a simple one sheet spreadsheet with a few dozen cells. It's small, quick and useful but then you discover another calculation that requires the data in the sheet so you add that new calculation and any necessary data to the initial sheet. Then you add another and another. Over the years, you end up with a gargantuan monster. It's not uncommon for systems analysts brought in to overhaul a company's information technology to find that some critical node in the system is a gigantic, byzantine spreadsheet that only one person knows how to use and which started life as a now long-dead manager's to-do list.

Of course, along with the general ineptitude catalogued by Love, there can also be the all too human factors of "keeping one's job" and "feathering one's nest."

I encountered this sort of homespun software garbage when I took over management of the Cosmodemonic PornGiant site for the House of Pent in 2000. At that time the site was generating something on the order of $25 million per year in revenue and almost all of the House of Pent's cash flow. It was the cash cow that cut the payroll checks and kept the ever-ravenous creditors from the door.

Cosmodemonic PornGiant was the single most critical element of the business and all the money flowed out of the computers and through a program in a computer locked in a small back office with special locks. The program was written and run by one very irritable "employee" who spent a lot of time locked in that room, lifting weights, and sending out emails stating what the income from the day before had actually been -- according to "his computer." Since he'd written the "payments" program himself he was as close to invulnerable an employee as there was. In addition you were never quite sure if all the funds entering the program he'd written were coming out the other side or if some were escaping via a rogue data link elsewhere. But because the flow of the funds was critical nobody ever wanted to look at that too hard. The fear was that the flow could presumably be made to flow elsewhere with a keystroke and disappear until it could be found and directed to the company coffers once again.

This employee knew that he held the whip hand in the company and developed, over time, an attitude. Not a good situation for his boss who happened to be, well, me. Since I was the one in the organization chart charged with keeping the company's single oxygen line flowing I took this job seriously lest my paycheck and that of 200+ other families simply evaporate into cyberspace.

Ultimately, after a byzantine bit of internal corporate espionage involving giving him and his wife a "bonus" all-expense-paid vacation in Bermuda, I was able to gain control of this man's "Black Box" programs and soon after, following a rage-filled encounter, he left the company "to spend more time with his family."

The programs, when my outside programmers began to understand them and poke into their innards, were much like Love's programs that involved "a gigantic, byzantine spreadsheet that only one person knows how to use and which started life as a now long-dead manager's to-do list." Only these programs could dial into ours and who knows who else's bank accounts with deposits. I'm not saying that any money that belonged to the company went to any other account. I'm saying something scarier. I'm saying that the company wouldn't have known if it did.

Which is a bit like what seems to have been going on at Hadley CRU. It might be right. It might be wrong. It might be so wrong it isn't even wrong. It might be so right that the globe could ignite at dawn.

All we know is that we do not know and that now, given that all the original data for the Hadley CRU has been "lost," we don't even know what we don't know.

As Eric Raymond noted this evening at Armed and Dangerous in "Facts to fit the theory? Actually, no facts at all!":

It just keeps getting better and better. Now we learn that the CRU has admitted to throwing away the primary data on which their climate models were based. I quote: "We do not hold the original raw data but only the value-added (quality controlled and homogenised) data."

This means that even the CRU itself has no idea how accidentally corrupt or fraudulently altered its data might be. And the IPCC reports used the CRU's temperature reconstructions as a gold standard. So did other climatologists all over the world. And now they can't be verified! Without a chain of provenance tying them back to actual measurements, every single figure and trendline in the CRU reconstructions might as well be PDOOMA, a fine old engineering acronym expanding to "Pulled Directly Out Of My Ass".

It's outrageous and immoral. It's incompetent and malicious at the same time. Heads should roll but heads won't roll.

For a clusterfuck this size in this world of insane experts and electable conmen, there's only one thing that's fit to give the aging hippie fuckups of Hadley CRU: The Nobel Prize. Al Gore move over. Always room for more bozos on your bus.

Posted by Vanderleun at November 29, 2009 4:33 PM
With the revelation that these files were sent to the BBC a month ago, but the Beeb sat on them, it is impossible not to wonder who else got them and sat on them.

If I wanted to get this travesty exposed, I would send it to multiple potential outlets, and let each know that they must act quickly or be scooped.

If that was done here, then there must either have been some collusion to suppress the information, or each outlet would assume that all the others would sit tight. Bad situation either way, isn't it?

There must be absolutely horrifying things that the media hides from us while they report on the fascinating exploits and insights of Sarah Palin's almost-son-in-law.

Posted by: sherlock at November 29, 2009 5:42 PM

More Bozos on this bus?

More like:

Dogs flew spaceships! The South won the Civil War! Our forefathers took drugs! That's right-

Everything You Know is Wrong!

Thanks for the funny essay, Mr Van der Luen.

Posted by: David at November 29, 2009 6:03 PM

The description of how the software developed brings this image to mind.

Posted by: Julie at November 29, 2009 8:29 PM

Been there, done that.

Until about 1980 code and data storage at any institution was probably an utter mess.

The staff usually could not have reproduced what they did five years before if threatened with castration.

By 1990 nearly all professionals had a good grasp of the principles of data and program retention. But not too many places really ran a clean house. You were rewarded for new, useful stuff. The bonuses weren't paid for having massive and neat data vaults and good records. And budgeting big money for maintaining them didn't produce smiley faces.

By 2000 I was further from the scene. Still, I know matters had improved a lot simply because staff better understood the need for procedures. Also, storage media was costing less each year.

But grad students and scientists are not trained or rewarded for information retention. They are task oriented rather than institution oriented.

Once their task is done and accepted they proceed to other matters. Precisely what they did and how is no longer of top importance.

Thus we have the present mess. I am rather sure that the rascals at CRU are utterly confused and don't, themselves, know how to recreate their work for the last two or three decades.

Posted by: K at November 29, 2009 11:58 PM

How's this for scary bozos?

WUWT has been reporting that Google searches for Climategate fail to bring it up in the suggestion box as you type the word.

After Anthony initially reported this - despite the gazillions of entries made for Climategate - it suddenly started to appear - then hours later disappeared.

Last I checked - it's still gone - nowhere in the suggestion list - and the word that comes up is Climateguard.

Bing has Climategate as its first suggestion.

Interesting times.

Posted by: Cathy at November 30, 2009 7:32 AM

CRU should change its name to the Global Information-Gathering Organization, or GIGO.

Posted by: Jim Treacher at November 30, 2009 7:45 AM

That search suggestion box observation is interesting. Let's let it run for a day or so and see what happens.

Posted by: vanderleun at November 30, 2009 7:53 AM

Climategate=Fudd's First Law of Opposition
If you push something hard enough, it will fall over.

Posted by: David C McKinnis at November 30, 2009 8:16 AM

Sherlock -- "With the revelation that these files were sent to the BBC a month ago, but the Beeb sat on them, it is impossible not to wonder who else got them and sat on them."

I think you're missing the point here: these were specific CRU emails, pertaining to Hudson's heretical, by BBC standards, blog post on "Whatever Happened To Global Warming" on Oct. 9.

Suffice it to say these CRU missives were not the type anyone would want to see published, as evidenced by the comments of the US's very own Michael Mann:

"extremely disappointing to see something like this appear on BBC. its particularly odd, since climate is usually Richard Black's beat at BBC (and he does a great job). from what I can tell, this guy was formerly a weather person at the Met Office. We may do something about this on RealClimate, but meanwhile it might be appropriate for the Met Office to have a say about this, I might ask Richard Black what's up here?"

Translation: Watch out for your job, Mr. Hudson. (Predictably, the BBC has now silenced Hudson.)

Was this a warning shot across the bow? Perhaps, as regards the BBC, the most prominent standard bearer for AGW hysteria. For other sources? I doubt it.

It does, however, bring up the question of who exactly released these files, since CRU email comments regarding Hudson's heresy end on Oct. 14, but the most current email of the "FOIA.ZIP" is dated Nov. 12, 2009.

Posted by: JBean at November 30, 2009 10:44 AM

There once was an old Dilbert strip about this very topic: Dilbert and Wally are eating lunch with a co-worker who tells them that he wrote the company's payroll program, 100,000 lines of undocumented spaghetti code. Dilbert exclaims that it is the holy grail of technology, and the coworker smiles and says that they might find some extra in their next paychecks.

Posted by: Ilkka Kokkarinen at November 30, 2009 12:32 PM

CRU should change its name to the Climate Research Unit for Disinformation. CRUD.

Posted by: reliapundit at November 30, 2009 8:09 PM

The benefit of having project management software is that:It efficiently manages your projects,Tracks and logs your time, Generates custom reports and much more, Improves collaboration amongst the team.

Posted by: Online Project Management Software at December 1, 2009 2:32 AM

[I'm sorry try to type faster. It's all grist to me.]

Posted by: Vanderleun at December 2, 2009 9:45 AM