The joys of hard drive death

The IRS (US tax service) ex-head Lois Lerner has been under the spotlight in the past year about the IRS allegedly targeting organisations for audit based on their political allegiances. Apparently Tea Party related organisations were much more likely to be targeted than left-leaning organisation. Lerner retired from the agency in September last year, but the Republican party has unsurprisingly been chasing her. Lerner took the 5th at a hearing in March, refusing to testify to avoid the risk of incriminating herself, so the investigators have been looking for other sources of information.

Of course, most communication these days is done by email, and the IRS is no exception. The obvious place to start in finding the details of Lerner's involvement - if any - would be to trawl her email. Except that this appears to be difficult:

Today, Ways and Means Committee Chairman Dave Camp (R-MI) issued the following statement regarding the Internal Revenue Service informing the Committee that they have lost Lois Lerner emails from a period of January 2009 – April 2011. Due to a supposed computer crash, the agency only has Lerner emails to and from other IRS employees during this time frame.
Oopsie. Still, these things happen occasionally. It's just bad luck, right?

The IRS has 89,500 employees. It's not unreasonable to estimate that every one of them has an email account, and most of them have a computer. Say they have 70,000 personal computers on their network. Every computer has at least one hard drive. A hard drive's average life is 2-3 years; let's say 1000 days. On average, if you have 1000 hard drives, one will be failing each day. In the case of the IRS we'd expect to see 70 hard drives a day, nearly 500 per week, failing. Hard drives failing are a completely normal part of IRS IT operations.

Given that, you put together an IT system that lets your executives lose all their emails whenever their personal computer hard drive crashes? This seems... not the approach one would normally take.

What I find interesting is an IRS note from 1998 announcing that they were standardising on Exchange:

The new e-mail package will use Microsoft Exchange Server Version 5.5 along with the Microsoft Outlook 98 desktop product. The IRS will switch over to the new system during the next 12 months
I'm assuming that by now they've done several migrations to more modern versions of Exchange. By 2009 they should have been on Exchange 2003 at least, maybe 2007. A user's emails would be in folders on replicated central storage, not just on a personal machine; the Outlook client would copy mails from the central storage to the local computer for speed and ease of access, but they would remain in the central storage precisely because personal computers fail all the time. Suppose the power supply exploded, or the motherboard shorted, or coffee spilled into the CD-ROM drive slot, or the user has to get email access out of office hours (e.g. via Outlook Web Access) - there has to be a way to get to their data when the PC is not available. The replicated storage copies the data to several physically separate machines, using a scheme such as RAID which lets you trade off the number of copies of data, read performance and write performance.

What I would believe, and I should make it clear that this is pure speculation, is that someone was deleting old emails off the replicated storage for some purpose; perhaps for perfectly legitimate purposes. They ended up deleting much more than they expected. Once this was discovered, they tried to recover the data from the daily / weekly tape backups that were almost certainly being made from the central storage. When they did this, they discovered that for the past 1-2 years the backup data being written wasn't being written correctly - taken from the wrong source, missing indexes, taken from a source that was being updated as it was being read, whatever. This was so embarrassing given the amount of money that they were spending on their storage and backups that they cooked up a story about a hard drive failing and hoped no-one would ask any inconvenient questions. Bad luck, boys!

If the details of IRS's excuse haven't been mis-reported - a possibility we should not reject out of hand - then either they have a painfully badly assembled and operated IT system, or someone is telling pork pies.

No comments:

Post a Comment

All comments are subject to retrospective moderation. I will only reject spam, gratuitous abuse, and wilful stupidity.