2014-12-16

The 2038 problem

I was inspired - perhaps that's not quite the right word - by this article on the Year 2038 bug in the Daily Mail:

Will computers be wiped out on 19 January 2038? Outdated PC systems will not be able to cope with time and date, experts warn

Psy's Gangnam Style was recently viewed so many times on YouTube that the site had to upgrade the way figures are shown on the site.
  1. The site 'broke' because it runs on a 32-bit system, which uses four-bytes
  2. These systems can only handle a finite number of binary digits
  3. A four-byte format assumes time began on 1 January, 1970, at 12:00:00
  4. At 03:14:07 UTC on Tuesday, 19 January 2038, the maximum number of seconds that a 32-bit system can handle will have passed since this date
  5. This will cause computers to run negative numbers, and dates [sic]
  6. Anomaly could cause software to crash and computers to be wiped out
I've numbered the points for ease of reference. Let's explain to author Victoria Woollaston (Deputy Science and Technology editor) where she went wrong. The starting axiom is that 32 binary digits of information can represent 2^32 = 4,294,967,296 distinct numbers.

1. YouTube didn't (as far as I can see) "break".

Here's the original YouTube post on the event on Dec 1st:

We never thought a video would be watched in numbers greater than a 32-bit integer (=2,147,483,647 views), but that was before we met PSY. "Gangnam Style" has been viewed so many times we had to upgrade to a 64-bit integer (9,223,372,036,854,775,808)!
When they say "integer" they mean it in the correct mathematical sense: a whole number which may be negative, 0 or positive. Although 32 bits can represent 4bn+ numbers as noted above, if you need to represent negative numbers as well as positive ones then you need to reserve one of those bits for the sign (all readers about to comment about two's complement representation can save themselves the effort; the difference isn't material). That leaves you just over 2bn positive and 2bn negative numbers. It's a little bit surprising that they chose to use integers rather than unsigned (natural) numbers, as negative view counts don't make sense, but hey, whatever.
Presumably they saw Gangnam Style reach 2 billion views and decided to pre-emptively upgrade their views field from signed 32-bit to signed 64-bit. This is likely not a trivial change: if you're using a regular database, you'd do it via a schema change that requires reprocessing the entire database, and I'd guess that YouTube's database is quite big. Still, the change seemed to be in place by the time we hit the signed 32-bit integer limit.
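
As a rough illustration of the ranges involved, here's a minimal Python sketch. The helper name `add_view_int32` is made up for this example, and `ctypes.c_int32` is only standing in for a signed 32-bit field; I have no idea how YouTube actually stores its counts.

```python
import ctypes

INT32_MAX = 2**31 - 1    # 2,147,483,647: largest signed 32-bit value
UINT32_MAX = 2**32 - 1   # 4,294,967,295: largest unsigned 32-bit value
INT64_MAX = 2**63 - 1    # largest signed 64-bit value

def add_view_int32(views):
    """Hypothetical counter bump stored in a signed 32-bit field.

    ctypes.c_int32 keeps only the low 32 bits, so the value wraps
    around instead of raising an error.
    """
    return ctypes.c_int32(views + 1).value

before = INT32_MAX
after = add_view_int32(before)   # wraps to the most negative 32-bit value
print(before, "->", after)
```

A real database may well reject the write rather than wrap silently, so treat the wraparound as the worst case rather than the expected one.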

2. All systems can only handle a finite number of binary digits.

For fuck's sake. We don't have infinite storage anywhere in the world. The problem is that the finite number of binary digits (32) in a 4-byte representation is too small. An 8-byte representation has twice the number of binary digits (64, which is still finite) and so can represent many more numbers.

3. The number of bytes has no relationship to the information it represents.

Unix computers (Linux, BSD, OS X etc.) represent time as seconds since the epoch. The epoch is defined as 00:00:00 Coordinated Universal Time (UTC - for most purposes, the same as GMT), Thursday, 1 January 1970. The Unix standard was to count those seconds in a 32-bit signed integer. Now it's clear that 03:14:08 UTC on 19 January 2038 will see that number of seconds exceed what can be stored in a 32-bit signed integer, and the counter will wrap around to a negative number. What happens then is anyone's guess and very application-dependent, but it's probably not good.
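
To see where those dates come from, here's a small Python sketch of the arithmetic. The function name `from_time32` is my own invention, and it adds a `timedelta` to the epoch rather than calling the operating system, so it behaves the same on any platform.

```python
import ctypes
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1

def from_time32(seconds):
    """Interpret a signed 32-bit seconds-since-epoch count as a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)

# The last second that fits in a signed 32-bit counter.
last_ok = from_time32(INT32_MAX)

# One second later the stored value wraps to the most negative number.
wrapped_raw = ctypes.c_int32(INT32_MAX + 1).value
wrapped = from_time32(wrapped_raw)

print(last_ok)   # 2038-01-19 03:14:07+00:00
print(wrapped)   # 1901-12-13 20:45:52+00:00
```

So a program that trusts the wrapped counter believes it has jumped from 2038 back to December 1901.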
There is a move towards 64-bit computing in the Unix world, which will include migration of these time representations to 64-bit. Because this move is happening now, we have 23 years to complete it before we reach our Armageddon date. I don't expect there to be many 32-bit systems left operating by then - their memory will be rotted, their disk drives stuck. Only emulated systems will still be working, and everyone knows about the 2038 problem.
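
For a sense of how much headroom 64 bits buys, here's a back-of-the-envelope Python sketch. It assumes a year of 365.25 days, which is an approximation, and `years_of_range` is just a name I've picked for the example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # approximate year length in seconds

def years_of_range(bits):
    """Approximate years of seconds a signed counter of `bits` bits can count up to."""
    return (2 ** (bits - 1) - 1) / SECONDS_PER_YEAR

years_32 = years_of_range(32)   # roughly 68 years, hence 1970 -> 2038
years_64 = years_of_range(64)   # roughly 292 billion years

print(f"32-bit: {years_32:.1f} years")
print(f"64-bit: {years_64:.3g} years")
```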

4. Basically correct, if grammatically poor

5. Who taught you English, headline writer?

As noted above, what will actually happen on the date in question is heavily dependent on how each program using the information behaves. The most likely result is a crash of some form, but you might see corruption of data before that happens. It won't be good. Luckily it's easy to test programs by just advancing the clock forwards and seeing what happens when the time ticks over. Don't try this on a live system, however.
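
If you'd rather not touch a real clock at all, you can feed a fake one to the code under test. This is only a sketch: `elapsed32` is a hypothetical stand-in for whatever your program does with timestamps, and `to_time32` mimics storage in a signed 32-bit field.

```python
import ctypes

ROLLOVER = 2**31 - 1   # last second a signed 32-bit counter can hold

def to_time32(seconds):
    """Mimic storing a timestamp in a signed 32-bit field (wraps on overflow)."""
    return ctypes.c_int32(seconds).value

def elapsed32(start, now):
    """Hypothetical code under test: elapsed seconds from 32-bit timestamps."""
    return to_time32(now) - to_time32(start)

# Step a fake clock across the rollover instead of changing the system clock.
start = ROLLOVER - 2
results = [(start + tick, elapsed32(start, start + tick)) for tick in range(5)]

for now, elapsed in results:
    flag = "" if elapsed >= 0 else "  <-- negative elapsed time"
    print(now, elapsed, flag)
```

The first three ticks give sensible answers; once the fake clock passes the rollover, the elapsed time goes hugely negative, which is the kind of thing you want to find out before 2038.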

6. Software crash, sure. Computer being "wiped out"? Unlikely

I can see certain circumstances where a negative date could cause a hard drive to be wiped, but I'd expect it to be more common for hard drives to be filled up. If a janitor process is cleaning up old files, it'll look for files with a modification time below a certain value (say, all files more than 5 minutes old). Files created before the positive-to-negative date point won't be cleaned up by janitors running after that point, because their positive timestamps are greater than the janitor's negative cut-off. So we leave those stale files lying around, but files created after that will still be eligible for clean-up - they have a negative time which is less than the janitor's negative measurement point.
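
Here's the janitor scenario as a Python sketch. Everything in it is hypothetical: the `stale_files` function, the file names and the 5-minute age limit are made up, and the janitor is assumed to run 1000 seconds after the rollover so that its cut-off doesn't itself underflow.

```python
import ctypes

WRAP = 2**31   # first second that no longer fits in a signed 32-bit counter

def to_time32(seconds):
    """Mimic a timestamp stored in a signed 32-bit field (wraps on overflow)."""
    return ctypes.c_int32(seconds).value

def stale_files(mtimes, now, max_age=300):
    """Hypothetical janitor: names whose 32-bit mtime is below the cut-off."""
    cutoff = to_time32(now) - max_age
    return [name for name, mtime in mtimes.items() if to_time32(mtime) < cutoff]

# Real (unwrapped) modification times, in seconds since the epoch.
mtimes = {
    "before_wrap.tmp": WRAP - 10_000,      # hours old, positive timestamp
    "after_wrap_old.tmp": WRAP + 100,      # 15 minutes old, negative timestamp
    "after_wrap_fresh.tmp": WRAP + 900,    # under 2 minutes old
}

to_delete = stale_files(mtimes, now=WRAP + 1000)
print(to_delete)   # ['after_wrap_old.tmp']
```

The oldest file survives because its positive timestamp looks like it comes from the far future, while the files written after the rollover are handled normally.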

I'm sure there will be date-related breakage as we approach 2038 - if a bank system manages 10-year bonds, then it will break as soon as a bond's expiry time goes past January 2038, so the bank will see breakage in 2028. But hey, companies are already selling 50-year bonds so bank systems have had to deal with this problem already.
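
The bond arithmetic can be sketched in a few lines of Python. The function `maturity_fits_time32` and the issue dates are made up for illustration, and the sketch ignores details like 29 February issue dates.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1   # 03:14:07 UTC on 19 January 2038

def maturity_fits_time32(issued, term_years):
    """Does the bond's maturity date fit in a signed 32-bit seconds counter?"""
    maturity = issued.replace(year=issued.year + term_years)
    return int(maturity.timestamp()) <= INT32_MAX

# A 10-year bond issued the day before the cut-off anniversary is fine...
ok_2028 = maturity_fits_time32(datetime(2028, 1, 18, tzinfo=timezone.utc), 10)
# ...one issued two days later is not...
bad_2028 = maturity_fits_time32(datetime(2028, 1, 20, tzinfo=timezone.utc), 10)
# ...and a 50-year bond issued today is already past the limit.
bad_2014 = maturity_fits_time32(datetime(2014, 12, 16, tzinfo=timezone.utc), 50)

print(ok_2028, bad_2028, bad_2014)   # True False False
```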

Thank goodness that I can rely on the Daily Mail journalists' expertise in all the articles that I don't actually know anything about.
