On Friday there was a massive Distributed Denial of Service attack on DynDNS, who provide Domain Name services to a number of major companies including Twitter, Spotify and SoundCloud, effectively knocking those sites offline for a significant fraction of the global population. Brian Krebs provides a useful summary of the attack; he is unusually well versed in these matters because his website "Krebs on Security" was taken offline on 20th September after a massive Internet-of-Things-sourced DDoS against it. It seems that Krebs' ongoing coverage and analysis of DDoS with a focus on the Internet of Things (IoT) - "smart" Internet connected home devices such as babycams and security monitors - raised the ire of those using the IoT for their nefarious purposes. It proved necessary to stick Krebs' blog behind Google's Project Shield which protects major targets of information suppression behind something resembling +5 enchanted DDoS armour.
Where did this threat to the Internet come from? Should we be worried? What can we do? And why is this whole situation a Tragedy of the Commons?
Primer on DNS
Let's look at Friday's outage first. Dyn DNS is a DNS hosting company. They provide an easy way for companies who want a worldwide web presence to distribute information about the addresses of their servers - in pre-Internet terms, they're like a business phone directory. Your company Cat Grooming Inc., which has bought the domain name catgrooming.com, has set up its web servers on Internet addresses 1.2.3.4 and 1.2.3.5, and its mail server on 1.2.4.1. Somehow, when someone types "catgrooming.com" in their internet browser, they need that translating to the right numerical Internet address. For that translation, their browser consults the local Domain Name Service (DNS) server, which might be from their local ISP, or a public one like Google's Public DNS (8.8.4.4 and 8.8.8.8).
So if Cat Grooming wants to change the Internet address of their webservers, they either have to tell every single DNS server of the new address (impractical), or run a special service that every DNS server consults to discover up to date information for the hostnames. Running a dedicated service is expensive, so many companies use a third party to run this dedicated service. Dyn DNS is one such company: you tell them whenever you make an address change, and they update their records, and your domain's information says that Dyn DNS does its address resolution.
To check whether a hostname on the web uses DynDNS, you can use the "dig" command which should work from the Linux, MacOS or FreeBSD command line:
$ dig +short -t NS twitter.com
ns3.p34.dynect.net.
ns2.p34.dynect.net.
ns1.p34.dynect.net.
ns4.p34.dynect.net.

This shows that twitter.com is using Dyn DNS because it has dynect.net hostnames as its name servers.
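If you want to run this check over a list of sites, the comparison is easy to script. Here is a minimal Python sketch that assumes you already have the NS hostnames in hand (for example, pasted from dig); the function name uses_dns_provider is made up for illustration and does no network lookups itself:

```python
def uses_dns_provider(ns_records, provider_domain):
    """Return True if any NS hostname sits under provider_domain."""
    provider = provider_domain.strip().lower().rstrip(".")
    suffix = "." + provider
    for name in ns_records:
        # dig prints fully qualified names with a trailing dot; strip it.
        host = name.strip().lower().rstrip(".")
        if host == provider or host.endswith(suffix):
            return True
    return False


# NS records copied from the dig output for twitter.com.
twitter_ns = [
    "ns3.p34.dynect.net.",
    "ns2.p34.dynect.net.",
    "ns1.p34.dynect.net.",
    "ns4.p34.dynect.net.",
]
print(uses_dns_provider(twitter_ns, "dynect.net"))
```

Matching on ".dynect.net" rather than the bare string avoids false positives from unrelated domains that merely end in the same letters.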
Your browser doesn't query Dyn DNS for every twitter.com URL you type. Each result you get back from DNS comes with a "time to live" (TTL) which specifies for how many seconds the answer is valid. If your twitter.com query came back as 199.59.150.7 with a TTL of 3600 then your browser would use that address for the next hour without bothering to check Dyn DNS. Only after 1 hour (3600 seconds) would it re-check Dyn DNS for an update.
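The caching behaviour can be sketched as a toy model in Python. This is not how any particular browser or resolver is implemented; the class name TtlCache, the fake_authoritative function and the injectable clock are all assumptions made so the example runs without touching the network:

```python
import time


class TtlCache:
    """Toy model of a resolver cache that honours DNS TTLs."""

    def __init__(self, lookup, clock=time.monotonic):
        self._lookup = lookup    # function: hostname -> (address, ttl_seconds)
        self._clock = clock      # injectable so the example can fake time
        self._entries = {}       # hostname -> (address, expiry_time)

    def resolve(self, hostname):
        now = self._clock()
        cached = self._entries.get(hostname)
        if cached is not None and now < cached[1]:
            return cached[0]     # still valid: no upstream query needed
        address, ttl = self._lookup(hostname)
        self._entries[hostname] = (address, now + ttl)
        return address


lookups = []      # records every query that reaches the "name server"
fake_now = [0.0]  # fake clock, in seconds


def fake_authoritative(hostname):
    # Stand-in for the real name server, using the figures from the text.
    lookups.append(hostname)
    return ("199.59.150.7", 3600)


cache = TtlCache(fake_authoritative, clock=lambda: fake_now[0])
cache.resolve("twitter.com")  # cache miss: queries the name server
fake_now[0] = 1800.0
cache.resolve("twitter.com")  # half an hour later: served from cache
fake_now[0] = 3600.0
cache.resolve("twitter.com")  # TTL expired: queries the name server again
print(len(lookups))
```

Only two of the three resolve calls reach the name server, which is the whole point of the TTL: it trades freshness for fewer queries.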
Attack mechanism
The Internet of Things includes devices such as "babycams" which enable neurotic parents to keep an eye on their child's activities from elsewhere in the house, or even from the restaurant to which they have sneaked out for a couple of hours of eating that does not involve thrown or barfed food. The easiest way to make these devices accessible from the public Internet is to give them their own Internet address, so you can enter that address on a mobile phone or whatever and connect to the device. Of course, the device will challenge any new connection attempt for a username and password; however, many devices have extremely stupid default passwords and most users won't bother to change them.
Over the past decade, Internet criminals have become very good at scanning large swathes of the Internet to find devices with certain characteristics - unpatched Windows 2000 machines, webcams, SQL servers etc. That lets them find candidate IoT devices on which they can focus automated break-in attempts. If you can get past the password protection for these devices, you can generally make them do anything you want. The typical approach is to add code that makes them periodically query a central command-and-control server for instructions; those instructions might be "hit this service with queries randomly selected from this list, at a rate of one query every 1-2 seconds, for the next 4 hours."
The real problem with this kind of attack is that it's very hard to fix. You have to change each individual device to block out the attackers - there's generally no way to force a reset of passwords to all devices from a given manufacturer. The manufacturer has no real incentive to do this since it has the customer's money already and isn't obviously legally liable for the behavior. The owner has no real incentive to do this because this device compromise doesn't normally materially affect the device operation. You can try to sell the benefits of a password fix - "random strangers on the internet can see your baby!" but even then the technical steps to fix a password may be too tedious or poorly explained for the owner to action. ISPs might be able to detect compromised devices by their network traffic patterns and notify their owners, but if they chase them to fix the devices too aggressively then they might piss off the owners enough to move to a different ISP.
Why don't ISPs pre-emptively fix devices if they find compromised devices on their network? Generally, because they have no safe harbour for this remedial work - they could be prosecuted for illegal access to devices. They might survive in court after spending lots of money on lawyers, but why take the risk?
Effects of the attack
Dyn DNS was effectively knocked off the Internet for many hours. Any website using Dyn DNS for their name servers saw incoming traffic drop off as users' cached addresses from DNS expired and their browsers insisted on getting an up-to-date address - which was not available, because the Dyn DNS servers were melting.
Basic remediation for sites in this situation is to increase the Time-to-Live setting on their DNS records. If Cat Grooming Inc's previous setting was 3600 seconds, then after 1 hour of the Dyn DNS servers being down their traffic would be nearly zero. If their TTL was 86400 seconds (1 day) then a 12 hour attack would only block about half their traffic - not great, but bearable. A TTL of 1 week would mean that a 12 hour attack would be no more than an annoyance. Unfortunately, if the attack downs Dyn DNS before site owners can update their TTL this doesn't really help.
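The arithmetic behind those figures can be sketched in a few lines of Python. It rests on a simplifying assumption that is mine rather than a measured fact: returning visitors' cached entries expire evenly across one TTL period, and brand-new visitors (who have nothing cached and are blocked immediately) are ignored. The function name fraction_blocked is hypothetical:

```python
def fraction_blocked(ttl_seconds, outage_seconds):
    """Share of returning visitors whose cached address has expired
    by the end of the outage, assuming expiry times are spread evenly
    over one TTL period."""
    if ttl_seconds <= 0:
        return 1.0
    return min(1.0, outage_seconds / ttl_seconds)


HOUR = 3600
OUTAGE = 12 * HOUR
for label, ttl in [("1 hour", HOUR), ("1 day", 24 * HOUR), ("1 week", 7 * 24 * HOUR)]:
    share = fraction_blocked(ttl, OUTAGE)
    print(f"TTL {label}: {share:.0%} of returning visitors blocked after 12 hours")
```

Under that assumption a 1 hour TTL loses everyone, a 1 day TTL loses half, and a 1 week TTL loses about 7% by the time a 12 hour outage ends.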
Also, the bigger a site is, the more frequently it needs to update DNS information. Twitter will serve different Internet addresses for twitter.com to users in different countries, trying to point users to the closest Twitter server to them. You don't want a user in Paris pointed to a Twitter server in San Francisco if there is one available in Amsterdam, 500 milliseconds closer to them. And when you have many different servers, every day some of them are going offline for maintenance or coming online as new servers, so you need to update DNS to stop users going to the former and start sending them to the latter.
Therefore the bigger your site, the shorter your DNS TTL is likely to be, and the more vulnerable you are to this attack. If you're a small site with infrequent DNS updates, and your DNS TTL is short, then make it longer right the hell now.
Alternative designs
The alternative to this exposed address approach is to have a central service which all the baby monitors from a given manufacturer connect to, e.g. the hostname cams.babycamsRus.com; users then connect to that service as well and the service does the switching to connect Mr. and Mrs. Smith to the babycam chez Smith. This prevents the devices from being found by Internet scans - they don't have their own Internet address, and don't accept outside connections. If you can crack the BabyCams-R-Us servers then you could completely control a huge chunk of IoT devices, but their sysadmins will be specifically looking out for these attacks and it's a much more tricky proposition - it's also easy to remediate once discovered.
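To make the switching idea concrete, here is a toy in-memory sketch in Python. It is not any manufacturer's actual design: the class RelayService, its method names, and the shared owner token are all invented for illustration, and a real service would use network connections, proper authentication and constant-time credential checks:

```python
class RelayService:
    """Toy in-memory stand-in for a manufacturer's central service."""

    def __init__(self):
        # device_id -> (owner_token, list of queued messages)
        self._devices = {}

    def device_checkin(self, device_id, owner_token):
        # The camera dials OUT to the service. It never listens for
        # inbound connections, so an Internet scan cannot find it.
        self._devices[device_id] = (owner_token, [])

    def user_request(self, device_id, owner_token, message):
        entry = self._devices.get(device_id)
        if entry is None or entry[0] != owner_token:
            return False           # unknown device or wrong credentials
        entry[1].append(message)   # held until the device next polls
        return True

    def device_poll(self, device_id):
        entry = self._devices.get(device_id)
        if entry is None:
            return []
        messages = list(entry[1])
        entry[1].clear()
        return messages


relay = RelayService()
relay.device_checkin("smith-cam", "s3cret")
accepted = relay.user_request("smith-cam", "s3cret", "start-stream")
rejected = relay.user_request("smith-cam", "wrong-guess", "start-stream")
print(accepted, rejected, relay.device_poll("smith-cam"))
```

The only thing exposed to the Internet in this model is the relay itself, which is why securing and patching it in one place is so much easier than fixing thousands of cameras.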
Why doesn't every manufacturer do this, if it's more secure? Simply, it's more expensive. You have to set up this central service, capable of servicing all your sold devices at once, and keep it running and secure for many years. In a keenly price-competitive environment, many manufacturers will say "screw this" and go for the cheaper alternative. They have no economic reason not to, no-one is (yet) prosecuting them for selling insecure devices, and customers still prefer cheap over secure.
IPv6 will make things worse
One brake on this run-away cheap-webcams-as-DoS-tool is the shortage of Internet addresses. When the Internet addressing scheme (Internet Protocol version 4, or IPv4 for short) was devised, it was defined as four numbers between 0 and 255, conventionally separated by dots e.g. 1.2.3.4. This gives you just under 4.3 billion possible addresses. Back in 2006 large chunks of this address space were free. This is no longer the case - we are, in essence, out of IPv4 addresses, and there's an active trade in them from companies which are no longer using much of their allocated space. Still, getting large blocks of contiguous addresses is challenging. Even a /24 (shorthand for 256 contiguous IPv4 addresses) is expensive to obtain. Father of the Internet Vint Cerf recently apologised for the (relatively) small number of IPv4 addresses - they thought 4.3 billion addresses would be enough for the "experiment" that IPv4 was. The experiment turned into the Internet. Oops.
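The IPv4 numbers are easy to check with Python's standard ipaddress module. The 192.0.2.0/24 block used here is a range reserved for documentation, chosen only as an example:

```python
import ipaddress

# Four numbers of 8 bits each gives 32 bits of address.
total_ipv4 = 2 ** 32
print(total_ipv4)                 # 4294967296, just under 4.3 billion

# A /24 fixes the first 24 bits and leaves 8 bits free.
slash_24 = ipaddress.ip_network("192.0.2.0/24")
print(slash_24.num_addresses)     # 256

# Number of distinct /24 blocks in the whole IPv4 space.
print(total_ipv4 // slash_24.num_addresses)   # 16777216
```

Not all of those addresses are usable on the public Internet, since some ranges are reserved for private networks and other special purposes, so the real supply is smaller still.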
This shortage means that the current model where webcams and other IoT devices have their own public Internet address is unsustainable: the cost of that address will become prohibitive, and customers will need something that sits behind their single home Internet address given to them by their ISP. You can have many devices behind one address via a mechanism called Network Address Translation (NAT) where the router connecting your home to the Internet lets each of your devices start connections to the Internet and allocates them a "port" which is passed to the website they connect to: when the website server responds, it sends the web page back to your router along with the port number, so the router knows which of your home devices the web page should be sent to.
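The port bookkeeping can be sketched as a toy model in Python. This is a deliberate simplification with invented names (NatRouter, outbound, inbound): a real router also tracks the remote endpoint and protocol, and expires idle entries. The addresses come from ranges reserved for private networks and documentation:

```python
class NatRouter:
    """Toy model of the translation table in a home router."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._by_port = {}   # public port -> (private_ip, private_port)
        self._by_flow = {}   # (private_ip, private_port) -> public port

    def outbound(self, private_ip, private_port):
        """A home device starts a connection: allocate a public port."""
        flow = (private_ip, private_port)
        if flow not in self._by_flow:
            port = self._next_port
            self._next_port += 1
            self._by_flow[flow] = port
            self._by_port[port] = flow
        return (self.public_ip, self._by_flow[flow])

    def inbound(self, public_port):
        """A reply arrives: find which home device it belongs to."""
        return self._by_port.get(public_port)


router = NatRouter("203.0.113.7")
laptop = router.outbound("192.168.1.10", 51000)
camera = router.outbound("192.168.1.20", 51000)
print(laptop, camera)
print(router.inbound(40001))   # reply for the camera
print(router.inbound(40002))   # no mapping: unsolicited traffic is dropped
```

The last line is the security side effect: a packet arriving on a port that no home device asked for has no table entry, so the router has nowhere to send it.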
The centralized service described above is (currently) the only practical solution in this case of one IP for many devices. More and more devices on the Internet will be hidden from black-hat hacker access in this way.
Unfortunately (for this problem) we are currently transitioning to the next generation of Internet addressing - IPv6. Its addresses are 128 bits long, which gives a staggering number of them: 340 with an additional 36 zeroes after it. Typically your ISP would give you a "/64" for your home devices to use for their public Internet addresses - a mere 18,000,000,000,000,000,000 (18 quintillion) addresses. Since there are also 18 quintillion /64s in the IPv6 address space, we're unlikely to run out of them for a while even if every person on earth is given a fresh one every day and there's no re-use.
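Those IPv6 figures can be checked the same way with Python's ipaddress module. The world population of 8 billion is my own round-number assumption, 2001:db8::/64 is a prefix reserved for documentation, and the result is an upper bound because it ignores the parts of the IPv6 space that are reserved rather than handed out:

```python
import ipaddress

total_ipv6 = 2 ** 128
print(total_ipv6)            # a 39-digit number starting with 340

# One home prefix: the first 64 bits are fixed, the other 64 are yours.
home_prefix = ipaddress.ip_network("2001:db8::/64")
addresses_per_64 = home_prefix.num_addresses
print(addresses_per_64)      # 2**64, about 18 quintillion

# How many /64 prefixes exist in total (also 2**64).
number_of_64s = 2 ** (128 - 64)

# Hand every person a fresh /64 every day, never re-using any.
ASSUMED_POPULATION = 8_000_000_000   # assumption, not from the text
days_to_exhaust = number_of_64s // ASSUMED_POPULATION
years_to_exhaust = days_to_exhaust / 365.25
print(f"{years_to_exhaust:,.0f} years")
```

On those assumptions the supply lasts for millions of years, so address scarcity stops acting as a brake.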
IPv6 use is not yet mainstream, but more and more first world ISPs are giving customers IPv6 access if they want it. Give it a couple of years and I suspect high-end IoT devices will be explicitly targeted at home IPv6 setups.
Summary: we're screwed
IPv4 pressures may temporarily push IoT manufacturers to move away from publicly addressable IoT devices, but as IPv6 becomes more widely used the commercial pressures may once more become too strong to resist and the IoT devices will be publicly discoverable and crackable once more. Absent a serious improvement in secure, reliable and easy dynamic updates to these devices, the IoT botnet is here to stay for a while.