
Wednesday, August 24, 2011

How to take over the Internet.

Updated: 2012-01-16

This started out of curiosity.

I was getting annoyed at how long my broker's website took to put up its pages. Waiting 15 minutes for the trading page to come up while the market is diving along with your naked long position is not fun. So I eventually decided to look at what they were doing to support their website. I found they were using Akamai for their public pages, and their own servers for users' private data. But I also noticed that connections were regularly being made to unrelated addresses. Reverse lookups on these addresses pointed me at other web farm sites, Linode, and yes, there were also attempts to access the 127.0.0.0 and 10.0.0.0 nets. It was coming from the web browser, but from where in the huge mass of Javascript and HTML? So I blocked every IP except Akamai's (that is tough - every day, and sometimes more than once a day, I discovered a new range of Akamai addresses!) and the broker's own address space, and watched to see what broke.

First, the website completely broke. For security my broker uses HTTPS for all connections, and the certificate validation process needs to check the certification chain to make sure no certificate in it has been revoked. Right there I have a huge source of my broker's speed problem - when the market dives, everybody is pinging the broker, Akamai, and the certificate providers. As is typical these days, the webpage has hundreds of components, and each is an HTTPS access with the necessary encryption and decryption steps. HTTPS also shuts off caching, at least in the intermediate proxies and sometimes even in your browser. Worse is the certificate revocation check. Of course I could tell my browser not to check for certificate revocation, but I am a little anal about that - why would you turn off an important element of security? Then I noticed another little problem - the IE setting "Check for certificate address mismatch" is turned off! This is like a border guard using a lie detector on an immigrant on a flight from Great Britain and letting him through because he is telling the truth, without bothering to check that what he is saying is not something like "I am a terrorist from Afghanistan and I am here to blow up the Pentagon"! The Firefox settings say to "Validate a certificate if it specifies an OCSP server" and also do not require the certificate to be treated as invalid if the connection to the OCSP server fails. Two problems here - the certificate could point to an unreliable or colluding OCSP server, and you get a pass if a reliable server is unreachable for some reason. The second is probably a worthwhile risk for most people, but the first is really a security hole that can be exploited fairly easily.
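
Just to make the address-mismatch check concrete, here is a minimal sketch using Python's standard ssl module. The hostnames are only examples (the second is a public test site that deliberately serves a certificate for a different name), and note that this does none of the OCSP revocation checking discussed above - the standard library leaves that to you.

    import socket, ssl

    def check_host(hostname, port=443):
        # create_default_context() verifies the certificate chain against the
        # system CA store AND checks that the name on the certificate matches
        # the host we asked for - the "certificate address mismatch" check.
        ctx = ssl.create_default_context()
        try:
            with socket.create_connection((hostname, port), timeout=10) as sock:
                with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
                    print(hostname, "OK, negotiated", tls.version())
        except ssl.SSLCertVerificationError as err:
            # A mismatched or untrusted certificate lands here; a browser with
            # the mismatch check turned off would sail right past it.
            print(hostname, "FAILED:", err)

    check_host("www.example.com")        # should verify cleanly
    check_host("wrong.host.badssl.com")  # name on the certificate does not match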

So what is the point of all this about HTTPS? The point is that even with HTTPS in use, with fairly common browser settings it is possible for someone to pretend to be another server. The use of web farms and server sharing between websites makes it impossible to use reverse DNS lookups as a reliable guide, and with the way most web clients are configured, HTTPS does not reliably identify the server either. Furthermore, on HTTPS connections Earthlink (and others) offer a certificate with themselves listed as the certificate authority. If you accept them as a certificate authority and trust signed controls, they can do whatever they want on your system. Even if you don't trust downloaded controls, this is particularly noxious: one of the benefits of using HTTPS is defense against man-in-the-middle attacks, and this tactic destroys that benefit completely. The contretemps with DigiNotar shows how trust can be abused - they allowed Google certificates to be issued for sites not controlled by Google - and BEAST is an example of how the basic encryption mechanism itself can be compromised.
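
A related check you can do yourself is to look at who actually signed the certificate a site hands you. Here is a minimal sketch, again with the standard ssl module; it only reports what the server presented, and it is up to you to know which CA the site normally uses - if your ISP suddenly shows up as the issuer, something is sitting in the middle of your "secure" connection.

    import socket, ssl

    def show_issuer(hostname, port=443):
        ctx = ssl.create_default_context()
        with socket.create_connection((hostname, port), timeout=10) as sock:
            with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
                cert = tls.getpeercert()
                issuer = dict(pair[0] for pair in cert["issuer"])
                subject = dict(pair[0] for pair in cert["subject"])
                print(hostname)
                print("  subject:", subject.get("commonName"))
                print("  issuer :", issuer.get("organizationName"),
                      "/", issuer.get("commonName"))

    # If the issuer printed here is not the CA this site normally uses,
    # treat the connection as suspect.
    show_issuer("www.google.com")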

If your messages get sent to the wrong server, you get whatever pages the attacker wants to send you.

Back to the broker's system. I add back the certificate provider IPs (the ones I believe in, anyway!) and find that their system still breaks now and then. Images are sometimes missing, and pages sometimes don't load or come up garbled. I also turn off Javascript except for trusted sites (my broker!), and my browser asks five times for permission to enable Javascript. I track down the garbling (is that a word?) to missing CSS files that specify how the page is to be laid out and styled. All these files are on Akamai as far as I can tell. The script warnings are coming from some code from a chart provider, who also seems to provide an image of one set of tabs. Something else happens as I find and add back sites - every now and then a really weird image of a guy and a gal sitting on a bed crops up where a button or a chart ought to be [1]. Not always, but every once in a while. Typically the top half of the image is a real picture, and the bottom portion is colored snow. I pick a specific realtime chart (people tend to reload these a lot because they want to see up-to-date charts!) that has this problem and find that the image it serves sometimes looks like what I have been seeing. And when I look at the IP address associated with the chart, it keeps changing, sometimes pointing to "Interactive Data Systems" space, sometimes to "7Ticks Consulting" space, and every once in a while to the 127.0.0.0 net, which is a loopback to my own computer! The host name is an alias owned by the chart provider. When I track down the actual name server, it usually turns out to be some possibly legitimate DNS service provider, but sometimes it is a Linode server, and that server is the one providing the loopback address. So the chart provider is probably using a DNS service for dynamic load balancing, and one of the DNS servers between me and them has been attacked and corrupted, and winds up pointing me to a bogus server. As a final bonus, I find the DNS system often times out and does not actually provide my computer with any address, so I have to retry.
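
This kind of poking around is easy to script. Here is a rough sketch using only Python's standard library - the hostname is a placeholder for whichever chart or image host you are curious about. Run it a few times and watch whether the addresses, and what they reverse to, stay stable.

    import socket

    def inspect(name):
        # Forward lookup: canonical name, aliases, and all addresses returned.
        canonical, aliases, addresses = socket.gethostbyname_ex(name)
        print(name, "->", canonical, aliases, addresses)
        for addr in addresses:
            # Reverse lookup: who does the address claim to belong to?
            try:
                rev = socket.gethostbyaddr(addr)[0]
            except socket.herror:
                rev = "(no reverse record)"
            print("   ", addr, "reverses to", rev)
            if addr.startswith("127.") or addr.startswith("10."):
                print("    ^^ loopback/private address - something is wrong")

    inspect("charts.example.com")   # placeholder hostname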

How do you get traffic intended for one server to go to another? Two ways - you somehow bamboozle the routers into sending the packets to the wrong destination, or you bamboozle the domain name system into giving the client the wrong address for the server. Attacking the routers has been done - routing is a totally distributed system with segments often controlled by malicious people - but the last forty years have seen so many attempts that the router and ISP industry has a lot of experience with attacks and pretty much has this under control. Along the way we have lost some interesting capabilities, but so be it.

The Internet domain name system is a completely different matter. It is a semi-distributed system, designed in a time when efficiency and the ability to route around failures were important and security was an afterthought. Malicious failures were not a consideration; equipment and network failures were. Until recently the top level has remained in relatively trustworthy hands, so challenges have been few and experience with maliciousness is low. Now that control over this system is being distributed more widely, we can expect to see a lot more successful attacks until the industry adapts.

The domain name system is NOT totally distributed. It is a hierarchical system, with multiple redundant root name servers (13 to be precise) providing the top level. The root servers look at the rightmost part of the domain (the .com or .us or .edu at the end of the domain name) and tell you which name servers have authoritative information about that domain. Hints about the IP addresses of the root servers are compiled into DNS clients, and can also be found through the domain system itself. Here is a table that shows information about them as of May 8, 2011.

Server  IP address      RTT (ms)  Location        Operator
A       198.41.0.4      379       Hong Kong       Verisign
B       192.228.71.201  86        Los Angeles     ISC/isi.edu
C       192.33.4.12     83        LAX             PSINet
D       128.8.10.90     147       College Park    Univ. of Maryland
E       192.203.230.10  Unreach   ????????????    NASA
F       192.5.5.241     179       Palo Alto       ISC
G       192.112.36.4    Timeout   Japan           US DoD
H       128.63.2.53     Hop       ????????????    US DoD
I       192.36.148.17   450       Hong Kong       RIPE/Sweden
J       192.58.128.30   218       Taipei          Verisign
K       193.0.14.129    239       Amsterdam       RIPE/NCC
L       199.7.83.42     283       Los Angeles     ICANN
M       202.12.27.33    178       Narita, Japan   Univ. of Tokyo
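
A table like this is easy to regenerate yourself. The sketch below times one DNS query to each root server; it assumes the third-party dnspython package is installed and uses the same addresses listed above (root server addresses do change occasionally, and yours may time out where mine did not).

    import time
    import dns.message, dns.query, dns.rdatatype, dns.exception

    ROOTS = {
        "A": "198.41.0.4",   "B": "192.228.71.201", "C": "192.33.4.12",
        "D": "128.8.10.90",  "E": "192.203.230.10", "F": "192.5.5.241",
        "G": "192.112.36.4", "H": "128.63.2.53",    "I": "192.36.148.17",
        "J": "192.58.128.30","K": "193.0.14.129",   "L": "199.7.83.42",
        "M": "202.12.27.33",
    }

    query = dns.message.make_query(".", dns.rdatatype.NS)
    for letter, ip in sorted(ROOTS.items()):
        start = time.monotonic()
        try:
            dns.query.udp(query, ip, timeout=3)
            print(letter, ip, round((time.monotonic() - start) * 1000), "ms")
        except (dns.exception.Timeout, OSError):
            print(letter, ip, "Timeout/Unreachable")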

Name servers for any domain can delegate authority for a subdomain to another set of name servers, and are then no longer the authority for names in that subdomain. For example, the name servers that handle the .com domain delegate authority for "blogspot.com" to ns1.google.com, ns2.google.com, ns3.google.com and ns4.google.com. If every computer followed this chain for every name lookup the root servers would get overloaded pretty fast, so name servers tell you how long each piece of information they provide is good for, and a DNS client can cache the information rather than retrieve it over and over. The protocol used to communicate between a DNS client and a name server is UDP, the User Datagram Protocol - unreliable and connectionless. To make things even simpler, most ISPs provide a "DNS server" which acts as the DNS client in the domain name system. You can send a name to this DNS server, and it follows all the steps necessary to figure out the IP address and hands it to you. When you connect to the ISP, the connection process automatically tells your computer about this DNS server. Buried in antiquity, but still implemented, is another shortcut - whenever your computer attempts to translate a hostname, it will first try tacking your "network DNS suffix" onto the name you typed in, so you can be lazy and leave the suffix out when operating within your own network.
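
To make the delegation chain concrete, here is a rough sketch of an iterative lookup done by hand, starting at a.root-servers.net's address from the table above. It assumes the dnspython package and deliberately skips the things a real resolver must handle - CNAME chasing, retries, truncation, and caching.

    import dns.message, dns.query, dns.rdatatype

    def iterate(name, server="198.41.0.4"):   # start at a.root-servers.net
        while True:
            reply = dns.query.udp(dns.message.make_query(name, dns.rdatatype.A),
                                  server, timeout=5)
            if reply.answer:                   # the answer we asked for (or a CNAME)
                for rrset in reply.answer:
                    print("answer:", rrset)
                return
            # Otherwise the reply is a referral: the authority section names the
            # next set of name servers, and the additional section often carries
            # their addresses ("glue") so we can continue down the chain.
            glue = [rr.address for rrset in reply.additional
                    if rrset.rdtype == dns.rdatatype.A for rr in rrset]
            if not glue:
                print("referral without glue - would need another lookup")
                return
            for rrset in reply.authority:
                print("referred to:", rrset.name, "->", [str(r) for r in rrset])
            server = glue[0]                   # follow the first referral

    iterate("www.blogspot.com")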

So how can you subvert this?

1) If you have authority over one of the root name servers, you could replicate enough of the translation chain to provide fake addresses for any domain you choose. All you need is for a DNS client to decide once to use your root server for a top level translation. Up until a few years ago, the US government or a US organization controlled the root name servers and the .com, .edu, .us, and .org name servers. With the formation of ICANN the responsibility for and location of the root servers began to move. As of May 2011, a.root-servers.net was located in Hong Kong. Soon after a series of embarrassing attacks on US servers from an "unnamed country" it looks like this machine was moved to the US, although its IP address has remained the same. Interestingly, the very last router on the path to this nameserver is now reported to be in Romania - not much of an improvement! This move (if deliberate!) was probably accomplished by using a very old but generally inaccessible (for security reasons) routing mechanism called a host route. NASA, DOD and PSINET all have their own root server.

2) You can also use the BGP based routing system to direct packets headed for a root server through a router that you control, and mangle the return any way you desire. If you want to intercept only one root server, the one to pick is a.root-servers.net, because most resolutions would start there. The ability to re-route DNS messages exists for all servers, and is available to whoever has control of the intermediate routers. These can be the legitimate but malicious or colluding owners of the routers, such as an ISP subject to a government order [2]. See the comments on the path to a.root-servers.net above.

3) If you have control over a "DNS Server" then you can feed whatever you want to the computers that rely on that "DNS Server". This most certainly happens. Earthlink (specifically the nameservers ns1.mindspring.com and ns2.mindspring.com) and Go Daddy, for example, use name servers that return their own webserver's IP address when you ask for a name that does not exist. This lets them put up a webpage with their ads when you mistype a name.
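
You can check whether your own "DNS Server" does this with a lookup of a name that should not exist. A minimal sketch - the gibberish name below is only an example; any random, unregistered name will do:

    import socket

    bogus = "this-name-should-not-exist-xk3q7.example-no-such-domain.com"
    try:
        addr = socket.gethostbyname(bogus)
        # A well-behaved resolver would have raised an error (NXDOMAIN). Getting
        # an address back means someone upstream is rewriting failed lookups.
        print("Suspicious: nonexistent name resolved to", addr)
    except socket.gaierror:
        print("Good: the resolver reported that the name does not exist")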

4) The ambiguity created by the name processing can be exploited. You type in "mail.yahoo.com" and expect to be connected there, but instead your name server returns an address for "mail.yahoo.com.mshome.net" after applying the default suffix. If you use a company-owned computer, your company's domain is probably the default suffix. Guess who is able to read your email and monitor anything you do on the web as a result!
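
One small defence is to make the name absolute so the suffix list is never applied - in DNS terms, end it with a dot. A quick way to compare the two behaviours (whether the suffix actually gets appended depends on how your resolver's search list is configured):

    import socket

    # "mail.yahoo.com" may be tried as mail.yahoo.com.<your-dns-suffix> first,
    # depending on the resolver's search-list settings.
    print(socket.gethostbyname("mail.yahoo.com"))

    # The trailing dot marks the name as fully qualified, so no suffix is ever
    # appended and only the real mail.yahoo.com can match.
    print(socket.gethostbyname("mail.yahoo.com."))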

5) It used to be possible to supply a fixed address for a name in a hosts file. Because your files can be compromised, this is marketed as a security risk. Dynamic reassignment of servers by web farms also makes it impractical to use predefined translations, so you are forced to use DNS. For whatever reason, static translation no longer seems to work reliably on many operating systems. However, if you really want to work without trusting any DNS service, this is the way to go.
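
For reference, the format is one name per line, address first: /etc/hosts on Unix-like systems, C:\Windows\System32\drivers\etc\hosts on Windows. The entry below is purely illustrative - the address and hostname are made up, and you would want to verify the real address out of band before pinning it.

    # pin a name to a known-good address (illustrative values only)
    203.0.113.25    trading.mybroker.example.com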

6) DNS uses UDP to communicate. This protocol has no security and no sense of order. Combined with the caching of nameserver addresses, that lets an outsider attack the system as follows: send a request for a non-existent name within the target domain to the DNS server you are trying to attack, and immediately follow it with a fake DNS response that specifies the nameserver for that domain - a nameserver under your control instead of the real one. If you hit the timing window, the DNS server caches your nameserver, and until the entry expires your nameserver has control over that domain. This type of attack actually took place in late 2007 and early 2008. Originally this worked even for unrelated domains; that window has been closed in more secure versions of DNS. They also now try to use TCP instead of UDP - unfortunately, because of the large number of nameservers and clients out there that have not been upgraded, DNS must continue to work with UDP.
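
To see how thin the protection is, here is a sketch that builds a DNS query by hand and sends it over UDP. The only things tying the answer to the question are the 16-bit transaction ID (plus the source port the OS happened to pick) - exactly what an off-path attacker has to guess to slip a forged response in ahead of the real one. The resolver address 8.8.8.8 is just an example; point it at whatever resolver you like.

    import socket, struct, random

    def make_query(name):
        # Header: ID, flags (recursion desired), QDCOUNT=1, zero AN/NS/AR counts.
        qid = random.randrange(65536)           # the 16-bit transaction ID
        header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
        # Question: length-prefixed labels, then QTYPE=A (1), QCLASS=IN (1).
        qname = b"".join(bytes([len(p)]) + p.encode() for p in name.split(".")) + b"\x00"
        return qid, header + qname + struct.pack("!HH", 1, 1)

    qid, packet = make_query("www.example.com")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(3)
    sock.sendto(packet, ("8.8.8.8", 53))        # example public resolver
    reply, _ = sock.recvfrom(512)
    sock.close()
    # The only check tying this reply to our request: does the ID echo back?
    print("sent id", qid, "- reply id", struct.unpack("!H", reply[:2])[0])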

7) You can attack the DNS code in the browser host itself. This is a variant of 4).

So those are a number of ways to attack the current system.

What can we do to enhance the security of connections so that we can be assured of connecting to the host we are actually trying to connect to? I think governments have a role to play - making it a lot more expensive to game the system - as well as providing support so they themselves can exert more control. A challenge for any solution in the network space is updating and interoperating with the huge base of existing clients and name servers.


1 - that picture has changed, here is the last one I saw. Here is another that I found on nasdaq.com.


2 - With the requirements on ISPs embedded in ProtectIP and SOPA, every US ISP should be regarded as untrustworthy! While President Obama has decided to oppose parts of SOPA, that comes about because the proposed law would undermine faith in DNSSEC, leading to the use of simpler alternatives which cannot be compromised by the US government. Remember, DNSSEC relies on public key encryption systems, which the US NSA can crack and, more importantly, whose use it can trace.

Sunday, August 14, 2011

NETFLIX meets the Grim Reaper

We all know and love NETFLIX. Well - at least most of us! Nearly instant access to movies was the fundamental value proposition, and since 2008 they have gotten a lot of love, and their stock shows it.

But now they have a problem - a big problem. It has always been there, even way back in 2002. How do you convince the media companies to let you have their programming? I should know - I gave up on a digital movie streaming model based on a monthly fee that year. NETFLIX did not wait for permission; they gave up on "instant access" and instead used the existing rental agreement framework that companies like Blockbuster relied on, coming up with a much more efficient method of finding the DVD you wanted and getting it to your home through the regular mail. So it was not really instant - but it was much easier than driving to the rental store, and most importantly, much easier to return. Whereas Blockbuster made money off people who rented and then could not get around to returning the movie (possibly paying $30+ for a single viewing), NETFLIX made $8 a month, every month, on movies that were sitting in your place still waiting to be watched. You could theoretically watch movies for under a buck apiece if you tried hard enough, but of course you were too busy for that. In 2008, with the economy tanking and a lot of people with time on their hands needing entertainment, this was real value.

Then came REDBOX. They took that $1 price and made it a true per-view price. Back to the Blockbuster model, but now the price is much lower and the outlets are in places you would be most days anyway, so access is even faster than NETFLIX, though with a narrower selection. NETFLIX isn't looking so good anymore, and to compete they go for digital streaming and "instant access".

There are two parties that hate this. The media companies are looking at pay per view, and with digital streaming on a monthly charge the effective per-view price can get really low. Instead of the $8 being spread across at most 8-10 movies a month, it is possible to watch 300+ movies a month. NETFLIX would make under 3c a movie at that rate, and that is not enough to make the media companies happy. I know this happens - I spent a couple of weekends watching back-to-back episodes of some TV shows. The other party that hates this is the ISPs: the bandwidth requirement to the peering points blows up, and their business model blows up with it. NETFLIX is unhappy too, because 3c is probably not enough to pay for the cost of sending the movie from their site to the peering point.

The reaction was swift and extreme. ISPs instituted expensive data caps, and some repeatedly break the connection, which forces the stream to be resent. Upshot - I wound up paying $80 to watch one movie with repeated hangs! That was the end of NETFLIX for me. But NETFLIX also has a problem with the media companies.

There are technical solutions for the bandwidth problem, and also for the business problem. I sure hope Reed Hastings will find some that work, and make it possible for that vision in 2002 to be achieved.