Update (2011-03-24): In this article, we’re discussing how we leveraged the smart Russian-built Web server Nginx to stop a DDoS attack. We’ve also experimented quite a bit with Varnish, another fine piece of open source software (half-Danish, half-Norwegian), as a DDoS mitigation tool. It can be made to form the same constellation that we used for Nginx but can perform much better in the face of a large DDoS. So it’s worth pursuing should you be interested (read: desperate). Shameless plug: check out our cPanel Varnish plugin: http://www.unixy.net/varnish
In this post we (UNIXY) are going to share our experience fending off a large Distributed Denial of Service (DDoS) attack for a client. Generally, Website owners deal with DDoS attacks on their own. There are equipment and solutions vendors that cater to these owners and guarantee protection against these kinds of attacks up to a certain threshold. The cost of hiring these vendors can range from thousands to hundreds of thousands or even millions of dollars, depending on the severity of the attack.
Our goal was to build a solution with the least amount of funds possible. This solution is scalable and can handle the worst attacks. The client’s dedicated server is not a special server but a simple quad core Xeon managed server running the LAMP stack. The DDoS riposte described in this article can scale to stop a 10Gbps attack or more. The good news is that this solution does not require changing anything on the dedicated server itself. The server could be running just about any software stack, and the configuration should work in almost all cases with little effort.
Before we delve into the glorious technical details, there is an important aspect of DDoS attacks that one should know about: the social dynamics that lead to the attack. The more one understands about the social aspect of a DDoS attack, the easier it becomes to prevent or stop it. Once a DDoS has started, priorities shift quite dramatically and the ability to make wise, rational decisions suffers.
DDoS attacks do not occur randomly. They are targeted and come with a motive. The motive could be revenge, but most of the time it is financial. The individuals or groups that conduct DDoS attacks are, most of the time, hired to complete the job. They have the resources and know-how to orchestrate the attack while hoping to avoid getting caught by the authorities. They have no emotional attachment to the DDoS attack itself; they have no hard feelings towards the victim. They just get paid for what they do and nonchalantly, but meticulously, execute.
DDoS attacks are usually preceded by an email, post, or phone call to the victim from the individual or group with an interest in the attack. It is always recommended to treat strangers you meet online or offline professionally and politely. The smallest altercation can provoke a negative reaction, which can then escalate. In the face of anonymous threats against your business or organization, remain calm and composed.
There are public markets online (please don’t ask for links) where wannabe DDoS perpetrators get to hire the attackers. Pricing varies from $5/hr to $10/hr for a simple non-distributed DoS attack. A DDoS, however, tends to be more expensive depending on the sheer amount of data or packets that needs to be delivered to the target. It can range from $20/hr to $100/hr. The word used in these circles in lieu of DDoS is to “drop,” meaning to drop a certain Web site or network off the Internet. It really means either to overwhelm the target with enough traffic that the equipment fails or to force upstream providers to “null route” the destination IP at the network level. The end result is that the IP gets dropped from the routing tables and the server stops responding to all requests.
The fact that DDoS is not cheap has got to be comforting to an extent. It means that it is only a matter of time before the DDoS “client” runs out of cash. This in itself is encouraging. Keep that in mind should you begin to lose patience. Perseverance is omnipotent. Denial of service attacks are a crime, punishable under federal law in the US and under criminal law in the UK. As we will explain in the technical part of this article, DDoS attacks are almost impossible to trace back to the individual or group orchestrating them. Because of their distributed nature, tracing one requires cooperation from several network engineers who work for upstream providers.
Distributed Denial of Service – The Technicals
First things first: what is a DoS, and what is the difference between a DoS and a DDoS? A Denial of Service (DoS) is an attack originating from one source or one system that results in the service in question being unavailable to its legitimate users. It denies its very users access either because the service runs out of available resources or because it has been tricked into denying access to legitimate users. For example, a DoS attack on a Web server can cause it to run out of resources and stop responding to requests. A DDoS, on the other hand, is a more sophisticated attack since it originates from hundreds or thousands of nodes.
A DDoS attack is almost impossible to trace back to the source due to its distributed nature. DDoS orchestrators call the nodes and controller system a “bot.” With a few commands, the bot owner can instruct infected nodes from around the world to attack a target. The bot systems are hosted and controlled via the Internet Relay Chat (IRC) system or via a direct port connection. The nodes used to attack the target are compromised Windows and Linux machines from around the world.
Before we present our solution, we need to discuss the two types of DDoS attacks that exist. On one hand, you have attacks that are bandwidth-based and seek to saturate the connectivity link. On the other hand, you have attacks that are packet-based and seek to saturate the processing capability of the equipment. In other words, they seek to overwhelm the CPU, memory, or fabric of the routers or switches. All equipment has hard limits on the number of packets per second it can handle. Routers and switches are no exception.
For example, take the above specification for a Cisco 6500 firewall. Each module is able to handle 5Gbps or 2.8 million pps. This firewall sure looks like it can handle a 5Gbps attack. Great! However, should there be a packet-based DDoS attack, one would only need a payload of roughly 1.4Gbps to saturate it. That’s 2.8 million pps * 64 bytes * 8 bits per byte ≈ 1.43Gbps. So bandwidth capacity means nothing by itself, and small packets can cause havoc.
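The arithmetic behind that example can be checked with a few lines of Python. This is only a minimal sketch: the function name is ours, and it counts nothing but the 64-byte packets themselves (no Ethernet preamble or inter-frame gap), which is an assumption.

```python
def line_rate_gbps(packets_per_second: float, packet_bytes: int) -> float:
    """Bandwidth, in Gbps, carried by a stream of fixed-size packets."""
    bits_per_second = packets_per_second * packet_bytes * 8
    return bits_per_second / 1e9


MODULE_PPS_LIMIT = 2.8e6   # packets per second per module, as quoted in the text
SMALL_PACKET_BYTES = 64    # small packet size used in the text

saturating_gbps = line_rate_gbps(MODULE_PPS_LIMIT, SMALL_PACKET_BYTES)
print(f"{saturating_gbps:.2f} Gbps of 64-byte packets reaches the pps limit")
```

In other words, a flood well below the module's 5Gbps rating is enough to hit its packet-per-second ceiling when the packets are small.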
Our client was facing a 2Gbps DDoS attack that was packet-based. It sought to force routing equipment along the way to start dropping legitimate packets. This caused the upstream to null route the IP to alleviate the burden on other customers behind the same link. This is the typical reaction from upstreams as they seek to protect their many other customers from feeling the pinch of the attack. We were given one last chance to “fix” things before the IP could be routed back in. Here is how we were able to fend off the attack and keep the server running.
We deployed what we call a “constellation” of reverse proxy VM or VPS nodes running the high performance Web server Nginx. The VM nodes were purchased from several providers, on the condition that they were located at separate facilities. Essentially, we are off-loading and “splitting” both packet processing and bandwidth consumption across several data center facilities (physical routers & carriers).
The configuration of the Nginx nodes is a typical reverse proxy configuration with the usual extra kernel security configuration. So for a 2Gbps attack and with 20 VM nodes, the bandwidth consumption per node is a maximum of 2Gbps / 20 = 100Mbps. That’s a 100Mbps load per VM node, which is reasonable enough and is below the threshold for getting one’s IP null routed by the provider. One could add more and more Nginx nodes to the constellation without issues.
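The same sizing logic can be sketched in Python. Treat this as a rough estimate only: it assumes the attack traffic spreads evenly across the nodes, which DNS round robin only approximates, and the function names and the 100Mbps per-node ceiling are illustrative choices rather than provider guarantees.

```python
import math


def per_node_mbps(attack_gbps: float, nodes: int) -> float:
    """Load per node in Mbps, assuming a perfectly even split."""
    return attack_gbps * 1000 / nodes


def nodes_needed(attack_gbps: float, max_mbps_per_node: float) -> int:
    """Smallest node count that keeps every node at or below the ceiling."""
    return math.ceil(attack_gbps * 1000 / max_mbps_per_node)


print(per_node_mbps(2, 20))    # the 2Gbps attack over 20 nodes
print(nodes_needed(2, 100))    # nodes for 2Gbps at 100Mbps per node
print(nodes_needed(10, 100))   # nodes for a 10Gbps attack at the same ceiling
```

Under these assumptions the 2Gbps case works out to 20 nodes at 100Mbps each, and a 10Gbps attack would call for about 100 nodes.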
So how are 20 VM nodes going to be affordable? VM prices have dropped dramatically over the last year. For the above configuration, a VM can cost between $5/mo and $10/mo. Taking roughly $8 as the average, that’s $8 * 20 = $160/mo. Knowing that most DDoS attackers have the attention span of a goldfish, the $160 is all you need to send your attacker and his accomplice packing.
Let’s talk more about the Nginx constellation configuration. The Nginx front-end nodes will run in proxy mode, caching static files and requests. The more aggressive the DDoS, the higher the time-to-live for cache objects should be. This prevents the Nginx nodes from proxy-passing requests to the quad core node. That said, if the main node has idle CPU and plenty of memory, it wouldn’t hurt to put it to good use to alleviate the burden on the Nginx front nodes. Your domain’s A records are going to be the IPs of the Nginx front nodes, configured in round robin fashion. DNS round robin has its shortcomings, namely that you have no control over how long (bad) records get cached by resolvers around the world. But in this case, it does not matter much. Just be sure to set a high TTL for the records so your DNS server does not collapse under the enormous volume.
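To get a feel for why a longer cache time-to-live shields the main server, here is a rough Python estimate. It is a sketch under simplifying assumptions: each front node refetches each cached URL from the origin at most once per expiry, uncacheable requests are ignored, and the figure of 500 cacheable URLs is purely hypothetical.

```python
def max_origin_rps(nodes: int, cacheable_urls: int, ttl_seconds: float) -> float:
    """Upper bound on origin requests per second caused by cache refreshes alone."""
    return nodes * cacheable_urls / ttl_seconds


# 20 front nodes, a hypothetical 500 cacheable URLs, three candidate TTLs
for ttl in (60, 600, 3600):
    rps = max_origin_rps(20, 500, ttl)
    print(f"TTL {ttl:>4}s: at most {rps:.1f} refresh requests/s reach the origin")
```

The point is that, under these assumptions, the refresh load on the origin depends on the number of nodes and the TTL rather than on the size of the attack, so raising the TTL lowers it proportionally.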
There are tons of online tutorials that go over the installation of Nginx as a reverse proxy, so be sure to read up on it. But we will list some of the particular settings that are needed to handle a large scale DDoS. Of importance are the number of Nginx worker processes and worker connections. Those values will need to be raised gradually to handle different kinds of attacks, depending on the VM resource allocation. But you should set them at least as high as the following:
worker_connections 4096; # Be sure to set ulimit -n 4096 or more
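A quick way to sanity-check the open-file limit against worker_connections is a small Python script. This is only a sketch: it reports the limit of the shell or process that runs it (the equivalent of ulimit -n), which is not necessarily the limit the Nginx workers end up with, and the helper name is ours.

```python
import resource

WORKER_CONNECTIONS = 4096  # value from the Nginx setting above


def fd_limit_sufficient(worker_connections: int, soft_limit: int) -> bool:
    """True if the open-file soft limit can cover worker_connections."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= worker_connections


# Soft and hard limits on open file descriptors for the current process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
status = "ok" if fd_limit_sufficient(WORKER_CONNECTIONS, soft) else "too low"
print(f"open files: soft={soft} hard={hard} -> {status}")
```

Keep in mind that on a reverse proxy the connection count includes connections to the proxied server as well as to clients, so some headroom above the bare minimum is sensible.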
Keep in mind that one still needs to gear up for the event by setting kernel and system variables on the Nginx nodes. Simple things like per-IP rate limiting, flood rate limits, and SYN cookies should be enabled without question. Here are some measures you can implement:
net.ipv4.tcp_syncookies = 1
# source address validation (reverse path filtering)
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
kernel.pid_max = 65536
net.ipv4.ip_local_port_range = 9000 65000
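To confirm that a node is actually running with these values, one option is to compare them against /proc/sys. The Python below is a minimal sketch with helper names of our own; it assumes a Linux node and relies on the simple dot-to-slash key mapping, which holds for the keys listed here but not for interface names that themselves contain dots.

```python
from pathlib import Path

WANTED = """
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
kernel.pid_max = 65536
net.ipv4.ip_local_port_range = 9000 65000
"""


def parse_sysctl(text):
    """Parse 'key = value' lines into {key: [value tokens]}, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.split()
    return settings


def proc_path(key, root="/proc/sys"):
    """Map a sysctl key such as net.ipv4.tcp_syncookies to its /proc/sys file."""
    return Path(root) / key.replace(".", "/")


def mismatches(wanted, root="/proc/sys"):
    """Return {key: (wanted, current)} for settings that differ or cannot be read."""
    result = {}
    for key, want in wanted.items():
        try:
            have = proc_path(key, root).read_text().split()
        except OSError:
            have = None
        if have != want:
            result[key] = (want, have)
    return result


for key, (want, have) in mismatches(parse_sysctl(WANTED)).items():
    print(f"{key}: wanted {want}, currently {have}")
```

The script prints nothing when every value already matches, and one line per setting that differs or is unreadable.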
In brief, here are the elements that constitute our solution:
That’s all, folks. We hope you enjoyed this article. Should you have any questions or comments, don’t hesitate to get in touch! No question is too minor and we are always looking for feedback.