How to build your own CDN using BIND, GeoIP, Nginx, and Varnish

Update (2011-Sep-14): UNIXy has built CDN software that you can run on your own hardware (or virtual machine) to build your own secure, flexible, private CDN. We’re working with individuals and organizations that are willing to trial the software. If you’re interested in joining the beta program for the CDN software, don’t hesitate to contact us. UNIXy clients are also welcome to join this program; they will also benefit from our fully managed service and continued engineering assistance with the software. Contact us today to get the ball rolling: https://www.unixy.net/secure/contact.php

Update (2011-Jul-07): Fastlayer, Varnish for the cloud, is born. Fastlayer is a multi-tenant accelerator software appliance built to run on something as small as a virtual machine. Read more here: fastlayer.com

Update (2011-Feb-22): We’ve released a Varnish-based cPanel script, should you be interested in boosting server performance. The benchmarks (conducted and confirmed by third parties) show that Varnish+Apache is much faster than Nginx, Litespeed, and Lighttpd. Read more here: http://www.unixy.net/varnish

In this article, we shall outline the steps required to build a private Content Delivery (or Distribution) Network (CDN) using VPS nodes running Varnish Cache and Nginx. The goal is to build a CDN using free, readily available software and, most importantly, to spend as little as possible. To this end, all nodes participating in this network are going to be virtual machines (Xen, Virtuozzo, OpenVZ, etc.). Should you have any questions or comments on the configuration of this CDN, please post them in this forum: http://www.varnish-cache.info/

Our global CDN can not only keep the latest copy of static files closer to our global visitors but can also cache the most used pages (dynamic or not) in memory on the edge nodes! This means fewer trips to the geographically distant and slower Dynamic Node (see below). This is similar to what Akamai and other well-known firms do, only at a fraction of the cost. However, in this article, and to keep things simple, we will only be caching static files.

Some of you might be surprised to learn that we built this global CDN free of charge for one of our beloved customers. UNIXY offers truly fully managed dedicated servers and clusters with managed servers starting as low as $45. Our motto is simple: what you cannot do with a few mouse clicks, we will gladly do it for you! Please visit us online when you have a chance: http://www.unixy.net. Please do ask if you have any question or comment. No question is minor!

  • The Big Picture

The illustration below presents a logical layout of our CDN. Edge nodes can be located just about anywhere in the world. One could also add more nodes at any location should there be a capacity need. The Dynamic Content Node will typically run a mixture of MySQL, Apache, and server-side software built using PHP, Ruby, Python, .Net, or any language for that matter.

 

[Figure: Global CDN]

 

  • Role of Each Software Component

Nginx is a lightweight, high-performance Web server that handles heavy traffic consistently. We are leveraging its proxy and caching capabilities: we shall compile Nginx and use its proxy module, which allows us to cache data on the local disks of the remote edge locations.

As its name implies, Varnish Cache is a high-performance caching engine used to keep recently accessed content in memory for the fastest access. Varnish is not a Web server, hence our need to bundle it with Nginx, which acts as the Web server at the edge nodes. We will cover Varnish in detail in our next installment.

And finally, the glue that holds all of these components together: BIND. BIND is the DNS software used to map Internet host names to IP addresses. We shall patch BIND to add support for geographical filters. In other words, BIND will serve each client the IP of the closest edge node in the CDN. For example, a visitor from Africa will receive the IP of the edge node in South Africa or Morocco, depending on the filters. We will touch on this later.
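The idea behind the geographical filters can be sketched in a few lines of Python. This is only an illustration of the matching logic, not BIND code: the country sets mirror the two views used in the named.conf excerpt further down, while the function name, the fallback behaviour, and the IP addresses (taken from the RFC 5737 documentation ranges) are assumptions made for the example.

```python
# Hypothetical illustration of GeoDNS view matching; this is not BIND code.
# The country sets mirror the "us" and "latin" views in named.conf; the
# IP addresses are placeholders from the RFC 5737 documentation ranges.
VIEWS = [
    ("us", {"US", "CA"}, "192.0.2.10"),
    ("latin", {"AR", "CL", "BR"}, "198.51.100.10"),
]
DEFAULT_EDGE_IP = "203.0.113.10"  # assumed fallback when no view matches


def pick_edge_ip(country_code: str) -> str:
    """Return the edge node IP of the first view whose country set matches."""
    code = country_code.upper()
    for _name, countries, edge_ip in VIEWS:
        if code in countries:
            return edge_ip
    return DEFAULT_EDGE_IP


if __name__ == "__main__":
    for cc in ("US", "br", "ZA"):
        print(cc, "->", pick_edge_ip(cc))
```

In the real setup the country lookup is done by the GeoIP library against the resolver's source address, and each view simply serves a different zone file for the same host name.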

  • Node Layout

At a minimum, we will need two nodes to demo and build our private CDN: one Dynamic Content Node and one Edge Location node. The Dynamic Content Node will run the full LAMP stack along with BIND and the geographical filters patch. The Edge Location node will run Nginx and Varnish. One could always run BIND+GeoIP on a separate node, which is good practice. We will assign the hostname dynamic_node to the Dynamic Content Node and edge_node to the Edge Location.

  • Installation and configuration

Download BIND from ISC: http://www.bind9.net/download

Download MaxMind’s C API: http://geolite.maxmind.com/download/geoip/api/c/

[root@dynamic_node /]# cd /usr/src/
[root@dynamic_node src]# wget http://mirrors.24-7-solutions.net/pub/isc/bind9/9.2.4/bind-9.2.4.tar.gz
[root@dynamic_node src]# wget http://geolite.maxmind.com/download/geoip/api/c/GeoIP-1.4.6.tar.gz
[root@dynamic_node src]# tar -xzvf bind-9.2.4.tar.gz
[root@dynamic_node src]# tar -xzvf GeoIP-1.4.6.tar.gz
[root@dynamic_node src]# cd GeoIP-1.4.6
[root@dynamic_node GeoIP-1.4.6]# ./configure --prefix=/usr/local/geoip
[root@dynamic_node GeoIP-1.4.6]# make
[root@dynamic_node GeoIP-1.4.6]# make install
[root@dynamic_node GeoIP-1.4.6]# cd ..
[root@dynamic_node src]# patch -p0 < bind-9.2.4-geodns-patch/patch.diff
[root@dynamic_node src]# cd bind-9.2.4
[root@dynamic_node bind-9.2.4]# CFLAGS="-I/usr/local/geoip/include" LDFLAGS="-L/usr/local/geoip/lib -lGeoIP" ./configure --prefix=/usr/local/bind
[root@dynamic_node bind-9.2.4]# make
[root@dynamic_node bind-9.2.4]# make install
Bind-GeoIP comes with a named.conf file with examples on how to use filtering. Set up your zone files and test them accordingly. The official GeoIP patch page has instructions and examples; be sure to read it should you need help: http://www.caraytech.com/geodns/. Note that the patch command above expects the patch to be unpacked at /usr/src/bind-9.2.4-geodns-patch. If you do not have access to nodes in different locations around the world to test your BIND configuration, http://traceroute.org is a good resource. It lets you test DNS resolution from remote locations using a looking glass (ping).
Here is how the filters should look inside named.conf:
view "us" {
    // Match clients from the US and Canada
    match-clients { country_US; country_CA; };
    // Authoritative answers only; no recursion for these clients
    recursion no;
    zone "cdn.unixy.net" {
        type master;
        file "pri/unixy-us.db";
    };
    zone "." IN {
        type hint;
        file "named.ca";
    };
};
view "latin" {
    // Match clients from Argentina, Chile, and Brazil
    match-clients { country_AR; country_CL; country_BR; };
    // Authoritative answers only; no recursion for these clients
    recursion no;
    zone "cdn.unixy.net" {
        type master;
        file "pri/unixy-latin.db";
    };
    zone "." IN {
        type hint;
        file "named.ca";
    };
};
Let us move on now and install Nginx and Varnish.
[root@edge_node src]# wget http://nginx.org/download/nginx-0.8.45.tar.gz
[root@edge_node src]# tar -xzvf nginx-0.8.45.tar.gz
[root@edge_node src]# cd nginx-0.8.45
[root@edge_node nginx-0.8.45]# ./configure --prefix=/usr/local/nginx --with-http_realip_module
[root@edge_node nginx-0.8.45]# make
[root@edge_node nginx-0.8.45]# make install
Here is our nginx.conf file with relevant lines only. All other configuration options are stock Nginx:

 

http {
    include       mime.types;
    default_type  application/octet-stream;
    sendfile        on;
    keepalive_timeout  65;
    upstream dynamic_node {
        server 1.1.1.1:80; # 1.1.1.1 is the IP of the Dynamic Node
    }
    server {
        listen       81;
        server_name  cdn.unixy.net;
        # Static files: fetch from the Dynamic Node and keep a copy on local disk
        location ~* \.(gif|jpg|jpeg|png|wmv|avi|mpg|mpeg|mp4|htm|html|js|css|mp3|swf|ico|flv)$ {
            proxy_set_header  X-Real-IP  $remote_addr;
            proxy_pass http://dynamic_node;
            proxy_store /var/www/cache$uri;
            proxy_store_access user:rw group:rw all:r;
        }
    }
}
The upstream, proxy_pass, and proxy_store lines above are the key ones that define our private CDN. The upstream is our Dynamic Node, to which we pass requests that cannot be served from cache. Nginx will only be caching static files like GIF, PNG, and JS; Varnish, on the other hand, can also cache dynamic pages. Notice how Nginx listens on port 81. This is because Varnish will listen on port 80 and will forward requests to Nginx on port 81. More on Varnish later.
Notice how we are using cdn.unixy.net as the virtual host name. It can be just about anything depending on your configuration. Once the cache builds up, you should start seeing files and directories being populated under /var/www/cache as instructed above.
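To make the on-disk layout concrete, here is a rough Python sketch of which request URIs the location block matches and where proxy_store /var/www/cache$uri would place them. It only models the path mapping; the function name stored_path is made up for this example, and the sketch assumes $uri is the request path without its query string.

```python
import re
from pathlib import PurePosixPath

# Same extension list as the location block in nginx.conf.
STATIC_RE = re.compile(
    r"\.(gif|jpg|jpeg|png|wmv|avi|mpg|mpeg|mp4|htm|html|js|css|mp3|swf|ico|flv)$",
    re.IGNORECASE,  # mirrors the case-insensitive "~*" modifier
)
CACHE_ROOT = PurePosixPath("/var/www/cache")


def stored_path(request_uri: str):
    """Return the disk path a static URI would be stored under, or None."""
    path = request_uri.split("?", 1)[0]  # $uri carries no query string
    if not STATIC_RE.search(path):
        return None  # not handled by the static location block
    return str(CACHE_ROOT / path.lstrip("/"))


if __name__ == "__main__":
    for uri in ("/images/logo.PNG", "/css/site.css?v=2", "/contact.php"):
        print(uri, "->", stored_path(uri))
```

Keep in mind that proxy_store writes plain files that stay on disk until something removes them, so stale copies need their own cleanup. Whether Nginx then answers later requests from that directory depends on the rest of the server configuration (for example a root and try_files pair), which the excerpt above does not show.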

A few seconds of browsing and the disk cache is already populating:

[root@edge_node /]# ls /var/www/cache
contact-unixy css images index.html javascript js
[root@edge_node /]#

Next we will proceed with installing Varnish. Varnish will act as an in-memory cache. While it is not necessary, it can improve response time greatly. Nonetheless, installing Varnish does add a level of complexity to our configuration.
[root@edge_node src]# wget -O varnish-2.1.2.tar.gz "http://downloads.sourceforge.net/project/varnish/varnish/2.1.2/varnish-2.1.2.tar.gz?use_mirror=cdnetworks-us-1&ts=1279434397"
[root@edge_node src]# tar -xzvf varnish-2.1.2.tar.gz
[root@edge_node src]# cd varnish-2.1.2
[root@edge_node varnish-2.1.2]# ./configure --prefix=/usr/local/varnish
[root@edge_node varnish-2.1.2]# make
[root@edge_node varnish-2.1.2]# make install
Be sure to follow guides online for the initial setup of Varnish. This article only covers the configuration of the CDN. There are certainly additional Varnish options that need tuning, but those are most likely specific to your application. The relevant parts of our VCL file are:
backend default {
    .host = "127.0.0.1";
    .port = "81";
}
sub vcl_recv {
.
.
.
    # Always look up static files in the cache
    if (req.url ~ "\.(js|css|jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf)$") {
        return (lookup);
    }
.
.
.
}
sub vcl_fetch {
.
.
.
    # Strip cookies from static responses (beresp in Varnish 2.1; obj in 2.0)
    if (req.url ~ "\.(js|css|jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf)$") {
        unset beresp.http.set-cookie;
    }
.
.
.
}
Go ahead and start up Varnish and browse around your portal a bit to build the cache. Run varnishstat on the edge node and you will be able to see the cache hits and misses. The share of hits should grow as the cache builds up over time and more objects are accessed.
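If you want a single number to watch, the hit ratio is hits / (hits + misses). The sketch below computes it in Python from counter lines in the "name value rate description" layout that varnishstat -1 prints; that layout, and the cache_hit and cache_miss counter names, are assumptions to check against your own output, and the sample figures are invented for illustration.

```python
def parse_counters(text: str) -> dict:
    """Parse 'name value ...' lines into a {name: int} dictionary."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def hit_ratio(counters: dict) -> float:
    """Return hits / (hits + misses), or 0.0 before any lookup happened."""
    hits = counters.get("cache_hit", 0)
    misses = counters.get("cache_miss", 0)
    total = hits + misses
    return hits / total if total else 0.0


# Invented sample in the assumed output layout.
SAMPLE = """\
client_conn   1200   2.00 Client connections accepted
cache_hit      900   1.50 Cache hits
cache_miss     300   0.50 Cache misses
"""

if __name__ == "__main__":
    print("hit ratio: %.2f" % hit_ratio(parse_counters(SAMPLE)))
```

A ratio that keeps climbing as you browse is the sign that the edge node is absorbing requests instead of passing them on.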

 

  • Wrap up

The instructions above can be replicated across however many additional Edge Nodes you want to add. One could also add redundancy to the BIND+GeoIP setup by configuring secondary nodes. The illustration below shows the flow of a request from top to bottom.

 

 

[Figure: CDN Request and Response Flow (CDN built using Nginx and Varnish)]
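The request flow on a single edge node can also be modelled as a toy in Python. This is a simplified sketch, not how Varnish or Nginx are implemented: the class and method names are invented, dictionaries stand in for the memory and disk caches, and it assumes each layer answers from its own copy whenever it has one.

```python
class EdgeNode:
    """Toy model of one edge node: a memory cache in front of a disk store."""

    def __init__(self, fetch_from_origin):
        self.memory = {}  # stands in for Varnish (port 80)
        self.disk = {}    # stands in for the Nginx disk store (port 81)
        self.fetch_from_origin = fetch_from_origin  # stands in for the Dynamic Node

    def get(self, uri):
        """Return (body, layer), where layer names who answered the request."""
        if uri in self.memory:
            return self.memory[uri], "varnish"
        if uri in self.disk:
            body, layer = self.disk[uri], "nginx"
        else:
            body, layer = self.fetch_from_origin(uri), "origin"
            self.disk[uri] = body  # keep a copy on the edge disk
        self.memory[uri] = body    # keep a copy in memory
        return body, layer


if __name__ == "__main__":
    node = EdgeNode(lambda uri: "content of " + uri)
    print(node.get("/css/site.css")[1])  # first request travels to the origin
    print(node.get("/css/site.css")[1])  # repeat request is answered from memory
```

Only the first request for a file pays for the trip to the Dynamic Node; every layer it passes through on the way back keeps a copy for the next visitor.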

That’s all folks! I hope you enjoyed this article.

14 Responses to “How to build your own CDN using BIND, GeoIP, Nginx, and Varnish”

  1. Quora - September 20, 2010

    How does Akamai CDN work?…

    While it’s not specific to Akamai’s implementation, you may find the article at http://blog.unixy.net/2010/07/how-to-build-your-own-cdn-using-bind-geoip-nginx-and-varnish/ to provide some useful information on this topic….

  2. RZ mit weltweiten Peers - WHL Community Foren - November 16, 2010

    [...] But NGINX can also do GeoIP directly. Here is a blog post by someone who put this together himself: http://blog.unixy.net/2010/07/how-to…x-and-varnish/ [...]

  3. Weekend Reading: Its Been a While - November 25, 2010

    [...] How to build your own CDN using BIND, GeoIP, Nginx, and Varnish [...]

  4. Maxim Veksler - February 20, 2011

    Hi, interesting setup.

    A question – Does this setup also handle DNS resolver proximity? Because if your DNS server sits in the US but you try to resolve from China, you still get degraded performance.

    Generally speaking: I’m personally intrigued to know how Akamai really works?

    Maxim.

  5. UNIXy - February 22, 2011

    Hi Maxim,

    The ratio of DNS queries to HTTP traffic is typically negligible. So the DNS response time lag for a visitor from China querying the GeoDNS node in the US is also negligible. DNS responses are cached for several minutes, if not hours, by local ISP resolvers, so DNS “trips” are minimal.

    Regards

  6. Mark - February 26, 2011

    How easy is it to add new CDN sites (e.g., an origin pull hostname) to the system? How would this be done?

  7. Sam Hamilton - June 8, 2011

    Thanks for the interesting article. Was just wondering how come you decided to use both Varnish and Nginx? Why not just use Nginx?

    Thanks
    Sam

  8. Yo - November 12, 2011

    Hello,

    Why not use Squid in this setup? Benefits of using Varnish instead of Squid?

  9. How to build your own CDN using BIND, GeoIP, Nginx, Varnish | Video Breakthroughs | Scoop.it - December 14, 2011

    [...] How to build your own CDN using BIND, GeoIP, Nginx, Varnish [...]

  10. Quora - December 22, 2011

    What would be total cost for Content delivery network setup in India?…

    CDN costs will be a function of Total Bandwidth that you consume on a monthly basis. Given that there are very few major ISPs and that your audience is possibly going to be situated close to the major metros, you might even consider rolling your own wi…

  11. viatwitter 19 by simonrobic - Pearltrees - January 9, 2012

    [...] How to build your own CDN using BIND, GeoIP, Nginx, Varnish | UNIXy [...]

  12. Build your own CDN — Matt Geri - January 20, 2012

    [...] How to build your own CDN using BIND, GeoIP, Nginx, and Varnish [...]

  13. Johan - February 15, 2012

    First of all, thanks for this great tutorial.
    I’m soon to upgrade my website to using a cdn, I have already written all my scripts to use http://cdn.mydomain.com, but currently, that subdomain is on the same server. On the dedicated server I’m renting, how would I change those subdomains settings to use BIND as explained?

  14. 5 CDN Software Solutions - Data Center Map Blog - April 24, 2012

    [...] are of course other solutions out there, as well as the Do It Your Self approach with for example BIND DNS, GeoIP and Ngihnx, but the purpose of this post is just to give [...]
