Update (2011-Sep-14): UNIXy has built CDN software that you can run on your own hardware (or virtual machine) to build your own secure, flexible, private CDN. We’re working with individuals and organizations that are willing to trial the software. If you’re interested in joining the beta program for the CDN software, don’t hesitate to contact us. UNIXY clients are also welcome to join this program, and they’ll benefit from our fully managed service and continued engineering assistance with the software. Contact us today to get the ball rolling: https://www.unixy.net/secure/contact.php
Update (2011-Jul-07): Fastlayer, Varnish for the cloud, is born. Fastlayer is a multi-tenant accelerator software appliance built to run on something as small as a single virtual machine. Read more here: fastlayer.com
Update (2011-Feb-22): We’ve released a Varnish-based cPanel script for anyone interested in boosting server performance. Benchmarks (conducted and confirmed by third parties) showed Varnish+Apache to be much faster than Nginx, Litespeed, and Lighttpd. Read more here: http://www.unixy.net/varnish
In this article, we shall outline the steps required to build a private Content Delivery (or Distribution) Network (CDN) using VPS nodes running Varnish Cache and Nginx. The goal is to build a CDN using free, readily available software and, most importantly, to spend as little as possible. To this end, all nodes participating in this network are going to be virtual machines (Xen, Virtuozzo, OpenVZ, etc). Should you have any questions or comments on the configuration of this CDN, please post them in this forum: http://www.varnish-cache.info/
Our global CDN can not only keep the latest copy of static files closer to our global visitors but can also cache the most used pages (dynamic or not) in memory on the edge nodes! This means fewer trips to the geographically distant and slower Dynamic Node (see below). This is similar to what Akamai and other well-known firms do, only at a fraction of the cost. However, to keep things simple, in this article we will only be caching static files.
Some of you might be surprised to learn that we built this global CDN free of charge for one of our beloved customers. UNIXY offers truly fully managed dedicated servers and clusters, with managed servers starting as low as $45. Our motto is simple: what you cannot do with a few mouse clicks, we will gladly do for you! Please visit us online when you have a chance: http://www.unixy.net. Please do ask if you have any questions or comments. No question is minor!
The illustration below presents a logical layout of our CDN. Edge nodes can be located just about anywhere in the world. One could also add more nodes at any location should there be a capacity need. The Dynamic Content Node will typically run a mixture of MySQL, Apache, and server-side software built using PHP, Ruby, Python, .Net, or any language for that matter.
Nginx is a lightweight, high-performance Web server that handles large volumes of traffic consistently. We shall compile Nginx from source and use its proxy module, which lets us store data fetched from the Dynamic Content Node on the local disks of the remote edge locations.
As its name implies, Varnish Cache is a high-performance caching engine used to keep recently accessed content in memory for the fastest possible access. Varnish is not a Web server, hence our need to pair it with Nginx, which acts as the Web server at the edge nodes. We will cover Varnish in detail in our next installment.
And finally, the glue that holds all of these components together: BIND. BIND is the DNS software used to map Internet host names to IP addresses. We shall patch BIND to add support for geographical filters. In other words, BIND will serve each client the IP of the closest edge node in the CDN. For example, a visitor from Africa will receive the IP of the edge node in South Africa or Morocco, depending on the filters. We will touch on this later.
At a minimum, we will need two nodes to build and demo our private CDN: one Dynamic Content Node and one Edge Location node. The Dynamic Content Node will run the full LAMP stack along with BIND and the geographical filters patch. The Edge Location node will run Nginx and Varnish. Running BIND+GeoIP on a separate node is good practice and always an option. We will assign the hostname dynamic_node to the Dynamic Content Node and edge_node to the Edge Location.
Download BIND from ISC: http://www.bind9.net/download
Download MaxMind’s C API: http://geolite.maxmind.com/download/geoip/api/c/
[root@dynamic_node /]# cd /usr/src/
[root@dynamic_node src]# wget http://mirrors.24-7-solutions.net/pub/isc/bind9/9.2.4/bind-9.2.4.tar.gz
[root@dynamic_node src]# wget http://geolite.maxmind.com/download/geoip/api/c/GeoIP-1.4.6.tar.gz
[root@dynamic_node src]# tar -xzvf bind-9.2.4.tar.gz
[root@dynamic_node src]# tar -xzvf GeoIP-1.4.6.tar.gz
[root@dynamic_node src]# cd GeoIP-1.4.6
[root@dynamic_node GeoIP-1.4.6]# ./configure --prefix=/usr/local/geoip
[root@dynamic_node GeoIP-1.4.6]# make
[root@dynamic_node GeoIP-1.4.6]# make install
[root@dynamic_node GeoIP-1.4.6]# cd ..
[root@dynamic_node src]# patch -p0 < bind-9.2.4-geodns-patch/patch.diff
[root@dynamic_node src]# cd bind-9.2.4
[root@dynamic_node bind-9.2.4]# CFLAGS="-I/usr/local/geoip/include" LDFLAGS="-L/usr/local/geoip/lib -lGeoIP" ./configure --prefix=/usr/local/bind
[root@dynamic_node bind-9.2.4]# make
[root@dynamic_node bind-9.2.4]# make install
view "us" {
    // Match clients from US & Canada
    match-clients { country_US; country_CA; };
    // Provide recursive service to internal clients only.
    recursion no;
    zone "cdn.unixy.net" {
        type master;
        file "pri/unixy-us.db";
    };
    zone "." IN {
        type hint;
        file "named.ca";
    };
};
view "latin" {
    // Match from Argentina, Chile and Brazil
    match-clients { country_AR; country_CL; country_BR; };
    // Provide recursive service to internal clients only.
    recursion no;
    zone "cdn.unixy.net" {
        type master;
        file "pri/unixy-latin.db";
    };
    zone "." IN {
        type hint;
        file "named.ca";
    };
};
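To make the view matching concrete, here is a minimal Python sketch of first-match view selection. It only models the behaviour and is not how the patched BIND is implemented: the `VIEWS` table mirrors the two views above, and the `select_view` helper is a hypothetical name introduced for illustration.

```python
# Hypothetical model of first-match view selection; not BIND's actual code.
# Country lists and zone files mirror the "us" and "latin" views above.
VIEWS = [
    ("us", {"US", "CA"}, "pri/unixy-us.db"),
    ("latin", {"AR", "CL", "BR"}, "pri/unixy-latin.db"),
]


def select_view(country_code):
    """Return (view_name, zone_file) for the first matching view, else None."""
    code = country_code.upper()
    for name, countries, zone_file in VIEWS:
        if code in countries:
            return name, zone_file
    return None


if __name__ == "__main__":
    for cc in ("US", "BR", "ZA"):
        print(cc, "->", select_view(cc))
```

A client from a country that no view lists falls through and gets `None` here, so a real deployment would probably want a final catch-all view (for example one with `match-clients { any; };`).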
[root@edge_node src]# wget http://nginx.org/download/nginx-0.8.45.tar.gz
[root@edge_node src]# tar -xzvf nginx-0.8.45.tar.gz
[root@edge_node src]# cd nginx-0.8.45
[root@edge_node nginx-0.8.45]# ./configure --prefix=/usr/local/nginx --with-http_realip_module
[root@edge_node nginx-0.8.45]# make
[root@edge_node nginx-0.8.45]# make install
http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;

    upstream dynamic_node {
        server 1.1.1.1:80; # 1.1.1.1 is the IP of the Dynamic Node
    }

    server {
        listen 81;
        server_name cdn.unixy.net;

        location ~* \.(gif|jpg|jpeg|png|wmv|avi|mpg|mpeg|mp4|htm|html|js|css|mp3|swf|ico|flv)$ {
            proxy_set_header X-Real-IP $remote_addr;
            proxy_pass http://dynamic_node;
            proxy_store /var/www/cache$uri;
            proxy_store_access user:rw group:rw all:r;
        }
    }
}
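As a rough illustration of what this location block does, the Python sketch below shows which request URIs the extension pattern matches and where `proxy_store` would write the copy. The `cache_path` helper and the `STATIC_RE` and `CACHE_ROOT` names are hypothetical; the pattern and the `/var/www/cache` prefix are taken from the configuration above.

```python
import re

# Same extension list as the nginx location above; "~*" means case-insensitive.
STATIC_RE = re.compile(
    r"\.(gif|jpg|jpeg|png|wmv|avi|mpg|mpeg|mp4|htm|html|js|css|mp3|swf|ico|flv)$",
    re.IGNORECASE,
)
CACHE_ROOT = "/var/www/cache"  # prefix used in the proxy_store directive


def cache_path(uri):
    """Return the on-disk path a matching URI would be stored under, else None.

    nginx's $uri excludes the query string, so it is dropped here as well.
    """
    path = uri.split("?", 1)[0]
    if STATIC_RE.search(path):
        return CACHE_ROOT + path
    return None


if __name__ == "__main__":
    for uri in ("/images/logo.PNG", "/css/site.css?v=2", "/index.php"):
        print(uri, "->", cache_path(uri))
```

Note that `proxy_store` only writes copies and never expires them. The nginx documentation pairs it with a `root` and an `error_page 404` fallback so that stored files are answered locally, which may be worth adding if the edge node should stop contacting the Dynamic Node for files it already holds.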
A few seconds of browsing and the disk cache is already populating:
[root@edge_node /]# ls -al /var/www/cache
contact-unixy css images index.html javascript js
[root@edge_node /]#
[root@edge_node src]# wget -O varnish-2.1.2.tar.gz "http://downloads.sourceforge.net/project/varnish/varnish/2.1.2/varnish-2.1.2.tar.gz?use_mirror=cdnetworks-us-1&ts=1279434397"
[root@edge_node src]# tar -xzvf varnish-2.1.2.tar.gz
[root@edge_node src]# cd varnish-2.1.2
[root@edge_node varnish-2.1.2]# ./configure --prefix=/usr/local/varnish
[root@edge_node varnish-2.1.2]# make
[root@edge_node varnish-2.1.2]# make install
backend default {
    .host = "127.0.0.1";
    .port = "81";
}
sub vcl_recv {
    .
    .
    .
    if (req.url ~ "\.(js|css|jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf)$") {
        return (lookup);
    }
    .
    .
    .
}
sub vcl_fetch {
    .
    .
    .
    if (req.url ~ "\.(js|css|jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf)$") {
        # In Varnish 2.1 the backend response in vcl_fetch is named beresp (obj in 2.0).
        unset beresp.http.set-cookie;
    }
    .
    .
    .
}
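One detail worth checking in the VCL above is exactly which URLs the pattern matches. The Python sketch below reuses the same extension list; `matches_static_rule` is a hypothetical helper, and the two behaviours it encodes (VCL's `~` is case-sensitive, and `req.url` still carries the query string) are noted in the comments.

```python
import re

# Same pattern as the VCL above. VCL's "~" is case-sensitive, and req.url
# includes the query string, so both affect what matches.
VCL_STATIC_RE = re.compile(
    r"\.(js|css|jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf)$"
)


def matches_static_rule(req_url):
    """Return True if the URL would hit the static-file branch of the VCL."""
    return VCL_STATIC_RE.search(req_url) is not None


if __name__ == "__main__":
    for url in ("/js/app.js", "/js/app.js?v=3", "/images/LOGO.PNG"):
        print(url, "->", matches_static_rule(url))
```

If versioned URLs such as `/js/app.js?v=3` or upper-case extensions are common on a site, the pattern may need adjusting so that those requests take the same branch.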
The instructions above can be replicated across as many additional Edge Nodes as you want to add. One could also add redundancy to the BIND+GeoIP setup by configuring secondary nodes. The illustration below shows the flow of a request from top to bottom.
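As a rough companion to that flow, here is a toy Python model of the intended request path on a single edge node: Varnish answers from memory when it can, otherwise Nginx answers from disk, otherwise the request goes to the Dynamic Content Node. The `EdgeNode` class, its attributes, and the `origin` callable are all hypothetical names, and the dictionaries merely stand in for the real caches.

```python
class EdgeNode:
    """Toy model of one edge node: a memory tier in front of a disk tier."""

    def __init__(self, origin):
        self.memory = {}      # stands in for Varnish's in-memory cache
        self.disk = {}        # stands in for files under /var/www/cache
        self.origin = origin  # callable that fetches from the Dynamic Content Node

    def get(self, uri):
        """Return (body, tier), where tier names the layer that answered."""
        if uri in self.memory:
            return self.memory[uri], "varnish"
        if uri in self.disk:
            body, tier = self.disk[uri], "nginx"
        else:
            body, tier = self.origin(uri), "origin"
            self.disk[uri] = body  # keep a copy on the edge node's disk
        self.memory[uri] = body    # promote to the memory tier
        return body, tier


if __name__ == "__main__":
    node = EdgeNode(lambda uri: "content of " + uri)
    print(node.get("/css/site.css"))  # first request reaches the origin
    print(node.get("/css/site.css"))  # repeat request is served from memory
```

Only the first request for a given file travels to the distant Dynamic Content Node; later requests stay on the edge node, which is the point of the whole setup.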
3 Responses to “How to build your own CDN using BIND, GeoIP, Nginx, and Varnish”
Hi, interesting setup.
A question: does this setup also handle DNS resolver proximity? Because if your DNS server sits in the US but you try to resolve from China, you still get degraded performance.
Generally speaking, I’m personally intrigued to know how Akamai really works.
Maxim.
Hi Maxim,
The ratio of DNS queries to HTTP traffic is typically negligible, so the DNS response lag for a visitor from China querying the GeoDNS node in the US is also negligible. DNS responses are cached for several minutes, if not hours, by local ISP resolvers, so DNS “trips” are minimal.
Regards
How easy is it to add new CDN sites (e.g., an origin pull hostname) to the system? How would this be done?