Building a Better Switchboard with Docker

January 22nd, 2014 by rhaldar under News & Announcements, Posts By Alex, Switchboard


With Switchboard, we set out to develop a worldwide cloud service where Internet connection speed and reliability for our customers is everything. Like many Internet service companies, we started out on Amazon Web Services (AWS) Elastic Compute Cloud (EC2). We settled into an impressive and dependable set of functionality with EC2, but we knew that the honeymoon would soon be over; the locations and pricing of AWS are just too restrictive for what we need. With Docker, we've been able to happily move away from EC2, while offering a better, faster service at a lower cost, without sacrificing any of the features of EC2 that we loved.

What’s to like about AWS EC2?

Amazon EC2 has eight locations spread around the world, and lets you create (or provision) a new server at any of these locations pretty quickly. You get charged for usage, so you pay by the hour for each server you have running. If you only need to spin up a new server for an hour or two to test whether some system updates will apply cleanly, you only pay for the time you had the new server running. EC2 also makes it very easy to take a snapshot of a running server, and you can copy that snapshot (or AMI) to any of their locations to bring up copies of that server. And it provides a consistent feature set: whether you manage with the web interface, or script your operations using the APIs or command-line tools, you can manage all of your servers on EC2, regardless of their location.

Finally, to paraphrase the popular cliché, "nobody ever got fired for buying IBM", nobody ever got fired for choosing AWS. Running your service on AWS means that if the data center hosting your stuff goes down, chances are good that all the tech news sites are going to be discussing the downtime as it happens, and in general, the reliability and uptime of AWS and EC2 are really good.
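To give a feel for that scripted provisioning, here is a minimal sketch using boto3, the current AWS SDK for Python (a newer tool than what we had when this was written); the region, AMI ID, and key pair name below are placeholders, not real values:

```python
# A minimal sketch of scripted EC2 provisioning with boto3 (which
# postdates this post). The AMI ID and key pair name are placeholders.
import boto3

ec2 = boto3.resource("ec2", region_name="eu-west-1")  # the Ireland region

# Launch one server from a saved snapshot (AMI) of an existing machine.
instances = ec2.create_instances(
    ImageId="ami-00000000",     # placeholder AMI ID
    InstanceType="c1.medium",
    KeyName="admin-key",        # placeholder key pair name
    MinCount=1,
    MaxCount=1,
)
print("Launched:", instances[0].id)
```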

What’s wrong with EC2?

While the AWS data centers are in many key locations, they’re not close enough to our customers. We found out right away from our Early Access customers in Europe that the EC2 Ireland data center offered pretty poor speeds to mainland Europe, so we knew that we would need more locations on the mainland. Looking at where our customers were, and what sorts of speeds they needed, we found that we’d need servers in France right away, with Spain, Germany, and the Netherlands to follow. Similarly in the US, the EC2 data centers on the coasts offered pretty poor speeds for our customers in the middle of the country, and even for Southern California and Florida.

Pricing on AWS is also really restrictive, both in bandwidth costs and in performance per dollar. For a network-intensive virtual private server (VPS) in the US with two CPU cores and 2GB of RAM, here is how AWS compares to an arbitrary VPS provider:

Service       | Base Cost / month | Bandwidth Cost (10TB) | Total Monthly Cost
2host A5 XEN  | $30               | $0                    | $30
AWS c1.medium | $104              | $1200                 | $1304
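To make the bandwidth arithmetic explicit: EC2's $1200 for 10TB works out to $0.12 per GB, while 2host's plan includes the bandwidth. A couple of lines of Python (rates taken from the table above) let you redo the comparison for any usage level:

```python
# Recompute the table above for an arbitrary monthly transfer volume.
# Rates come from the table: 2host includes bandwidth; EC2's $1200 for
# 10TB works out to $0.12 per GB.
def monthly_cost(base_usd, usd_per_gb, transfer_gb):
    return base_usd + usd_per_gb * transfer_gb

transfer = 10 * 1000  # 10TB, in GB
print("2host A5 XEN:  $", monthly_cost(30, 0.00, transfer))   # $30
print("AWS c1.medium: $", monthly_cost(104, 0.12, transfer))  # $1304
```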

Who are the competitors?

With AWS not cutting it, our options were to go with a multitude of VPS and dedicated server providers, or work out deals to colocate our own bare metal in data centers where we need them.  There are a LOT of VPS providers, from the large (Rackspace, Digital Ocean, GoGrid, Microsoft Azure, Google Compute Engine, Verizon Cloud) to the small (take your pick here). Similarly, there are a lot of data centers offering colocation.

Many of the big VPS providers suffer from the same problems that AWS does (non-ideal locations and high prices). The small VPS providers are really a mixed bag of pricing models and features, and each needs to be evaluated to get a feel for how oversold their servers and network connections are. Colocating bare-metal around the world is a pretty expensive and lengthy process, and in the end, the colocation fees can easily outweigh what a VPS setup would cost to provide the level of service needed for that location.

We opted to move towards a mix of big and small VPS providers, selecting based on location and price/performance ratio, and vetting each for actual reliability and performance. This lets us roll out to a lot of locations very quickly, and adds redundancy, such as using different providers in different data centers for the same city.

So, while the move to many VPS providers lets us put servers right where our customers need them, and at an overall lower cost, it comes at the expense of what we loved about EC2: namely, the consistent management feature set and the ease of provisioning new servers. Thankfully, with a tool like Docker running on all our servers, we can get back to a consistent management interface, above and beyond what we could do with EC2.

So what’s Docker?

We went looking for a tool that would help us implement a consistent approach on our heterogeneous mix of servers, and we found Docker. Docker is built on top of Linux Containers (LXC), which lets you put an application into a container: almost like a miniature virtual machine that shares the host's resources and runs on the host's kernel, but otherwise isolates itself from everything else on the system. Docker adds some special sauce on top of LXC, making it much easier to create, manage, and run these containers. We've been watching the Docker project for a while now, and we're very excited to finally include it in our infrastructure.
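As a concrete illustration, here is what starting a container looks like through docker-py, the Docker SDK for Python, in its modern form (which postdates this post); it assumes a local Docker daemon with the ubuntu image already pulled:

```python
# A minimal sketch using the Docker SDK for Python (docker-py).
# Assumes a local Docker daemon and a pulled "ubuntu" image.
import docker

client = docker.from_env()

# The container gets its own filesystem, processes, and network stack,
# but shares the host's kernel -- there is no hypervisor underneath.
output = client.containers.run("ubuntu", "uname -r")
print(output)  # prints the host's kernel version, since the kernel is shared
```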

How can Docker help?

Docker allows us to recover many of the advantages we lost when moving away from EC2. With Docker, we can have a new container up and running in a fraction of a second, which is even faster than provisioning a new machine on EC2. Our most common reason for provisioning a new server is either to apply software upgrades to an existing location without affecting service, or to serve a new location altogether. Docker easily lets us spin up a new Linux container running the distro of our choice and fire up our application in one fell swoop.
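A hypothetical sketch of that upgrade flow (the "speedserver" image and container names are illustrative only, not our actual setup): bring up a container from the new image, verify it, then retire the old one.

```python
# Hypothetical upgrade-without-disruption flow with the Docker SDK for
# Python. Image and container names are illustrative placeholders.
import docker

client = docker.from_env()

# Fire up the new version alongside the old one.
new = client.containers.run("speedserver:v2", detach=True, name="speed-v2")
new.reload()                    # refresh cached state from the daemon
assert new.status == "running"  # sanity check before cutting over

# Retire the old container; traffic cutover would happen here.
old = client.containers.get("speed-v1")
old.stop()
```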

It also allows us to easily snapshot any running server, and transfer that to any of our locations. We can version control every change we make to our servers, and we can export a container as a single file to transport to any other location as we see fit. We can also run a private registry, similar to the Docker Index, and push our container images to a central repository that is private and that we control.
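A sketch of that snapshot-and-ship workflow, again through docker-py (the registry hostname and image names here are hypothetical):

```python
# Commit a running container to an image, then push that image to a
# private registry so any other location can pull it. Names are
# hypothetical placeholders.
import docker

client = docker.from_env()

container = client.containers.get("speed-v2")
container.commit(repository="registry.example.com/speedserver", tag="v2.1")

# Push the snapshot to the private registry we control.
client.images.push("registry.example.com/speedserver", tag="v2.1")
```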

Finally, it provides a consistent feature set. Management tasks for all of the containers running our application stack are the same across all providers. We can script what we need on top of Docker’s command-line interface or use its REST API.
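For instance, because every host speaks the same Docker API, one short loop can inventory the containers on every provider. A sketch, assuming each daemon is configured to listen on TCP (hostnames and port are placeholders, and in practice the connection should be secured with TLS):

```python
# Inventory containers across all hosts via Docker's REST API, wrapped
# here by docker-py. Hostnames and port are placeholders; secure the
# daemon with TLS before exposing it over TCP.
import docker

HOSTS = ["paris-1.example.com", "dallas-1.example.com"]

for host in HOSTS:
    client = docker.DockerClient(base_url="tcp://%s:2375" % host)
    for container in client.containers.list():
        print(host, container.name, container.status)
```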

What does this mean for our customers?

All of this means that we can pick up the pace and roll out to more and more Switchboard Speed Server locations. It means that server upgrades won’t result in any disruption of service. And it means that Switchboard will get bigger, better and faster.