First post. It seems like a little overview of Ravelry’s network, hardware, and other systems information would be a sensible way to start things off.
In this post: VPSes, Linux, Virtualization, hosting and bandwidth, Amazon S3
a starter home
If you are thinking about embarking on a web development project, I’d highly recommend that you find a host that can provide you with a VPS (Virtual Private Server). A small VPS can be had for as low as $10-$20. Unlike a traditional dirt cheap shared host, you have full control over an entire virtual server plus the flexibility to add memory and disk space when you are ready for it. I used (and am still using) a host called Rimuhosting – their pricing is competitive, their support people are very responsive, and they offer a pretty good menu of options including dedicated servers.
Jess and I started fooling around with ideas and code back in January. It would have been silly and expensive for us to build out our systems before we needed them, so we started with a small VPS. (Okay – so Ravelry wasn’t really in our living room when we started – it was sitting on a machine in Texas.)
Eventually, we outgrew the VPS and moved up to an inexpensive leased dedicated server with the same host. When the dedicated server started to feel pokey because of limited resources, we knew that it was time to design and build our “real” setup. At this point, Ravelry was out of the starting gate and we weren’t afraid to invest money in infrastructure.
I knew that I wanted to own and manage our own servers and host them in a conveniently located datacenter. We live in Boston, so it wasn’t too hard to find a good datacenter nearby. We ended up getting a half cabinet with a company called Hosted Solutions because they were convenient, friendly, and not absurdly huge.
To get started, I decided that we’d need two big servers (ripe for virtualization, keep reading…), a firewall, and a gigabit switch. We overdid it a little – our budget was $10K and Dad had to give us a hand with the firewall purchase (thanks, Dad!). Still, I was very happy with what we put together. Here is what I was shopping for:
- Inexpensive Linux servers with quality components and lots of options – I chose Silicon Mechanics.
- Intel Xeons. I don’t know if AMD64 + Linux still comes with a few guaranteed headaches but I wasn’t about to find out. Plus, I really wanted to try out those dual quad core Xeons.
- Piles of memory. With many Mongrels and 2 MySQLs, we’ve got lots of mouths to feed. 16 GB is the current sweet spot.
- Serial ATA RAID – 4 drives + RAID10 = speed and durability on a budget. Also, plenty of disk space. Serial ATA is cheap and running out of space is no fun.
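To put a number on the RAID10 trade-off, here is a minimal sketch of the capacity arithmetic. The 500 GB drive size is a made-up figure for illustration, not the size that was actually bought.

```python
def raid10_usable_gb(drive_count: int, drive_size_gb: float) -> float:
    """Usable capacity of a RAID10 array built from identical drives."""
    if drive_count < 4 or drive_count % 2 != 0:
        raise ValueError("RAID10 needs an even number of drives, at least 4")
    # Every block is written to both drives of a mirrored pair,
    # so only half of the raw capacity is usable.
    return drive_count * drive_size_gb / 2


# Hypothetical 500 GB drives; the real drive size is not stated here.
print(raid10_usable_gb(4, 500))  # 1000.0
```

Half the raw space goes to mirroring. In exchange the array survives any single drive failure, and reads and writes are striped across the two mirrored pairs.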
The other components were pretty basic – a Dell managed gigabit switch and a Cisco ASA firewall. Good, standard stuff. Little fuss.
<– Bob, stress testing the firewall and switch
avoiding those emergency trips to the datacenter
Do yourself a favor and splurge on IPMI or integrated KVM over IP. Sometimes this is built into an onboard card, sometimes it’s just a little extra add-on. It’s really nice to be able to have console access from home without bothering with more hardware, more wires, more crap.
Even better, my machines have this IPMI business which also provides remote access for power off/on, reset, and various sensors. Pretty much all of the management you need and served just the way you want it – over the network, no extra equipment, little extra expense.
It was especially convenient when I was setting the machines up in the datacenter. Just the ability to cable my laptop to a server for management without a keyboard and display was worth the added expense.
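If your management card speaks standard IPMI over the network, the open source `ipmitool` client can usually drive it from a laptop. The sketch below only builds and prints the command lines. The address and account name are hypothetical, and using `ipmitool` at all is an assumption rather than a description of the console shown in this post.

```python
import shlex


def ipmitool_command(host: str, user: str, *action: str) -> list[str]:
    """Build an ipmitool argv for a management card reachable over the LAN.

    -I lanplus selects IPMI v2.0 over the network, and -E tells ipmitool
    to read the password from the IPMI_PASSWORD environment variable
    instead of taking it on the command line.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *action]


# Hypothetical management address and account name.
bmc = ("10.0.0.21", "admin")

for action in (
    ("chassis", "power", "status"),  # is the box powered on?
    ("chassis", "power", "cycle"),   # hard power cycle
    ("sensor", "list"),              # temperatures, fans, voltages
    ("sol", "activate"),             # serial-over-LAN console
):
    print(shlex.join(ipmitool_command(*bmc, *action)))
```

Nothing here touches a real machine. Paste a printed line into a shell (with `IPMI_PASSWORD` set) once you have checked it against your own hardware.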
My IPMI console: not beautiful, but very useful:
which flavor of Linux?
Gentoo. Portage (Gentoo’s package system) rules. I don’t want your crummy binary packages. I want the “configure/install from the source” flexibility with the easy upgrading, administration, and dependency handling that a package manager provides. Also, Gentoo’s init and filesystem layouts are nice.
Gentoo is my favorite Linux and I’m very comfortable with it. I encourage you to use your favorite Linux if that is what you are most comfortable with.
My favorite part. I’m probably driving people nuts with the yammering about Xen and virtualization all the time. In short, I took my two big servers and turned them into 8 “just right” virtual servers. If you want to learn a little more about what virtualization actually is, check out the overview pages over at XenSource.
Okay, why virtualize? My own opinion:
- Today’s servers are big – you will likely waste your resources if you run one “server” per machine.
- A system installed directly on your bare hardware is going to be less flexible, more brittle, and more of a headache to maintain and manage.
- Easy migration and duplication – if your machines are all virtual you can easily shuffle them around, duplicate them, and so on.
I can’t stress enough how happy I am that all of our server instances are virtual. Management and upgrading are actually fun. Do it. Really.
Pass the Xen, please.
Xen is the king of virtualization. It is a fantastic open source product and it costs you nothing. Three really important things to think about before you start:
- If you need to run any 64-bit virtual machines, Xen needs to be running on a 64-bit Linux and all of your VMs have to run on 64-bit kernels. Remember that you can conveniently run an otherwise all 32-bit Linux with a 64-bit kernel. This is what we are doing – 64-bit kernels everywhere but 64-bit software only on the MySQL hosts that need it.
- USE LVM FOR YOUR DISKS. If you are virtualizing but sticking with traditional disk partitions or image files for your disks, you are losing out on a lot of flexibility and performance. Use LVM2 to set up your disks – the HOWTO is here. With LVM, you’ll have flexible disks that you can create, delete, resize, and snapshot as needed. Want to be able to duplicate one of your virtual servers while it is running and start up a new copy for testing or as a base for a new machine? You need LVM.
- If you are doing it all from scratch (including building the kernel) like I did, you’ll probably render your machine unbootable a couple of times. Have a boot CD handy. The Gentoo Xen HOWTO is quite handy.
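The "duplicate a running server" trick from the LVM point can be sketched as a short sequence of commands. This is an illustration under assumptions, not a record of the exact steps used here: the volume group `vg0` and the volume names are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def clone_volume_commands(vg: str, source: str, clone: str,
                          size: str, snap_size: str = "2G") -> list[list[str]]:
    """Commands to copy a logical volume while the guest using it keeps running."""
    snap = f"{source}-snap"
    return [
        # 1. Freeze a point-in-time view of the live volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap, f"/dev/{vg}/{source}"],
        # 2. Create an empty volume to hold the copy.
        ["lvcreate", "--size", size, "--name", clone, vg],
        # 3. Copy the frozen view block for block.
        ["dd", f"if=/dev/{vg}/{snap}", f"of=/dev/{vg}/{clone}", "bs=4M"],
        # 4. Drop the snapshot so it stops consuming copy-on-write space.
        ["lvremove", "--force", f"/dev/{vg}/{snap}"],
    ]


# Hypothetical names: copy app1's root disk to start a new guest, app3.
for cmd in clone_volume_commands("vg0", "app1-root", "app3-root", size="10G"):
    print(shlex.join(cmd))
```

The new volume has to be at least as large as the source, and the snapshot needs enough room (`snap_size`) to absorb writes that land while the copy runs. The copy looks like a disk after a sudden power cut, so the filesystem on it may replay its journal on first mount.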
I can’t say enough good things about how freeing it is to have a large group of virtual machines that are riding on Xen and LVM. I can move servers to other machines, upgrade software, and do all kinds of other world-changing things with ease.
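For a feel of how little there is to a guest definition, here is a minimal sketch in the classic `xm` config format, which happens to be plain Python syntax. Every name, path, and size below is hypothetical; it is not one of the actual Ravelry configs.

```python
# /etc/xen/app1: hypothetical guest definition in the classic xm format.
name = "app1"
memory = 2048                      # MB of RAM given to this guest
vcpus = 2
kernel = "/boot/vmlinuz-xen-domU"  # guest kernel image, stored on the host
disk = [
    "phy:/dev/vg0/app1-root,sda1,w",  # LVM volume used as the root disk
    "phy:/dev/vg0/app1-swap,sda2,w",  # LVM volume used as swap
]
root = "/dev/sda1 ro"
vif = ["bridge=xenbr0"]            # one network interface on the host bridge
```

With a file like this in place, `xm create app1` boots the guest. Because the disks are just LVM volumes, moving the guest to another host largely comes down to copying the volumes and this file.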
Xen’s “top” – this box is running 4 virtual machines: 1 web server, 2 app servers, 1 slave MySQL:
bandwidth: our biggest expense
Here are a few tips for not wasting money on bandwidth:
- Shop around! You’d be surprised how much prices can vary. Make sure to be aware of the reliability/redundancy behind the bandwidth that you are purchasing. In-datacenter bandwidth is often a “safer” blend of several different carriers and that security comes at an added expense.
- Take full advantage of Amazon’s S3 storage service! If you are hosting any large images on your site, you should seriously consider moving them to Amazon. Take a look at the pricing and think about how it would work for you. It is really cheap.
- Divide things up logically and don’t skimp on hostnames/virtual hosts. If you separate all of your static content hosts now (even if the names all point to the same machine) you’ll have an easier time fiddling with things later. Maybe we went overboard with avatars.ravelry, images.ravelry, images3.ravelry, assets.ravelry, creative.ravelry, etc… but the division has made things a lot easier for me. As an example, we’re looking at buying some cheaper in-datacenter bandwidth from a single carrier and if we decide to do it, it will be trivial for me to offload a bunch of those hostnames to the new connection. It also makes life easier when it comes to usage reporting on all of these different types of resources because the log files are already split out.
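On the application side, the hostname split can be as small as one lookup table. A minimal sketch, with `example.com` hostnames standing in for the real ones and a hypothetical `asset_url` helper:

```python
# Hypothetical hostnames; example.com stands in for the real domain.
ASSET_HOSTS = {
    "avatar": "avatars.example.com",
    "image": "images.example.com",
    "asset": "assets.example.com",
}


def asset_url(kind: str, path: str) -> str:
    """Return the public URL for a static file of the given kind."""
    return f"http://{ASSET_HOSTS[kind]}/{path.lstrip('/')}"


print(asset_url("avatar", "/users/42/small.jpg"))
# http://avatars.example.com/users/42/small.jpg
```

Since every page builds its links through the table, moving avatars to a cheaper connection or to S3 is a DNS change for that one hostname and the application code stays as it is.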
any questions or comments?
I guess that’s it for server/network related stuff! If you have any questions that are related to the things that I wrote about (curious Ravelry users – I’m talking to you) leave a comment and I’d love to answer in the next post.
Next time: I might talk about how I monitor and manage all of these virtual machines. It’s not much work at all, and the little ecosystem of management software is pretty interesting.