Moving out of the living room

First post. It seems like a little overview of Ravelry’s network, hardware, and other systems information would be a sensible way to start things off.

In this post: VPSes, Linux, Virtualization, hosting and bandwidth, Amazon S3

a starter home

If you are thinking about embarking on a web development project, I’d highly recommend that you find a host that can provide you with a VPS (Virtual Private Server). A small VPS can be had for as low as $10-$20. Unlike a traditional dirt cheap shared host, you have full control over an entire virtual server plus the flexibility to add memory and disk space when you are ready for it. I used (and am still using) a host called Rimuhosting - their pricing is competitive, their support people are very responsive, and they offer a pretty good menu of options including dedicated servers.

Jess and I started fooling around with ideas and code back in January. It would have been silly and expensive for us to build out our systems before we needed them, so we started with a small VPS. (Okay - so Ravelry wasn’t really in our living room when we started - it was sitting on a machine in Texas.)

Eventually, we outgrew the VPS and moved up to an inexpensive leased dedicated server with the same host. When the dedicated server started to feel pokey because of limited resources, we knew that it was time to design and build our “real” setup. At this point, Ravelry was out of the starting gate and we weren’t afraid to invest money in infrastructure.

growing up

I knew that I wanted to own and manage our own servers and host them in a conveniently located datacenter. We live in Boston, so it wasn’t too hard to find a good datacenter nearby. We ended up getting a half cabinet with a company called Hosted Solutions because they were convenient, friendly, and not absurdly huge.

To get started, I decided that we’d need two big servers (ripe for virtualization, keep reading…), a firewall, and a gigabit switch. We overdid it a little - our budget was $10K and Dad had to give us a hand with the firewall purchase (thanks, Dad!). Still, I was very happy with what we put together. Here is what I was shopping for:

  • Inexpensive Linux servers with quality components and lots of options - I chose Silicon Mechanics.
  • Intel Xeons. I don’t know if AMD64 + Linux still comes with a few guaranteed headaches but I wasn’t about to find out. Plus, I really wanted to try out those dual quad core Xeons :)
  • Piles of memory. With many Mongrels and 2 MySQLs, we’ve got lots of mouths to feed. 16 GB is the current sweet spot.
  • Serial ATA RAID - 4 drives + RAID10 = speed and durability on a budget. Also, plenty of disk space. Serial ATA is cheap and running out of space is no fun.
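To make the RAID10 trade-off concrete, here is a small sketch of the capacity arithmetic: the drives are mirrored in pairs and the pairs are striped, so you get half of the raw space. The 500 GB drive size is a made-up number for illustration, not what is in these servers.

```python
def raid10_usable_gb(drive_count: int, drive_size_gb: int) -> int:
    """Usable capacity of a RAID10 array.

    Drives are mirrored in pairs and the pairs are striped together,
    so only half of the raw capacity is usable.
    """
    if drive_count < 4 or drive_count % 2 != 0:
        raise ValueError("RAID10 needs an even number of drives, at least 4")
    return (drive_count // 2) * drive_size_gb


if __name__ == "__main__":
    # Hypothetical example: four 500 GB drives (size is an assumption).
    print(raid10_usable_gb(4, 500))
```

The array survives the loss of one drive in each mirrored pair, which is where the durability comes from; the striping across pairs is where the speed comes from.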

The other components were pretty basic - a Dell managed gigabit switch and a Cisco ASA firewall. Good, standard stuff. Little fuss.

Bob, stress testing the firewall and switch

avoiding those emergency trips to the datacenter

Do yourself a favor and splurge on IPMI or integrated KVM over IP. Sometimes this is built into an onboard card, sometimes it’s just a little extra add-on. It’s really nice to be able to have console access from home without bothering with more hardware, more wires, more crap.

Even better, my machines have this IPMI business which also provides remote access for power off/on, reset, and various sensors. Pretty much all of the management you need and served just the way you want it - over the network, no extra equipment, little extra expense.

It was especially convenient when I was setting the machines up in the datacenter. Just the ability to cable my laptop to a server for management without a keyboard and display was worth the added expense.
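For a feel of what that remote management looks like from a script, here is a rough sketch that builds command lines for `ipmitool`, a common open source IPMI client. It is an illustration only, not the setup described here: the host address and user name are placeholders, and it assumes the management card speaks IPMI 2.0 (the `lanplus` interface).

```python
from typing import List

# Power actions accepted by "ipmitool chassis power".
POWER_ACTIONS = {"status", "on", "off", "reset", "cycle"}


def ipmi_base(host: str, user: str) -> List[str]:
    """Common ipmitool arguments for talking to a management card over the network."""
    # -I lanplus: IPMI v2.0 over LAN (an assumption about the card).
    # -E: read the password from the IPMI_PASSWORD environment variable
    #     so it does not show up in the process list.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E"]


def power_command(host: str, user: str, action: str) -> List[str]:
    """Build the command that queries or changes the power state."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return ipmi_base(host, user) + ["chassis", "power", action]


def sensor_command(host: str, user: str) -> List[str]:
    """Build the command that lists sensor readings (temperatures, fans, ...)."""
    return ipmi_base(host, user) + ["sdr", "list"]


if __name__ == "__main__":
    # Placeholder address and user, for illustration only.
    print(" ".join(power_command("10.0.0.21", "admin", "status")))
```

Each list can be handed to `subprocess.run(cmd, check=True)` from any machine that can reach the management network.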

My IPMI console: not beautiful, but very useful

which flavor of Linux?

Gentoo. Portage (Gentoo’s package system) rules. I don’t want your crummy binary packages. I want the “configure/install from the source” flexibility with the easy upgrading, administration, and dependency handling that a package manager provides. Also, Gentoo’s init and filesystem layouts are nice.

Gentoo is my favorite Linux and I’m very comfortable with it. I encourage you to use your favorite Linux if that is what you are most comfortable with.

virtualization, baby!

My favorite part. I’m probably driving people nuts with the yammering about Xen and virtualization all the time. In short, I took my two big servers and turned them into 8 “just right” virtual servers. If you want to learn a little more about what virtualization actually is, check out the overview pages over at XenSource.

Okay, why virtualize? My own opinion:

  • Today’s servers are big - you will likely waste your resources if you run one “server” per machine.
  • A system installed directly on your bare hardware is going to be less flexible, more brittle, and more of a headache to maintain and manage.
  • Easy migration and duplication - if your machines are all virtual you can easily shuffle them around, duplicate them, and so on

I can’t stress enough how happy I am that all of our server instances are virtual. Management and upgrading are actually fun. Do it. Really.

Pass the Xen, please.

Xen is the king of virtualization. It is a fantastic open source product and it costs you nothing. Three really important things to think about before you start:

  • If you need to run any 64 bit virtual machines, Xen needs to be running on a 64 bit Linux and all of your VMs have to run on 64 bit kernels. Remember that you can conveniently run an otherwise all 32-bit Linux with a 64 bit kernel. This is what we are doing - 64 bit kernels everywhere but only 64 bit software on the MySQL hosts that need it.

  • USE LVM FOR YOUR DISKS. If you are virtualizing but sticking with traditional disk partitions or image files for your disks, you are losing out on a lot of flexibility and performance. Use LVM2 to set up your disks - the HOWTO is here. With LVM, you’ll have flexible disks that you can create, delete, resize, and snapshot as needed. Want to be able to duplicate one of your virtual servers while it is running and start up a new copy for testing or as a base for a new machine? You need LVM.

  • If you are doing it all from scratch (including building the kernel) like I did, you’ll probably render your machine unbootable a couple times. Have a boot CD handy. The Gentoo Xen HOWTO is quite handy.

I can’t say enough good things about how freeing it is to have a large group of virtual machines that are riding on Xen and LVM. I can move servers to other machines, upgrade software, and do all kinds of other world-changing things with ease.
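The "duplicate a running server" trick from the LVM tip can be sketched roughly like this. It is one possible sequence, not necessarily the exact procedure used here: snapshot the running guest's volume, create a fresh volume, copy the snapshot into it, then drop the snapshot. The volume group and volume names (`vg0`, `web1`, `web2`) and the sizes are hypothetical, and the copy is only crash-consistent because the guest keeps running.

```python
from typing import List


def clone_commands(vg: str, source_lv: str, clone_lv: str, lv_size: str,
                   snapshot_size: str = "2G") -> List[List[str]]:
    """Build the LVM2 commands that copy a running guest's disk to a new volume.

    lv_size must be at least the size of the source volume; snapshot_size is
    the space reserved for writes that happen while the copy is in progress.
    """
    snap = f"{source_lv}-snap"
    return [
        # 1. Freeze a point-in-time view of the running guest's disk.
        ["lvcreate", "--snapshot", "--size", snapshot_size,
         "--name", snap, f"/dev/{vg}/{source_lv}"],
        # 2. Create the volume that will back the new guest.
        ["lvcreate", "--size", lv_size, "--name", clone_lv, vg],
        # 3. Copy the frozen view block for block.
        ["dd", f"if=/dev/{vg}/{snap}", f"of=/dev/{vg}/{clone_lv}", "bs=4M"],
        # 4. Remove the snapshot so it stops consuming space.
        ["lvremove", "-f", f"/dev/{vg}/{snap}"],
    ]


if __name__ == "__main__":
    # Hypothetical names and sizes, for illustration only.
    for cmd in clone_commands("vg0", "web1", "web2", "10G"):
        print(" ".join(cmd))
```

The new volume still needs its own Xen guest config (and a changed hostname and IP inside the copy) before it can boot alongside the original.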

Xen’s “top” - this box is running 4 virtual machines: 1 web server, 2 app servers, 1 slave MySQL

bandwidth: our biggest expense

Here are a few tips for not wasting money on bandwidth:

  • Shop around! You’d be surprised how much prices can vary. Make sure to be aware of the reliability/redundancy behind the bandwidth that you are purchasing. In-datacenter bandwidth is often a “safer” blend of several different carriers and that security comes at an added expense.

  • Take full advantage of Amazon’s S3 storage service! If you are hosting any large images on your site, you should seriously consider moving them to Amazon. Take a look at the pricing and think about how it would work for you. It is really cheap.

  • Divide things up logically and don’t skimp on hostnames/virtual hosts. If you separate all of your static content hosts now (even if the names all point to the same machine) you’ll have an easier time fiddling with things later. Maybe we went overboard with avatars.ravelry, images.ravelry, images3.ravelry, assets.ravelry, creative.ravelry, etc… but the division has made things a lot easier for me. As an example, we’re looking at buying some cheaper in-datacenter bandwidth from a single carrier and if we decide to do it, it will be trivial for me to offload a bunch of those hostnames to the new connection. It also makes life easier when it comes to usage reporting on all of these different types of resources because the log files are already split out.

  • Later, I’ll write something about stylesheet and javascript minification, gzip compression, cache-related headers, etc etc. Without considering these things, you may be throwing bandwidth away (not to mention slowing down your users).
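As an illustration of the usage-reporting point in the hostnames tip, here is a minimal sketch that totals bytes served per hostname when each virtual host has its own access log. It assumes the logs are in the common or combined log format (status code and response size right after the quoted request); the hostnames and log lines in the example are invented.

```python
import re
from typing import Dict, Iterable

# Matches the status code and response size that follow the quoted request,
# e.g.  ... "GET /a.png HTTP/1.1" 200 5120
# A size of "-" means no body was sent (for example a 304 response).
_STATUS_AND_BYTES = re.compile(r'" (\d{3}) (\d+|-)')


def bytes_served(lines: Iterable[str]) -> int:
    """Sum the response sizes found in one virtual host's access log."""
    total = 0
    for line in lines:
        match = _STATUS_AND_BYTES.search(line)
        if match and match.group(2) != "-":
            total += int(match.group(2))
    return total


def usage_by_host(logs: Dict[str, Iterable[str]]) -> Dict[str, int]:
    """Map each hostname to the total bytes its log says were served."""
    return {host: bytes_served(lines) for host, lines in logs.items()}


if __name__ == "__main__":
    # Invented hostnames and log lines, for illustration only.
    sample = {
        "images.example.com": [
            '1.2.3.4 - - [16/Dec/2007:20:55:00 -0500] "GET /a.png HTTP/1.1" 200 5120',
            '1.2.3.4 - - [16/Dec/2007:20:55:01 -0500] "GET /b.png HTTP/1.1" 304 -',
        ],
        "avatars.example.com": [
            '5.6.7.8 - - [16/Dec/2007:20:56:00 -0500] "GET /u/1.jpg HTTP/1.1" 200 2048',
        ],
    }
    for host, total in usage_by_host(sample).items():
        print(host, total)
```

In practice each value in the dictionary would be an open log file rather than a list of strings; iterating over a file object yields its lines the same way.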

any questions or comments?

I guess that’s it for server/network related stuff! If you have any questions that are related to the things that I wrote about (curious Ravelry users - I’m talking to you) leave a comment and I’d love to answer in the next post.

useful links!

Next time: I might talk about how I monitor and manage all of these virtual machines :) It’s not much work at all and the little ecosystem of management software is pretty interesting.

Comments (16)

  1. Suzanne wrote:

    I understood…NONE of that. But it looks cool!

    Sunday, December 16, 2007 at 8:55 pm #
  2. LOVE Ravelry! This blog is such a great idea!

    Just curious, what was your introduction to Linux and why you chose Gentoo over other source based distros? What Linux websites (if any) do you peruse?

    Sunday, December 16, 2007 at 9:00 pm #
  3. denise wrote:

    My son(18,codemonkey-type)gives you his seal of approval…heh, he thinks we should give him $10K to set-up too. I am glad this blog gave us some bonding time together in Ravelry.

    Sunday, December 16, 2007 at 9:14 pm #
  4. denise wrote:

    PS - He is a big fan of Gentoo, whatever that may be.

    Sunday, December 16, 2007 at 9:15 pm #
  5. Casey wrote:

    @Warrior Knitter:

    I started running Slackware Linux in college because we used GNU stuff (gcc, etc) for our programming related coursework. I stuck with Slackware for a while and then hopped around unhappily (tried Debian, used Fedora for a while) until I found Gentoo several years ago. It’s really important that I have access to fresh (newly released), configurable packages that don’t drag in loads of unnecessary dependencies.

    I know of a few small source based distro projects, but Gentoo is mature and has the biggest and most active developer and user communities as far as I know… Maybe I’m forgetting something?

    I don’t read many Linux specific sites or blogs - I tend to keep an eye on http://del.icio.us/popular/linux and I used to read http://rootprompt.org regularly. Most of the tech blogs I read are related to Linux in some way (as opposed to Microsoft or Sun), so I usually get news and stuff indirectly.

    Sunday, December 16, 2007 at 9:25 pm #
  6. Kathy wrote:

    I feel cool just knowing what SQL stands for.

    Sunday, December 16, 2007 at 10:12 pm #
  7. Betsy wrote:

    Very nice, very smart. So does that mean that all the photos imported from Flickr are hosted on the Amazon S3? Just took a peek at Amazon’s Design Principles. Nice, nice, nice, because it’s all about usability. (I added my husband’s website, as mine is rarely updated)

    Sunday, December 16, 2007 at 11:08 pm #
  8. ashpags wrote:

    Wahoo! A Code Monkey blog! =)

    I am a Linux n00b; I use Redora when I write Fortran & C codes for class, but that’s about it. I can’t wait to learn all kinds of neat things from this blog.

    Hmm…how many times can I tell you that you’re awesome? Because you are. Thanks, as always! =)

    Monday, December 17, 2007 at 12:27 am #
  9. Casey wrote:

    @Betsy:

    So does that mean that all the photos imported from Flickr are hosted on the Amazon S3?

    Nope - at the moment they stay at Flickr. I wish that we had the photos here, but we’re trying to save money. Flickr has policies about copying/storing images and I have to look into how/if they apply when the owner of the images has given permission.

    Monday, December 17, 2007 at 1:01 am #
  10. ashpags wrote:

    Yeah, that’s supposed to say Fedora. I blame it on the fact that I just finished finals, so my brain is still a bit fried! ;)

    Monday, December 17, 2007 at 2:29 am #
  11. Adam Jennison wrote:

    Hi Casey, i have followed the creation of Ravelry for some time (my wife is a BIG fan and has had an account for some time…). I like your set up and your advice on servers, VPS’s etc it all makes good sense.

    My question is really based around the programming of Ravelry. Do you use Ruby and Rails and if so why? Was it a forced issue (scalability?) or was it more personal?

    I for one salute and wish i had started up something like it….

    :o)

    Keep up the good work - though you do give me a head ache as i have to spend the time taking pcitures for my wifes stash!

    Monday, December 17, 2007 at 5:06 am #
  12. eclipse wrote:

    awesome! can i call you guybrush threepwood?

    Monday, December 17, 2007 at 10:00 am #
  13. Casey - you are doing an awesome job with Ravelry! It’s interesting to read what’s under the hood, but I think my favorite geeky thing about the site is your integration with other sites like Ravelry and user blogs. I’m assuming your using some type of web services to do this. I love the fact that you are integrating with other sites, which obviously makes your site more efficient since you are not reinventing the wheel for each function like MySpace does with its proprietary blogs, etc. I also love the interface - looks lovely and works intuitively with Ajax-like controls. In short, great job - keep up the good work! I heart ravelry. :)

    Monday, December 17, 2007 at 11:50 am #
  14. Mary wrote:

    It all seems very sleek and streamlined. I work in healthcare I.T. and you could teach them a thing or three. Healthcare systems are nothing but cumbersome, antiquated, sloggy, buggy headaches….

    Monday, December 17, 2007 at 12:41 pm #
  15. V. nice! I was actually going to ask if this had been put up, but didn’t want to create more work for you. How about the graphics/pretty side - Kudos for the green and white, and how did you decide to roll on your templates? Have you always been a Linux snob or did you start someplace like Amiga?

    Besides the tinkering for fun, what issues are ongoing, and what is being planned for expansion?

    Monday, December 17, 2007 at 3:11 pm #
  16. jennward wrote:

    Casey, thanks for the explanation! As an IT/web person by day, I’ve long wanted to know the details behind ravelry.

    Monday, December 17, 2007 at 11:01 pm #