Here’s why I do: it’s time for me to build a new master database server. Our current main slave is too underpowered to be handle our entire load in an emergency, which means that our failover situation isn’t that great. I’ll replace the master with something new and shiny, make some performance improvements while I’m at it, and the old master will work just fine in an emergency.
For IO intensive servers, I conserve space and electricity by using 1U machines with 6 or 8 2.5″ drives.
I’d normally buy 8 Seagate Savvio 15K SAS drives and set them up as a RAID 10 array. This would run me about $1850.
We’re pretty frugal when it comes to our technology budget and I can’t really stomach spending that kind of money to effectively get 550 GB of redundant, fast magnetic disk storage. SATA MLC SSDs that blow traditional drives out of the water are currently under $2 / GB.
This is a collection of information that I’ve used to inform my decisions. I don’t know what I’m doing, so I don’t want you to take my word for it (seriously) – I’m just hoping that this collection of links will be useful to some people.
Also, this plan might make no sense to you depending on your situation. We buy 1 or 2 servers a year and saving a thousand dollars is a big deal to us.
One more thing: Today is May 9th, 2011. The SSD universe is expanding quickly and this post will likely be obsolete in a matter of months.
Should you buy RAM instead?
Yes. Increasing the size of your InnoDB buffer pool is the best way to speed up MySQL. If you can add more RAM, do it.
It costs $1000 to buy 48 GB of RAM. If your working set (your hot data) can fit into RAM, you probably don’t need to bother with SSDs at all.
Which SSD to choose?
The Intel 320 Series.
It’s an MLC based, Serial ATA SSD in the 2.5″ form factor.
- Same price as magnetic disks: the 300 GB version is $540. This is the same price (!!) as the soon to be available Savvio 15K.3 300 GB SAS hard drive.
- Same level of relability as magnetic disks: This drive is more reliable than the X25-M which I have had great success with. Check out this marketing slide with failure rates from over 1 million deployed X25-M units.
- This device includes power-loss data protection. Many SSDs can lose the data that is in the process of being written in the event of a power loss. This is very bad. (Intel PDF link)
- This is the 3rd generation of a proven piece of hardware. I feel very comfortable choosing this Intel device and feeling comfortable is good.
- Intel has published spec information for server applications that addresses my write endurance concerns. The biggest problem with using MLC based SSDs is that the number of writes that they can handle over their lifetime is drastically smaller than what an SLC can do. This specification PDF gives you some information as well as documentation on the SMART attributes that you can use to predict the life span of the drive given your load.
You might say…
Q: MLCs are bad, shouldn’t you be using an SLC drive like the Intel X25-E.
A. The X25-E is $10 a gigabyte. Even if I weren’t trying to save money, I can’t see spending that much on a 3 Gbps Serial ATA drive. This MLC would be a bad choice for me (vs SLC) if it wouldn’t be able to meet my write needs… but it will.
Check out the PDF mentioned in the last item above. As an aside, Intel’s strategy with this drive is a little strange – they clearly intend for it to be used in server environments but it doesn’t appear to be marketed that way.
Q. Why not choose a PCI solution like Fusion-io ioDrive or Virident TachIOn?
A. The Virident TachIOn looks like a fantastic piece of hardware – check out this benchmark/post on MySQL Performance Blog.
Unfortunately, the 400 GB TachIOn is over $13,000 and I’d probably need the 600 GB model. These are for people who have a serious need for hundreds of gigabytes of persistent flash storage. If this were my only good option, I’d stick with 15K SAS disks.
Q: Why Intel? There are much faster SSDs on the market. The 320 isn’t even a 6 Gbps drive
A: I’m not convinced that it matters much for currently available versions and forks of MySQL, but even if it did – these other vendors can’t really touch Intel’s reliability. I researched the market before this Intel drive was released (> 5 weeks ago) and I had tentatively decided that I wouldn’t be able to use SSDs at all.
Your RAID controller might matter
Check out this slide from Yoshinori Matsunobu’s (highly recommended!) “Linux and H/W Optimizations for MySQL“
Scary, huh? I tend to use LSI’s low profile battery-backed MegaRAID controllers. I did a little looking to make sure my usual controller of choice (LSI 9261) would still be a good choice and I found something interesting. For about $100, I can get a software upgrade called “FastPath” that improves performance when RAIDing SSDs.
Neat. I haven’t really heard about this product much and it could be snake oil for all I know but I figure that it’s worth a try.
LSI has a fancy new controller that claims it doubles the IOPS of the 9260s when used with SSDs. It’s about $700 so by the time you add battery backup and the FastPath software, you’ve paid $1000 for a RAID controller. Too rich for my blood, I’ll stick with the 9261 for now.
The Hardware Configuration
Here’s what I’ve purchased:
- Supermicro 1016T – my favorite server building block. 1U, eight 2.5″ drives, Xeon 5600, 12 DIMM sockets, redundant power, built in management
- 6 x Intel 320 SSDs configured in RAID 10. SSDs aren’t mechanical, but they can still fail.
- 2 x Seagate Savvio 15K SAS drives Yup! Trusty Savvio SAS drives – these guys are going to handle the bulk of the sequential writes (doublewrite buffer, transaction logs) since that’s what they do best.
- The LSI 9261 with FastPath software upgrade
I’ll keep this short:
- Use the latest kernel
- Do not use the default IO scheduler. CFQ is a little slower with MySQL on regular disks as it is! Change the scheduler to
- Use the best filesystem that you can (XFS or EXT4) Someday we’ll have ZFS or SSD optimized filesystems on Linux, but we don’t have either today.
- You probably want to disable your filesystem’s access time and write barrier /write through options (
If you are going to run MySQL on SSD drives you MUST use Percona’s XtraDB
I usually recommend MariaDB but they do not yet have a release that builds on MySQL 5.5. I may use Percona Server myself.
Put your logs and your doublewrite buffer on traditional hard disks
These files are sequentially (and heavily) written – they are not a good fit for SSD and they are a good fit for traditional disks. The doublewrite buffer location is configurable in Percona Server. The setting is:
Here’s a longer explanation with benchmarks: http://yoshinorimatsunobu.blogspot.com/2009/05/tables-on-ssd-redobinlogsystem.html
…and here is the TL;DR
Tune InnoDB internals
innodb_adaptive_checkpoint = keep_average
innodb_flush_neighbor_pages = 0
You’ll probably want to tune
innodb_io_capacity as well:
Um… Yeah, I’m just going to buy more RAM instead
I’ll leave you with recommendation: consider trying out SSDs on for slave databases. They are especially awesome for backup slaves because any old box and a single SSD may be fast enough to replicate your MySQL database. It’s really nice to have a slave that is dedicated to only backups and with SSDs, doing this is so cheap that there are few reasons not to.