Our Internet Office

Like many little web businesses, our staff doesn’t all work in the same office. Jess and I work from our office or our home in Boston, Mary-Heather works from Albuquerque, and Sarah works from Houston. There are lots of amazing tools that help us work together as a team and feel more connected and less isolated. Here are some of the things that we can’t live without:

Campfire and Propane - team chat

What would we do without team chat? Campfire is the web-based group chat that keeps our team connected throughout the day. With chat, we don’t feel like we are working alone, thousands of miles apart.

There are many web-based chat services but we like Campfire. It works well for our needs and there are some great ways to extend it. Propane is an excellent tabbed native chat client for Macs that we use when we are at our own computers. Ember is a Campfire client for iPhone… and we integrate Campfire with the rest of Ravelry through a Ruby API (Tinder) and built-in connections from our help desk / email support software.

Communicating without distracting

Unlike IM, team chat gives us a communications channel that doesn’t always demand immediate attention. We can leave notes for others to see when they have a chance, jump into a full-on chat or meeting when we need to, and page one another visually by taking advantage of the Growl notifications that Propane offers.

Email bad

The last thing that we need is more email. Chat has given us a way to share information and discuss things without adding to our already-overwhelming email load.

Integrating notifications

Ravelbot, our automated chat robot, pops in to let us know about things that have occurred in Ravelry or in our customer service email that we may want to know about. We don’t want to have to check screens of important information or run reports but we also don’t want to deal with email notifications.

By using the different APIs that Campfire offers, we can connect our own software to chat so that we know about important events as they happen.
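As a rough illustration of what pushing an event into chat can look like, here is a minimal Ruby sketch that builds an HTTP request to post one line of text to a room. It is not Ravelbot's actual code. The URL shape, the JSON body, and the token-as-username authentication reflect my understanding of the classic Campfire REST API and should be treated as assumptions; the method names build_speak_request and notify_chat are made up for this example.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Build (but do not send) a request that posts one line of text to a chat room.
# Assumption: the room accepts POST /room/<id>/speak.json with a JSON body of
# the form {"message": {"body": "..."}} and HTTP basic auth using an API token.
def build_speak_request(subdomain, room_id, token, text)
  uri = URI("https://#{subdomain}.campfirenow.com/room/#{room_id}/speak.json")
  request = Net::HTTP::Post.new(uri.path)
  request.basic_auth(token, 'X') # token as the user name, placeholder password
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate('message' => { 'body' => text })
  [uri, request]
end

# Actually send the message. Only call this with real credentials.
def notify_chat(subdomain, room_id, token, text)
  uri, request = build_speak_request(subdomain, room_id, token, text)
  Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
end
```

Keeping the request construction separate from the network call makes it easy to check what would be sent without talking to a real chat service.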


Skitch - easy screen capture sharing

We love Skitch so much. A screen capture application doesn’t seem like a big deal but there are a few features that make it indispensable to us:

Upload your capture in one click. We share screenshots of things in chat so much that it really is important that it is easy. Talking about things is so much faster when you can immediately share an image from the site, someone else’s site, email, twitter, whatever.

Upload to your own host. Since Skitch can upload any image (not just a screen capture) in one click, we save a lot of time by using it to store images that we post on the site, in forum posts, in blog entries, and in other places. When we do these things, we definitely want to store the images on our own server for more permanence and fewer bandwidth concerns. Skitch can upload to any host via SFTP, FTP, or WebDAV.

Easy annotating and drawing. Once you’ve got an image, it’s really easy to add arrows, text, draw, do whatever you want to do before you share with someone else.

Our Campfire chats are peppered with screenshots and annotated images and it really makes communicating a lot easier and a little more fun.


Dropbox - file sharing that feels local

Shared folders - sometimes you need them. Dropbox is the best file-sharing application that I’ve ever used. The client works on Windows, Mac, Linux and provides you with a real shared folder on your machine that looks and acts like a normal local folder. The web site works well when you aren’t at your own computer. There are RSS feeds and Growl notifications for updates. There is a revision history in case you need to recover previous versions of a file…

Most importantly, the syncing in the background is transparent and seamless. A copy of your files is stored on each machine that is configured to use Dropbox and the application takes care of the syncing for you in the background. If you aren’t connected to the Internet, you still have your files, and if you make changes they are synchronized when you next connect.

Email - Helpspot

Gmail is one good way of easily sharing general email boxes and linking to individual messages if you want to talk about them. We started out using shared IMAP accounts (regular email accounts) because all of us have mail clients that we prefer to Gmail - Pine for me, Apple Mail for Jess and MH. As we’ve grown, it has become harder to manage the volume of email and harder to keep track of what is happening with all of the individual users and other people that we communicate with.

We now use Helpspot for most of our general mailboxes. Helpspot is a web-based email app that is aimed at customer support. Helpspot helps us work together in a few ways:

  • We can share links to messages when we need to talk about them
  • Much like Gmail conversations, we can see history so that we can easily pick up on an existing conversation
  • I’ve integrated the Helpspot email with Ravelry itself so we can all see email histories in relevant places - this helps because a previous conversation that someone else had with a person may be important and this way we can know what is going on without asking (or discovering later)
  • Helpspot has a Campfire API that we use to automatically push certain messages to chat so that we can all see and talk about certain types of issues/questions

…and I ran out of steam. I’ll just post this now before I relegate it to the draft pile. I suppose I don’t really need to talk about Google Docs or Skype anyway :) Hopefully someone will find these things useful or interesting.

Is there anything that your office can’t live without?

edit One little thing - we’ve been using oovoo for 3-way video chats. iChat seemed too bandwidth intensive and Skype only does 2-way video. oovoo has an ugly client but the service works pretty well.

Tasty Links / April 2009

A collection of interesting and useful stuff that I collected during March - April. I’ve added a few comments to things that I’ve tried out or looked at further…

  • Ruby GC Tuning - handy GC tuning info with performance notes from Evan W. at Twitter. I adjusted part of Rav and got big response time improvements and CPU usage improvements that were in line with the numbers that Evan posted. I posted a little graph in his comments.
  • Tokyo Cabinet and Tokyo Tyrant - a super fast key/value store and memcached-speaking server component (respectively). This came out of Japanese social network mixi.jp. Tokyo is file-backed but is nearly as fast as memcached for most needs. We’ve been using this for the last couple weeks to store long-lived cache data that is a little big for memcached. See also this presentation on Scribd and Plurk’s built-on-Tokyo LightCloud.
  • Facebook’s growing infrastructure spend - really interesting post from Niall Kennedy. The numbers showing Facebook users vs % of people with internet access per country are staggering.
  • Kindle 2 stuff - Jess recently got a Kindle and she is addicted to it. Some neat tricks: change the screen saver, built-in conversion of PDFs and epubs
  • For all Mac users who are interested in Ruby - check out the 2009 Rubyist’s guide to a Mac OS X development environment from the robots over at Thoughtbot.
  • Sly - I haven’t really been following the Javascript selector speed wars lately but this thing looks crazy fast (well, circa April 2009 crazy)
  • Speaking of crazy fast - the new Intel Xeon 5500 Nehalem processors look pretty awesome. Faster, lower power consumption, DDR3 memory with a faster bus, Turbo Boost (appears to be an auto overdrive/overclock?), Hyperthreading is back. This is a real upgrade and not just more cores and more GHz. I can’t turn up any nicely done benchmarks right now but here is an overview, PDF data sheet is here, and the Silicon Mechanics configurator will give you some idea of how much these things cost.
  • OmniGraphSketcher - finally, nice Mac OS software that you can use to turn raw data into nice looking graphs. I’ve only tried it on simple things so far.
  • Realtime Twitter Search Results for Google - this Greasemonkey extension inserts Twitter search results into the top of your Google searches. Install it immediately. The realtime aspect changes everything - having this information inserted into my normal googling has been immensely useful and interesting.
  • Not purely technical - but kind of amazing. Individual musical performances in YouTube videos layered to create new audio & visual compositions. I’d call them something simpler like “mash ups” if I could bring myself to use that phrase ;) THRU YOU | Kutiman mixes YouTube.

That’s it for this month!

Quick update: Ravelry runs on

Just a little something to get the ol’ blog rolling again.

Last March, I wrote a tiny post that listed a few of the pieces that run Ravelry along with my opinion of each. Here is an update to that list. The biggest change? I’ve been running Phusion Passenger + Apache as application servers for the last several months.

As of March 2009, Ravelry runs on….

  • Ruby on Rails 2.2: Still love it.
  • Phusion Passenger: Replaced Thin/Mongrel. Passenger plus Ruby Enterprise Edition uses less memory, avoids memory leaks, and it spawns and retires Ruby processes as needed. I also like that you can get a fair amount of status and debugging information if you need it. It looks like Passenger is the new standard for Ruby deployment. The only downside? Hot deployments of new versions of the app were causing lots of trouble for me (Google Groups thread here). I ended up putting together a Capistrano script that removes each Passenger server from the load balancer, updates and stops/starts Apache once it is cold, puts it back into the loop, and moves on to the next app server.
  • nginx: Still rocks. fast, memory efficient, stable, trouble free. We’re running two now.
  • haproxy: Still rocks. I’m using it to load balance HTTP requests across Apaches and provide failover for search and another service.
  • Sphinx: Still love it. If your site has any search needs, you’ve got to try it.
  • beanstalkd: beanstalkd + a few simple Rails runner script daemons are my new solution for running background jobs and processing file uploads. This is a huge improvement over backgroundrb: much simpler, much more stable. I’ve never had a problem with beanstalkd. Background jobs used to be a very brittle part of our architecture and now I rarely think about them.
  • Percona MySQL 5.0.x build: MySQL 5 has been rock solid for me. MySQL 5.1 was not. I’ve been running MySQL 5.1 on my slaves and I’ve found that they do a better job of picking up replication after a restart or other problem. Although I’d love the fast index creation feature that the InnoDB plugin provides, I have no plans to upgrade the main database to 5.1. However, I would like to bump up to one of the newer OurDelta MySQL 5 builds.
  • Other stuff: I’m still very happy with Nagios for monitoring, Munin for performance graphs, Postfix for mail, Xen for virtualization, and Gentoo as my Linux of choice.
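
The one-server-at-a-time deployment described in the Passenger entry can be sketched in plain Ruby. This is not the actual Capistrano script. The three callables stand in for whatever really talks to the load balancer and to Apache, and every name here is hypothetical.

```ruby
# Roll a restart through the app servers one at a time so that the rest
# keep serving traffic. The disable/restart/enable callables are stand-ins
# for real operations (for example, Capistrano tasks).
def rolling_restart(servers, disable:, restart:, enable:)
  servers.each do |server|
    disable.call(server) # take the server out of the load balancer
    restart.call(server) # update the code and stop/start Apache while it is "cold"
    enable.call(server)  # put the server back into the loop
  end
end

# Example run that only records the order of operations.
log = []
rolling_restart(%w[app1 app2],
                disable: ->(s) { log << "disable #{s}" },
                restart: ->(s) { log << "restart #{s}" },
                enable:  ->(s) { log << "enable #{s}" })
puts log
```

If a restart raises an error, the loop stops at that server: it stays out of the load balancer and the remaining servers keep running the old version, which seems like the safer failure mode for this kind of script.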

Updated to add a little diagram that shows how the web/application tier fits together.

Upgrading from Rails 2.0 (ish) to Rails 2.2

Look out - this post is going to be a bit dry.

Over the Thanksgiving weekend, I upgraded Ravelry from a crusty old nearly-2.0 version of Edge Rails to the latest 2.2.2. Other people’s upgrade war stories were really helpful so I figured that I’d post my own. (See also gusg.us’s 2.2 story, assaydepot’s 2.1 story)

Here are the things that had to be fixed or tweaked:


environment.rb
I moved a bunch of stuff out of here into the config/initializers directory. Much neater. I also config.gem‘d my gems.
cache_fu plugin
Broken. Upgrading to the latest source from github made it all better.
memcached gem
I use Evan Weaver’s juiced up memcached client. Upgrading to the latest source from github made it all better.
xss_terminate plugin
I’m using xss_terminate for sanitizing input and I needed to add a little patch to an initializer to make the plugin work properly: http://code.google.com/p/xssterminate/issues/detail?id=3#c0
exception_notification plugin
Wasn’t working right, but I’m using HopToad now so I just trashed it.
acts_as_versioned
I read that weird stuff would happen with older acts_as_versioned and the new dirty attributes/partial updates. Upgraded to the latest source. Unfortunately, the newer version seems to handle version columns differently. I was previously using “acts_as_versioned :version_column => 'lock_version'” to base the “version” column in the versioned table off of the model’s “lock_version”. Now it wants both to be called “lock_version”. I just patched it for now.
RFPDF plugin
Broken. Changed init.rb from ActionView::Base::register_template_handler 'rfpdf', RFPDF::View to ActionView::Template::register_template_handler 'rfpdf', RFPDF::View.
rails_asset_id in ActionView::Helpers::AssetTagHelper
Gone. Reworked some helper code - I didn’t need it anyway.
attachment_fu
Broken by ActiveRecord changes. Here is the fix: http://ar-code.lighthouseapp.com/projects/35/tickets/25-edge-callback-overrides-inconsistently-pass-arguments
active_form
More small breakage. Here is the fix: http://github.com/valda/active_form…
has_many_polymorphs plugin
Oh god - has_many_polymorphs was a nightmare (as expected). I upgraded to the latest and then had to fix broken stuff for a while. I don’t know what version I was using before but stuff changed. ack.
prototype.js
Sadly, I’m still using a pre 1.6 version of Prototype and now was not the time to upgrade that as well ;) I had to implement Element.insert in Javascript because the RJS stuff was changed to use that instead of Insertion (which was deprecated). You can see the change that I’m referring to here: http://github.com/rails/rails/commit…

That’s it! Also, HopToad was a great help in finding a few small bugs and glitches once I pushed the upgraded version out to production.

Now what?

I’m going to go back to fun Ravelry-improving work for a while. Still, I’d like to move to Git. …and maybe take the latest JRuby out for a test-drive. The total amount of memory that our Mongrels (Thins) consume is pretty embarrassing and things are heating up in the JRuby world with faster releases, a more threadsafe Rails, and so on…

Toolbox photo from Flickr/mamabarns .

Tracking down some application problems with Ruby on Rails, MySQL

I took care of a couple long standing issues this week and I wanted to quickly talk about them.

Tracking down Ruby/Rails memory leaks

Memory leaks are no fun. We sprouted a big leak about a month ago: Ruby VMs would get all bloated, often hitting 300-400 MB and sometimes consuming even more ridiculous amounts of memory.

I run about 20 instances on each virtual server and a bunch of misbehaving instances could quickly eat up all of the memory. Because restarting my mongrels (Thins, I mean) when they consume too much memory is pretty painless, I put off fixing it for a while.

This week, I tried using bleak_house to figure out what sorts of objects were responsible for my VM bloat. I found nothing.

Then I learned something really important about Ruby’s garbage collection and heap maintenance. I don’t know how I missed out on learning this earlier. I’m going to quote a bit from hacking on ruby’s garbage collector at lloydforge.org since he explains it well.

  • First important thing: “When ruby runs out of heap space, it first does a GC run to try to free something up, and then allocates a new heap. the new heap is 1.8 times larger than the last.”
  • Second important thing: “Because of the way ruby works, objects may _never_ be moved around in heaps. That means from the time they’re allocated to the time they’re freed they may not be moved to a new memory address”

You see where I’m going with this? Force ruby to build bigger and bigger heaps and you’ll probably never get that memory back. A single non-garbage object sitting on one of these heaps will keep it alive and prevent you from getting that memory back. There is no heap compaction.
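
To get a feel for how quickly that growth rule adds up, here is a small Ruby sketch that applies the quoted 1.8x factor until the heaps can hold a given number of objects. The starting size of 10,000 slots is an assumption chosen for illustration, and the method name heap_sizes is made up.

```ruby
# Sizes of the heaps allocated under the rule quoted above: each new heap is
# 1.8 times larger than the previous one. first_heap_slots is an assumed
# starting size, not a measured value.
def heap_sizes(objects_needed, first_heap_slots: 10_000, growth: Rational(9, 5))
  sizes = []
  total = 0
  next_size = first_heap_slots
  while total < objects_needed
    sizes << next_size
    total += next_size
    next_size = (next_size * growth).floor # exact arithmetic, rounded down
  end
  sizes
end

sizes = heap_sizes(1_000_000)
total = sizes.inject(0) { |sum, size| sum + size }
puts "heaps allocated: #{sizes.length}"
puts "largest heap:    #{sizes.last} slots"
puts "total slots:     #{total}"
```

Under these assumptions, holding a million objects at once takes eight heaps, and the last heap alone accounts for roughly 45% of all slots. One long-lived object landing in that heap is enough to keep it around after everything else has become garbage.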

I read about some of the work that the Phusion guys had done on their ruby fork called “Ruby Enterprise Edition” and noticed that they had added a handy debugging tool: ObjectSpace.statistics.

I installed Ruby EE, added some ObjectSpace.statistics logging to each request and ran it on one server for a half-day. Once I had my results, it only took a few minutes to spot the problem requests and fix them. Phusion’s Ruby EE prints out the number of objects, size of objects, number of heaps, and size of heaps so you can spot when a request is causing the creation of a new (large) heap. The culprit was just a stupid coding error in a rarely-hit action that loaded way too much data via ActiveRecord. The heap would balloon to a gigantic size and although the objects loaded would all become garbage, something prevented the memory from being given back.

This was a heap-related problem but I think that these ObjectSpace statistics would be equally valuable for finding the reference-related leaks. You probably wouldn’t be able to eyeball the data like I did but you could add up the growth in # of objects and bytes across a large number of requests and come up with a few suspicious actions. The really nice thing is that I was able to set up this logging in production without impacting the site in any way.
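
The add-up-the-growth idea could look something like the following Ruby sketch. It does not call ObjectSpace.statistics or parse its output. It assumes that each request has already been logged as a record with before/after object counts and byte sizes, and the Sample fields and the suspicious_actions name are hypothetical.

```ruby
# One record per request: the action name plus object counts and byte sizes
# sampled before and after the request. Field names are hypothetical.
Sample = Struct.new(:action, :objects_before, :objects_after, :bytes_before, :bytes_after)

# Sum the growth in objects and bytes per action and return the actions
# with the largest byte growth first.
def suspicious_actions(samples, top: 3)
  totals = Hash.new { |hash, action| hash[action] = { objects: 0, bytes: 0, requests: 0 } }
  samples.each do |s|
    entry = totals[s.action]
    entry[:objects]  += s.objects_after - s.objects_before
    entry[:bytes]    += s.bytes_after - s.bytes_before
    entry[:requests] += 1
  end
  totals.sort_by { |_action, entry| -entry[:bytes] }.first(top)
end

# Made-up sample data, for illustration only.
samples = [
  Sample.new('projects#show',  100_000, 100_200, 8_000_000, 8_010_000),
  Sample.new('reports#export', 100_200, 900_000, 8_010_000, 64_000_000),
  Sample.new('projects#show',  900_000, 900_100, 64_000_000, 64_005_000)
]

suspicious_actions(samples).each do |action, entry|
  puts "#{action}: +#{entry[:objects]} objects, +#{entry[:bytes]} bytes over #{entry[:requests]} request(s)"
end
```

Sorting by total bytes favors actions that are either hit often or grow a lot per hit. Dividing by the request count instead would surface rarely-hit actions like the one that caused my leak.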

Tracking down MySQL performance problems

If you are running MySQL, you should consider using one of the Percona patched versions: http://www.mysqlperformanceblog.com/2008/07/16/mysql-releases-with-percona-patches/

The Google SHOW INDEX_STATISTICS patch is great - you can use it as part of your index tuning and pruning because it will show you which indexes are being hit.

The microslow patch turned out to be an extremely valuable tool for finding problem queries. We have far too much query traffic to log all of it and MySQL’s “slow query” log only works at whole-second precision. In our case, we had some slow queries that were beating up the IO subsystem and they weren’t easy to spot until I brought up the patched MySQL and set the threshold to 300 milliseconds (instead of 1 second).

I ended up breaking up a very large table into an InnoDB table with all of the important information and indexes and a MyISAM table containing the blobs (text columns). I changed a few lines of code on the application side to make things work with the new split table, and voila:

okay. I’ve run out of steam…

In other news - we hit a milestone of sorts by nearly making the Alexa 10,000 this month. We’re at 10,032 ;)

Beta testing and beyond

Ravelry is still officially in “beta”. This word has been drained of most of its meaning in today’s web but here is what it means to me (and hopefully our users): the site is being very actively developed and things will change, unfinished “trial” features will be introduced, there will be bugs. In the last 365 days, there have been 241 releases where a new version of the code is pushed out to the production site.

As an aside - I’m looking into doing away with the beta label and making beta testing an opt-in thing for users who want to be guinea pigs. We’re approaching 150,000 registered users and not everyone realizes that they are often part of an experiment - this ends up creating more work for us in the form of questions (and sometimes, complaints).

Our users have been incredible - they report bugs of all kinds, come up with tons of great ideas and suggestions, enter things into our issue tracking system, and comment on existing issues. Here is how we do it:

Collecting feedback

Some sites collect feedback from beta testers with an email form. Don’t do this. All of your users need to be able to see and read the bug reports and ideas that everyone else is sending in. Working as a group is more efficient (less duplication), more productive (because small what-ifs can turn into great ideas through conversation) and more fun.

We set up one board in our forums that is specifically for talking about Ravelry itself. The board isn’t limited to feature requests and bug reports - it is a place for any kind of Ravelry talk like “Who has the biggest stash on Ravelry?”. In the last 30 days, 2300 people have posted on this board and 18000 people have read this board. hm… I guess I have to come up with some ideas to drum up participation - people who aren’t reading are missing out on their chance to shape the site.

Keeping Track

Our Ravelry discussion board contains 97,000 posts. We can’t use the board itself to help us track, categorize, and update all of the bug reports and suggestions. Even if we tried to do it by keeping the threads themselves excessively organized and moderated, we’d be sure to lose things.

So… we have an issue tracking system (The List) built in to Ravelry. Jess, Mary-Heather, myself, and 30ish other Ravelry users called “trackers” have the ability to turn posts in the forums into new issues in our system. When people post bug reports or suggestions we all try to either add them to The List or connect them up to existing entries. You can see that the post below has 1 agree vote and 1 disagree - sometimes the agree/disagree votes that are part of our forums can be helpful when we are looking at suggestions.

Now what?

When I am not actively working on them, the list items serve several purposes:

  • Users can search The List to see if their suggestion or bug report has already been brought up
  • Trackers can connect future forum posts to list items in case people have more or different information to share. This happens a lot and it is handy to have slightly different bug reports attached to bugs and further thoughts attached to suggestions.
  • All users can comment on items and I read these comments when I set out to work on something.

Although I have built some basic prioritization features into The List itself, I do all of my work-gathering and organizing outside of Ravelry. I tend to go into the list, gather a set of items that fit with what I am working on, and take those back to my virtual work area so that I can sort through them, read comments, and organize.

I’m really happy with this. I enjoy talking with Ravelers about the site and getting ideas and help from them. I’m also very glad that I don’t have to worry as much about losing good ideas and bug reports in a sea of posts. Ravelry Users: You can find the For the Love of Ravelry (FTLoR) board and a link to The List on the forums tab. As always - if you have ideas on how we can improve tracking and The List, we’d love to hear them. Just make sure to post them in FTLoR so that we can track them ;)

Friday notes

Just a few assorted things that I thought I would share:

MySQL 5.1 rc

I’m still having a problem with MySQL 5.1 hanging. I removed the innodb plugin as an experiment and now I’m pretty sure that it is a bug in MySQL. Not only is it suspicious that the same workload never caused deadlocks in 5.0, I’m pretty sure that either InnoDB’s deadlock detection or the fallback innodb_lock_wait_timeout should be kicking in.

I’ll probably downgrade to 5.0 soon. The before/after performance graphs will be interesting.

Reading

This week I read Clay Shirky’s “Here Comes Everybody” and Josh Porter’s “Designing for the Social Web“. Shirky’s book was great - try out some of his writings on media and community if you want a (free) taste. Porter’s book was more of an ego booster - sort of a catalog of lessons that we’ve already learned. If you are working on a web site that will have human users, do pick it up. The book is concise and well organized and you’ll get *something* out of it (I did).

Bookmarked this week

I had it coming…

Just to recap - various stuff that runs Ravelry and my opinion on each:

  • Ruby on Rails 2: Love it. Moving to 2.1 for the named scopes (and because I’m on Edge somewhere between 2 and 2.1)
  • Thin : stable. A little nicer than Mongrel.
  • nginx : rocks. (ie. fast, stable, does what I want, trouble-free)
  • haproxy: rocks (ditto).
  • Sphinx : rocks (ditto).
  • backgroundrb : the new and different version ain’t bad. Give it a try.
  • Ferret / acts_as_ferret : phased out. Kind of a pain, slow to reindex. The DRb business made things shaky at times (ie. ferret server goes down, badness ensues).
  • MySQL 5.1.24 (rc2) and the innodb plugin: argh. Has deadlocked for no apparent reason, killing processes doesn’t help, MySQL hangs, must kill MySQL :( This has happened 3 times in the same number of weeks. MySQL 5.0.something worked great for a year but I’m a hopeless magpie sometimes. On the plus side, MySQL 5.1 seems to recover faster and slaves don’t lose their minds during the kill/restart. heh. I’m blaming MySQL and not the plugin (just a hunch, I have no evidence). In any case, wait for the next release.

    Next time I’ll have to grab processlist and innodb status output so I can actually try to find/report the bug. Doh. I started up Maatkit’s mk-deadlock-logger, so hopefully that will catch it…

New Database Server

Well, I installed our new database server last night and so far things are looking good. The new box has way faster disks (4 SAS disks that are RAID10′d) and 32 GB of memory. You can’t scale up a database forever, but now I’ll actually be able to see through the smoke and tune my app more easily. We just didn’t have enough memory before and the database spent tons of time waiting on the disks as data was moved in and out of the InnoDB buffers. Now we’ve got 1 good database machine, so the next architectural upgrade might be moving the read-only slave to another good machine. (pssst - great scaling presentation over here)

There was a lot of shuffling involved in the move. I set up a brand new VM, dumped/restored the database, moved 2 application virtual machines from our smallest box to the space previously occupied by the database, moved the slave database and search engines (Ferret, Sphinx) to the small box, and then redistributed all of the memory and started it all back up. Have I mentioned that I love Xen? :)

A few tools that made my life much easier:

  • Maatkit - the parallel dump and restore tools are very fast and it is easy to monitor progress. The majority of the downtime during the move was going to be during backup/restore and Maatkit saved us precious minutes.
  • dd over netcat - an easy low tech way to move my machine images between physical servers. I also made use of ReiserFS’s filesystem resize tools so that my disk images were as small as possible. Google for “dd netcat”.
  • mysqlbinlog - because I forgot to check to see where I wanted the slave database to start replicating from.
  • …and thanks to Google and a blogger named Bruno for helping me get my Xen dom0 memory back without rebooting.

Also, I’m now running MySQL 5.1 (the newest release candidate) and the new InnoDB plugin. The plugin replaces the version of InnoDB that ships with 5.1 and offers additional features such as online index updates (woohoo!).

A huge thank you to all of our users - this machine was 100% paid for with donations. Thank you so much everyone! (..and also thanks for being patient during the last 4 weeks or so)

MySQL 5.1 (finally?)

After years of development, it appears that MySQL 5.1 is going to be released at the MySQL conference next week.

The timing is perfect. Our new database server is arriving on Monday and there hasn’t been a new release candidate since January.

5.1 is supposed to be faster. I’m sure that the web will be awash with benchmarks but I’m still going to run some sysbenches on the new machine just for fun. I’ll post whatever I come up with. In addition to performance improvements and bug fixes, MySQL 5.1 has some new features (like partitioning). There is a short “What’s New?” page over at mysql.com: http://dev.mysql.com/doc/refman/5.1/en/mysql-nutshell.html.

update Doh. Looks like no GA release until June. At least there is a new release candidate.