How 7 Mongrels Handled a 550k Pageview Digging

Posted on January 07, 2008

Update 3/24/08: Fark/reddit strikes, resulting in a 920k pageview day (same setup, just 7 mongrels).

Update 1/13/08: The site was on Digg again, this time receiving 450k views over the course of just 12 hours. That brings it up to 12 requests / second. Again, this is not a benchmark, just the actual traffic. CPU usage was never a problem; more than likely the app is RAM bound. Oh, and the $400 servers handle many other sites. To me, it’s more important not to have to think about servers than to pinch pennies in an attempt to squeeze every last ounce of performance out of machines. Just my $.02.

Over the weekend, this New Christian Science Textbook comic made it to the front page of Digg.

The above traffic graph should give you a sense of the kind of traffic that Digg can drive. It was front-paged sometime early Sunday morning. Chances are, if it hit during the middle of a weekday, the stats would be as much as 30-50% higher.

As you can see, more than 550,000 pageviews were logged over the course of 24 hours. Had this been over the course of 2-3 hours, the Incredimazing servers surely would’ve been crushed under the load.
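To put those numbers in perspective, here is a quick back-of-the-envelope sketch of the average rate in each scenario. The helper name `average_rate` is made up for illustration, and these are averages of pageviews only: real traffic is burstier, and each pageview also triggers requests for images and other static files.

```ruby
# Average pageviews per second when a given number of pageviews
# is spread evenly over a given number of hours.
def average_rate(pageviews, hours)
  pageviews / (hours * 3600.0)
end

PAGEVIEWS = 550_000

# The actual 24-hour spread, then the hypothetical 3-hour and 2-hour spikes.
[24, 3, 2].each do |hours|
  rate = average_rate(PAGEVIEWS, hours)
  puts format("%d pageviews over %2d h = %.1f pageviews/sec", PAGEVIEWS, hours, rate)
end
```

Spread over a full day, that works out to a little over 6 pageviews per second; squeezed into 2-3 hours it would be roughly 50 to 76 per second, which is a very different kind of load.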

Update note: had it been necessary, which it was not, enabling Rails Page Caching (see below) would have rendered the entire scalability question moot. nginx would’ve served many pages straight from disk, without ever touching the Mongrel/Rails stack. It’s been benchmarked at 250 to 330 requests per second (10M+ per day), which was clearly unnecessary in this case anyway.
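For anyone unfamiliar with the idea, page caching boils down to writing the rendered HTML into the public directory once, so the web server can hand it out as a plain static file from then on. The following is only a sketch of that idea in plain Ruby, not Rails' actual implementation: the helper names and the `/images/42` path are hypothetical, and it assumes a modern Ruby rather than the 1.8.4 used on this site.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: map a request path such as "/images/42"
# to a file such as "<public_dir>/images/42.html".
def cache_file_for(public_dir, request_path)
  relative = request_path.sub(%r{\A/+}, '')
  relative = 'index' if relative.empty?
  File.join(public_dir, "#{relative}.html")
end

# Hypothetical helper: store rendered HTML where a web server could find it.
def write_page_cache(public_dir, request_path, html)
  file = cache_file_for(public_dir, request_path)
  FileUtils.mkdir_p(File.dirname(file))
  File.write(file, html)
  file
end

# Hypothetical helper: return the cached file for a path, or nil on a miss.
def cached_page(public_dir, request_path)
  file = cache_file_for(public_dir, request_path)
  File.file?(file) ? file : nil
end

# Usage: the first lookup misses, so the app would render and store the page;
# every later lookup finds the file and never needs the app.
Dir.mktmpdir do |public_dir|
  puts cached_page(public_dir, '/images/42').inspect
  write_page_cache(public_dir, '/images/42', '<html>rendered once</html>')
  puts File.read(cached_page(public_dir, '/images/42'))
end
```

The sketch does not sanitize paths or expire stale files; a real setup needs both, plus a web server rule that checks for the static file before proxying to the app.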

Still, people are always wondering (and probably always will be): can Rails scale?

The answer will always be amorphous, but at least I can give you some cold hard facts about withstanding a Digging of this nature.

The Hardware

The domain is hosted at LayeredTech on a dedicated two-server (DB + Web/App) setup. Several other sites can comfortably be hosted all on the same setup. The total setup runs about $370 per month.

Web/App Server

  • Cost: $127 / mo.
  • CPU: Intel P4 – 2.8GHz
  • Memory: 2GB
  • Bandwidth: 1500GB (the box rarely uses more than 25% each month)
  • Uplink Port: 10Mbps
  • Hard drive: 80GB x2
  • OS: Fedora Core 4

DB Server

  • Cost: $242 / mo.
  • CPU: AMD Single CPU Dual Core Athlon 3800
  • Memory: 2GB
  • Bandwidth: 2000GB
  • Hard drive: 500 GB x2 in SATA RAID 1
  • Uplink Port: 100Mbps
  • OS: CentOS 4.x (x86_64, 64-bit)

Both have dual NICs and are connected across a private switch.

Software Versions Used

  • Ruby: 1.8.4
  • Rails: 2.0.2
  • MySQL: 5.0.27 standard
  • nginx (web server): 0.4.13 (built by gcc 4.0.2)
  • mongrel: 1.0.1
  • mongrel_cluster: 0.2.1

nginx – the Little Webserver that Could

nginx is a fantastic little web server developed by Igor Sysoev – and despite taking a hammering like this one, its memory usage rarely got above 10-20MB (total).

If you’re still not convinced (not that I’m trying to win converts here), Ezra Zygmuntowicz, author of Deploying Rails Applications, recommends it highly (and uses it extensively at EngineYard).

Of course, Apache 2, Lighttpd, etc. would most likely have handled the load just as well, so long as they were set up and configured properly.

Rails 2.0 – Better ActiveRecord, etc. Performance?

I have no scientific evidence supporting this, but since making the switch to Rails 2.0, my sites have seemed a lot more stable and zippy. (this blog excluded – it sits lower on the totem pole!)

I believe some benchmarks of Rails 2.0 have shown some modest improvements over 1.0. My recommendation: update early & often—in about a day several of my sites were converted over to 2.0 with minimal issues.

Mongrel – The Legacy of Zed Shaw

1/24/08: Hmmm… the rest of this article must’ve been nuked somehow while editing it in Mephisto. Sorry!

Comments
  1. Yaacov January 07, 2008 @ 10:07 AM

    Great point on how little servers actually cost. I had a landscaping business and my truck alone cost me over $500 a month. Laying out $100 a month or even using EC2 should be a starting point for any new developer.

    Say no to shared hosting!

  2. thomas lackner January 07, 2008 @ 11:20 AM

    You also have to consider the kind of page views that are involved; if there is dynamic graphing, lots of database access over many gigabytes of data, commenting, etc., 550k hurts a lot more than viewing a few images.

    ObDisclaimer: I’m definitely not trying to minimize your achievement or this post, because I think reports from the battlefield of high traffic websites are very important. I’m just trying to broaden the conversation.

  3. Shanti Braford January 07, 2008 @ 11:46 AM

    thomas – very true indeed.

    The ‘view image’ page made about 6 relatively inexpensive SQL calls. Adding a few more for comment threading probably wouldn’t have been too bad, for a young site.

    Once a site has as many comments & content as digg/reddit/etc, I’m sure it’s much more painful to scale to 550k!

  4. Hugues Lamy January 07, 2008 @ 09:11 PM

    Many thanks on the description of your configuration. I was guessing that Mongrel (about 4 of them) + NginX were the best RoR configuration possible, but I could not find any real operation data to back this. I’ll keep this blog handy.

  5. Trophaeum January 08, 2008 @ 03:00 AM

    Sorry but anything under 100 requests/second for a dual server setup like the above is nothing special and if rails isn’t able to realistically hit this point then isn’t this a bad sign?

    NginX and fastcgi php can handle this without breaking a sweat, done it many times before

    not impressed

  6. Lyndon January 08, 2008 @ 03:04 AM

    Why does the DB server have more bandwidth than the App server? Surely the DB server uses unmetered LAN bandwidth, never taking requests over the internet?

  7. Shanti Braford January 08, 2008 @ 03:58 AM

    @Trophaeum – the article simply laid out the facts as they happened.

    The goal was never to test ruby, rails, mongrel or nginx beyond the actual traffic that the site was getting. See my latest post on economics of scalability—at $3 CPMs, this setup costs less than 1% of revenues that it can comfortably support. (who knows, the setup probably could’ve handled 1-2M pageviews in a day if that much traffic was really thrown at it, and steps were taken to adjust appropriately)

    BTW – your site is currently showing this error on its homepage:

    Strict Standards: Declaration of PropelPDO::prepare() should be compatible with that of PDO::prepare() in /var/www/vhosts/trophaeum.com/site/propel/runtime/classes/propel/util/PropelPDO.php on line 0

  8. Joe January 08, 2008 @ 07:13 AM

    You link to layeredtech.net but it should be .com

  9. Peter Cooper January 08, 2008 @ 08:11 AM

    Interesting to see the numbers! A couple of things stick out though..

    1) The uplink on the DB server is 100Mbps but on the Web server only 10Mbps? Seems back to front from what you’d usually need (especially as you have a private network, so front end uplink isn’t key).

    2) Your provider provides pretty low specs for the rates. Either that, or you’ve had these servers quite a long time, back when those were reasonable rates. For the past couple of years I’ve used SoftLayer (though I still have one over at ThePlanet) and I have a quad core 2.4GHz / 4GB RAM / dual 500GB drive machine for about $220 a month (with the private networking, CPanel, 100Mbps uplink, 2TB bandwidth, etc).

    3) You say page caching was unnecessary in this case, but also suggest that if the load came over a 3 hour period, rather than 24, things would have got ugly. Looking at the page in question, page caching would have probably taken away 99% of any potential load issues even if a million pageviews hit you in an hour to that page. (Even doing a hacky type “self” caching thing by making the folder and putting in the HTML file yourself once you noticed the Digging would have worked)

    Anyway, you prove your point well. Good post :)

  10. John January 08, 2008 @ 12:05 PM

    Just a friendly warning, we used to host at layeredtech and have recently left due to bad support. We had a disk crash and they made several errors in the recovery. Unfortunately, this is not the first time we’ve had problems, so we’ve moved our production servers.

    This highlights a problem with all co-location, server farms. Good connectivity, bandwidth and server configuration are normally what we look for in a hosting provider. What’s really important is the level of service when it goes wrong.

  11. Shanti Braford January 08, 2008 @ 03:02 PM

    @Peter – I will have to check out SoftLayer.

    Yes, these servers are indeed very old!

  12. Alex Popescu January 08, 2008 @ 03:36 PM

    Interesting post. I am still a bit confused about what the dynamic parts of the page in question are. (Or are these numbers related to the whole site? In that case a breakdown of that number would be interesting too.) At a quick look, that page looks pretty static, so serving it through a web server would be extremely quick. I am pretty sure there must be something dynamic in it.

    cheers, ./alex

    .w( the_mindstorm )p.

  13. Shanti Braford January 08, 2008 @ 06:08 PM

    @Alex -

    that was the page that was originally dugg, but as you can see people then clicked around on the site quite a bit. That page got about 80k views, similar ‘show image’ pages probably around 300-400k.

    The ‘Home’ view which pulls the top images, along with the ‘Popular’ view, received the rest of the views.

    The issue with whole Page Caching is that the site also offers the ability to log in, upload images, etc. It would be fine to page cache certain pages under heavy load, but generally the mongrel/rails stack needs to be hit in order to determine whether a user is logged in or not.

    Queries used for each image pull: ContentItem pull, tags pull, ContentItem.user pull—these can be lumped into a single query using rails’ :include option of course.

  14. Aníbal Rojas January 09, 2008 @ 06:05 AM

    Anyway, coding the appropriate location directives in the nginx.conf to resolve directly to public/ and adding a little bit of caching will surely improve performance, maybe up to the level of getting rid of one server without much effort. Is this change really required? It just depends on your economic model, priorities, etc.

    Off Topic: Please join RubyCorner.com, a directory for blogs related to the Ruby Programming Language or any of the related technologies and projects, you will help to build a stronger community.

  15. Ayyanar January 17, 2008 @ 12:53 AM

    Which site are you talking about? Is it digg.com where you tested the scalability? I think digg.com is done in PHP