Configuring Debian for Ruby on Rails
May 15th, 2007
After configuring server after server over the years, I've settled on Debian as my "distro of choice". Some time ago, I started keeping track of the commands I'd routinely type when setting up a new server, and those commands have morphed into a Perl script which I now use instead. This script is designed to take an out-of-the-box Debian 4.0 standard installation and turn it into a lean, mean Apache-Mongrel-MySQL-Rails serving machine with minimal effort.
Getting and using the script
This script assumes you have a standard Debian 4.0 installation (with networking already set up) and that you are logged in as root. If you choose to use this script, you do so at your own risk. To run it, type:
debian:~# wget http://svn.bountysource.com/fishplate/scripts/debian_install.pl
debian:~# perl debian_install.pl
Enter full hostname (example: server1.yourdomain.com): server1.mydomain.com
Install ruby, rails, gem, etc (y,n) [Y]?
Install mysql server (y,n) [Y]?
Install apache (y,n) [Y]?
Install mongrel/mongrel_cluster (y,n) [Y]?
Reboot when done (y,n) [Y]?
Then sit back and watch as the script sets up everything for you and then reboots!
What the script does
- updates the core Debian packages and sets up a crontab entry to do so once a week
- installs sshd just in case it's not there already
- installs some common packages all servers should have (compile tools, dnstools/traceroute, rsync, subversion, mysql client, etc)
- sets up the hostname
- installs exim4 for sending mail
- sets the server's timezone to UTC and sets up a crontab to sync the time once a day
- optionally installs ruby, rubygems, irb, rails, imagemagick, and a few other common gems
- optionally installs mysql-server
- optionally installs apache and enables some common apache modules
- optionally installs and configures mongrel and mongrel_cluster
- optionally reboots
Plans for the future
Without a doubt, I'll be making enhancements to this script over time as I try to automate repetitive tasks I find myself doing again and again. This is just my take on the "ideal Debian setup". Feel free to suggest any changes/enhancements.
Apache Tuning: MaxClients and Keep-Alive
May 14th, 2007
When a post about statisfy.net made it to Digg's front page, we experienced an onslaught of traffic (shocker!). Within a few minutes of getting dugg, I noticed that Apache would sometimes disregard requests entirely. If I reloaded, it would work fine. This post goes through what I learned about how Apache handles requests and what I changed to allow more requests to be handled.
MaxClients
I started digging around on the server and found the Apache error log spitting out this message over and over:
[error] server reached MaxClients setting, consider raising the MaxClients setting
After some quick research (thanks, Google and Aaron from RHG), I learned a few things about how Apache handles requests. Out of the box, Apache is set up to have at most 16 child processes (ServerLimit). Each of these child processes runs 25 threads (ThreadsPerChild). Multiply these two numbers and Apache should be able to handle 400 simultaneous requests. However, Apache also has an out-of-the-box limit of 150 simultaneous requests (MaxClients). At first, I changed just MaxClients to 400, but then I figured "more is better", right? So I upped ServerLimit to 20 and set MaxClients to 500. My Apache config now had a section that looked like this:
<IfModule mpm_worker_module>
ServerLimit 20
StartServers 5
MaxClients 500
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
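The arithmetic behind these settings can be sketched in a few lines of Python. This is a simplified model, not Apache's actual logic: it assumes the effective ceiling is just the smaller of ServerLimit x ThreadsPerChild and MaxClients, and the function name worker_capacity is made up for illustration.

```python
def worker_capacity(server_limit: int, threads_per_child: int, max_clients: int) -> int:
    """Rough ceiling on simultaneous requests under the worker MPM.

    Simplified assumption: the ceiling is the smaller of the total
    thread count and the MaxClients setting.
    """
    thread_ceiling = server_limit * threads_per_child
    return min(thread_ceiling, max_clients)


# Out-of-the-box values described in the post: 16 x 25 threads, capped at 150.
print(worker_capacity(16, 25, 150))
# Raising only MaxClients to 400 uses all 16 x 25 threads.
print(worker_capacity(16, 25, 400))
# The final config: 20 x 25 threads with MaxClients 500.
print(worker_capacity(20, 25, 500))
```

In this model, raising MaxClients past 400 has no effect unless ServerLimit (or ThreadsPerChild) goes up too, which is why both settings changed together.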
After restarting Apache, the load average immediately doubled. My first reaction was that I broke something, but I went to statisfy.net, reloaded a few times, and (to my surprise) watched it load right away without any problems. Aaron suggested that "the load average doubled because you're now handling more requests... before you were just discarding them"... seems logical.
Keep-Alive optimization
After a while, I noticed that Apache would still hit the 500 MaxClients limit from time to time. I set up mod_status so I could watch what was actually happening at the Apache level, and I noticed that probably 70% of the threads were in a Keep-Alive state. This means that the server isn't actually processing a request but rather just waiting for another request from that user. I tried disabling Keep-Alive entirely and the simultaneous connections problem went away, but the browser experience was noticeably slower, so a better solution was needed. I tried playing around with the Keep-Alive timeout, but I couldn't find a good balance between simultaneous connections and browser experience.
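For anyone who wants a number rather than an eyeball estimate, here is a minimal Python sketch that measures the Keep-Alive share from a mod_status scoreboard string (the line of characters shown on the status page, where "K" means Keep-Alive, "_" means waiting for a connection and "." is an open slot). The function name and the sample string are made up for illustration, and counting the share against busy workers only is an assumption about how to measure it.

```python
from collections import Counter


def keepalive_share(scoreboard: str) -> float:
    """Fraction of busy workers that are sitting in Keep-Alive.

    Assumption: "busy" means any slot that is not idle ("_")
    and not an open slot (".").
    """
    counts = Counter(scoreboard)
    busy = sum(n for state, n in counts.items() if state not in "_.")
    if busy == 0:
        return 0.0
    return counts["K"] / busy


# Hypothetical scoreboard: 7 Keep-Alive, 2 sending replies, 1 reading,
# 2 idle workers and 2 open slots.
sample = "KKKKKKKWWR__.."
print(f"{keepalive_share(sample):.0%} of busy workers are in Keep-Alive")
```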
Statisfy has two distinct use cases for people requesting files from the server. The first use case is the user who is actually on the statisfy.net site watching statistics go by. The second use case is a user visiting a remote site with the statisfy.net JavaScript embedded, whose browser hits the statisfy.net server to record their statistics. In this second use case, there are (almost) always two requests made and no more beyond that. The first request is for the static file stats.js and the second request is for /stats/record?blah_blah_blah. I realized that if a client requests /stats/record, we can safely close their connection rather than doing Keep-Alive, because chances are they won't be making another request for a little while.
To accomplish this, I enabled mod_headers and mod_setenvif and added the following settings to my Apache config:
SetEnvIf Request_URI "/stats/record" CloseConnectionAfterRequest
Header set Connection "close" env=CloseConnectionAfterRequest
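SetEnvIf treats its second argument as a regular expression that can match anywhere in the request path, so it is worth checking a pattern against the paths you expect before restarting Apache. Below is a rough sketch using Python's re module as a stand-in. Python's regex engine is not the one Apache uses, but for a plain literal pattern like this one the behavior should be the same. The helper name would_close and the path /stats.js are assumptions for illustration.

```python
import re


def would_close(pattern: str, request_path: str) -> bool:
    """True if an unanchored regex match would set the env variable."""
    return re.search(pattern, request_path) is not None


pattern = "/stats/record"

# Request_URI excludes the query string, so only the path is tested.
for path in ("/stats/record", "/stats.js", "/"):
    print(path, would_close(pattern, path))

# A pattern that is off by one letter silently matches nothing.
print(would_close("/stat/record", "/stats/record"))
```

Only the recording request should come back True. Everything else keeps Keep-Alive, which is the whole point.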
I restarted Apache and watched mod_status for a while and noticed we went from close to 500 simultaneous connections down to about 100. Furthermore, the browser experience wasn't affected at all because Keep-Alive would be active for the browsers that were actually going to use it. Success!!!