Thursday, April 10, 2008

Tuning busy Linux boxes

Anyone with a bit of skill can put together a fast Linux box for serving files and databases. Hook up thousands of users accessing 500G of data with hundreds of SQL queries/second and you have a challenge.

You don't need fancy tools for finding bottlenecks, as they typically occur in four areas: CPU, RAM, Disk and Network.

Here's a sample of 'top' taken from our MySQL and NFS master. This busy box is a quad-CPU IBM POWER 5 box with 16G RAM, and two RAID-5 arrays (one 8-disk 15K RPM and one 7-disk 10K RPM) on separate RAID controllers:

(There is a picture here)

Although this box is currently very fast and responsive, it's good to identify the first bottleneck before it becomes a problem. Let's investigate:

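1. A load average of 8.18 is not necessarily bad, considering the box has eight logical processors (through the magic of SMT). Even a high load doesn't necessarily mean your CPUs are too slow, because on Linux the load average also counts processes that are blocked waiting on disk I/O.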

2. On average, about 62% of CPU time is spent waiting for I/O. This is not good, and it's our biggest bottleneck. Unlike plain idle time, a high 'wa' value means there is work the CPUs could be doing, but the processes that need them are stalled until their I/O completes.

3. 16G RAM with 16G used is not necessarily a sign of low RAM. A busy Linux box *should* use all its RAM for disk cache and buffers. On this box, 3000252 (3G) of RAM is caching files from disk.

4. The MySQL process is taking up a whopping 6.1G of RAM. It may be a bit too aggressive for this box, as some of that RAM could go towards disk cache.

5. Of the 8 NFS daemons, only one is able to Run (R), while the others are blocked in I/O wait (D). These blocked daemons are the source of our high I/O wait time from item 2. MySQL and the LDAP server are happily Sleeping (S), waiting for work.
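If you'd rather pull the same numbers without squinting at 'top', the items above can be sketched in a few lines of Python that read the /proc files 'top' itself relies on. This is only a rough sketch, assuming a Linux kernel recent enough to report iowait in /proc/stat; the function names (cpu_times, iowait_percent, cached_kb, process_state, sample_live) are made up for illustration and are not part of any tool.

```python
import time


def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples (item 2)."""
    # Field order: user nice system idle iowait irq softirq steal ...
    # Only the first eight are summed, since guest time is already in user.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


def cached_kb(meminfo_text):
    """Return the page cache size in kB from /proc/meminfo text (item 3)."""
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key == "Cached":
            return int(rest.split()[0])
    raise ValueError("no 'Cached' line found")


def process_state(pid_stat_text):
    """Return the state letter (R, S, D, ...) from /proc/<pid>/stat text (item 5)."""
    # The command name is parenthesised and may itself contain spaces or
    # parentheses, so take whatever follows the last ')'.
    return pid_stat_text.rsplit(")", 1)[1].split()[0]


def sample_live(interval=1.0):
    """Sample the running system (Linux only): returns (iowait %, cached kB)."""
    with open("/proc/stat") as f:
        before = cpu_times(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = cpu_times(f.read())
    with open("/proc/meminfo") as f:
        cached = cached_kb(f.read())
    return iowait_percent(before, after), cached
```

Calling sample_live() on the box in question returns the I/O wait percentage over the sampling interval and the current size of the disk cache, while process_state() can be fed the contents of /proc/<pid>/stat for each NFS daemon to see which ones are stuck in D.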

One could conclude that this box needs faster disks (or more of them), but in fact it needs RAM, and lots of it. It's spending too much time going to disk for files when they could be cached, and at 3G the disk cache is not nearly enough for 500G of actively used data. An alternative to more RAM would be to free that 6G occupied by MySQL by moving it to another box.

An extra 16G of RAM (32G total) would go a long way in decreasing disk I/O. Having a total of 56G of RAM would be ideal -- 50G of disk cache (10% of the active data size) and 6G for the monstrous MySQL process.
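The arithmetic behind that 56G figure is simple enough to write down. Here's a small sketch, where the 10% cache ratio is the rule of thumb used above rather than any hard law, and ram_target_gb is a hypothetical helper name:

```python
def ram_target_gb(active_data_gb, cache_percent, reserved_gb):
    """Ideal RAM: a disk cache sized as a percentage of the active data,
    plus RAM reserved for large processes (here, MySQL)."""
    return active_data_gb * cache_percent / 100 + reserved_gb


installed_gb = 16
target_gb = ram_target_gb(active_data_gb=500, cache_percent=10, reserved_gb=6)
shortfall_gb = target_gb - installed_gb

print(f"target {target_gb:.0f}G, installed {installed_gb}G, short by {shortfall_gb:.0f}G")
```

Plugging in a different ratio or data size shows quickly how far a given upgrade gets you toward the ideal.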


Anonymous Michael Scharf said...

Usually it is seek time that kills the performance of disks. Maybe a big fast flash drive could help.....

6:23 PM  
