Thursday, October 09, 2008

Linux servers: swap, memory and how much of each

When you set up a Linux server, you need to carefully consider the amount of RAM to buy and swap space to use. Back in the old days, folks would simply suggest setting the swap amount to twice the amount of RAM. Nowadays, as RAM is cheap, I see some people suggesting to not use swap at all. Here's my take on it. Let's use this top output from one node of our 3-node web cluster.



1. This box has 3.3G (let's just round the numbers for clarity) of available RAM. Let's find out if it is enough.

2. 2G of swap space is allocated, which is less than 1x RAM. As a general rule, I try to not allocate more than 2G of swap space for file servers *. I've seen (and still occasionally see) some of our servers that simply swap themselves to death (requiring a hard reset) as the disks become overwhelmed and the server unresponsive. The Kernel OOM killer does a fantastic job of keeping the box alive if it runs out of memory, so let it run out. If it's swapping constantly, there's a problem anyway, and more swap space isn't the solution.

3. About 2G RAM currently used. High RAM usage is not a bad thing, as the OS uses RAM for file cache for increased performance. Actually, unused RAM is wasted RAM in my book.

4. About 300M of swap currently used. Again, a high number is not a bad thing -- as long as it's not actively swapping (see below). Most decent OSes are smart enough to move memory pages out to swap once they've sat idle for a while. This is a good thing, and a reason why some swap is good -- it frees RAM held by sleeping processes for active file cache duty.

5. About 880M of free RAM. In our case, we use the Apache prefork MPM, so some free memory is required to handle traffic spikes, where additional Apache processes would be spawned. However, I'd prefer this 880M go to the file cache -- if a traffic spike occurs, the memory manager will simply 'take back' memory from the file cache and allocate it to Apache. File cache memory is not committed (and lost) permanently.

6. About 74M of file buffers. These web servers don't write to disk much (except for Apache logs).

7. About 1.4G of file cache. Our entire web space is about 6G on disk, so it's caching a large portion (roughly a quarter) in RAM, which likely represents the most active web pages. In fact, if more web files were used actively, Linux would commit that 880M of free RAM to file cache, but in this case, it doesn't need to. The more active data you can get into file cache, the less often the server needs to go to disk.

8. 0.3% of CPU time is spent waiting for I/O (disk, network, etc). This is a good indicator that we're not actively swapping to disk, nor are we reading much web content from disk.
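If you'd rather not eyeball top, the same numbers can be pulled from /proc/meminfo. Here's a minimal Python sketch of the arithmetic behind items 2 to 7. The function names and the sample values are my own invention for illustration (rounded to resemble the box above, not copied from it), and treating Cached as fully reclaimable is a simplification, since it also counts tmpfs and shared memory.

```python
# Hypothetical sample in /proc/meminfo format; values are made up for illustration.
SAMPLE_MEMINFO = """\
MemTotal:        3400000 kB
MemFree:          880000 kB
Buffers:           74000 kB
Cached:          1400000 kB
SwapTotal:       2000000 kB
SwapFree:        1700000 kB
"""


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Compute the figures discussed above, all in kB."""
    swap_used = fields["SwapTotal"] - fields["SwapFree"]
    # Free RAM plus buffers and file cache: memory the kernel can hand to
    # new processes without swapping (an approximation, see the lead-in).
    available = fields["MemFree"] + fields["Buffers"] + fields["Cached"]
    return {
        "swap_used_kb": swap_used,
        "file_cache_kb": fields["Cached"],
        "free_plus_cache_kb": available,
    }


summary = summarize(parse_meminfo(SAMPLE_MEMINFO))
for name, value in summary.items():
    print(f"{name}: {value // 1024}M")
```

On a real box, pass the contents of /proc/meminfo to parse_meminfo() instead of the sample string.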


How do I know if my server is actively swapping?



Use the vmstat tool, and look at the si and so (swap in and swap out) numbers. vmstat 5 will report stats every 5 seconds until interrupted, so just sit there and watch it go. If si/so are greater than zero for extended periods of time, then your server is actively swapping pages of RAM to/from disk. In the very short 2-second sample above, we can see the following:

1. Despite having 302196K (about 300M) in swap, nothing is being traded between swap and RAM right now: si and so are both 0. That means the swap contains inactive, sleeping data, and that's what we want it for.

2. bi and bo represent disk I/O: Blocks in (read from disk) and Blocks out (written to disk). Normally this is expressed in 1K increments, so you can see here that this server's disks are basically idle, reading 14K and writing 1K in the first sample, and zero disk activity 2 seconds later. Obviously, high si and so will also lead to high bi and bo.
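If you don't feel like sitting there watching vmstat, the "greater than zero for extended periods" check is easy to script. This is a rough Python sketch, not a monitoring tool: parse_vmstat, actively_swapping and the min_consecutive threshold of 3 samples are names and a cutoff I made up, and the sample text is invented for illustration.

```python
# Hypothetical vmstat output; the numbers are made up for illustration.
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 302196 880000  74000 1400000    0    0    14     1   10   20  1  0 99  0
 0  0 302196 880000  74000 1400000    0    0     0     0   10   20  1  0 99  0
"""


def parse_vmstat(text):
    """Return one dict per numeric sample line, keyed by vmstat column name."""
    header = None
    samples = []
    for line in text.splitlines():
        cols = line.split()
        if "si" in cols and "so" in cols:
            header = cols  # the column-name line (vmstat repeats it periodically)
            continue
        if header and len(cols) == len(header) and all(c.isdigit() for c in cols):
            samples.append(dict(zip(header, map(int, cols))))
    return samples


def actively_swapping(samples, min_consecutive=3):
    """True if si or so stays above zero for min_consecutive samples in a row."""
    run = 0
    for sample in samples:
        if sample["si"] > 0 or sample["so"] > 0:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0
    return False


samples = parse_vmstat(SAMPLE_VMSTAT)
print("actively swapping:", actively_swapping(samples))
```

Keep in mind that the first line vmstat prints is an average since boot, so when feeding it real output you may want to drop the first sample before checking.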

Conclusion

Do set some swap space on your Linux server (and desktop), even if it's just 256M. Don't allocate 16G of swap for a basic file/web/mail server *, otherwise it may end up swapping itself to death. Let the OOM killer do its job, and add RAM if you're swapping to disk regularly. Large application servers may benefit from more swap, as inactive/sleeping applications can be paged to disk to free up RAM for file cache or other applications.


* I consider an HTML/PHP webserver such as www.eclipse.org to be nothing more than a file server in the context of RAM/swap/disk I/O.


Edited: 4:33pm ET. Nathan pointed out a typo (who knew he could spell!)

1 Comment:

Anonymous Matthew Mastracci said...

One strange consequence of having no swap is that process execution from Java will randomly fail with out of memory errors. We fought that one for a while until adding a minimal amount of swap after which it went away.

Very strange.

11:44 AM  
