Monday, May 11, 2009

Optimizing SQL: maximizing query cache

Today's topic is focused on getting the most out of MySQL's query cache.  Query cache isn't on by default, so if your servers aren't using it, you're missing out.

Let's look at how the query cache works.  Witness a query I've used before:
select translation_id from translations where string_id in
(select string_id from strings where file_id = 35) and language_id = 2;
126 rows in set (3.08 sec)

If I run the exact same query again, the performance improvement is dramatic:
126 rows in set (0.00 sec)

So the query cache maintains a list of queries and their resultset, but add a simple space to the query, or capitalize the 's' in Select, and to MySQL, it's a whole new query.  Furthermore, if the data is changed in any of the tables referenced by the query, the query and its resultset is cleared from cache and the query must be fully processed.

Don't be fooled by functions!


The following query, which returns changes that occurred yesterday and earlier, references the ever-changing value of NOW(), and will never be cached:
select translation_id from translations where string_id in
(select string_id from strings where file_id = 35) and created_on < now();

Instead, use a static value if you can:
select translation_id from translations where string_id in
(select string_id from strings where file_id = 35) and created_on < "2009-05-11";

Avoid costly updates


If your data changes so often that it's killing the effect of query cache, consider using separate tables for lookups (that get infrequent batched updates) and for updates.

As an alternative, if you're using a Master-Slave setup, configure your application to use the slave for SELECT queries and then configure low_priority_updates. A poor man's solution to preventing data from changing on your slave (to enhance query cache hits) is to cycle SLAVE STOP and SLAVE START commands with sleep intervals in between, so that data isn't constantly being written to.

At this stage, perhaps an even better (although more complex) solution would be to enable cache for  the application to avoid touching the database altogether (memcached on PHP, squid for Apache, and so on.)

Of course, all of these solutions have drawbacks, so use them at your discretion.

Pre-cache large queries


The Dash project's Project Dashboards use a large MySQL database on a very small server.  Once a week all kinds of information is gathered from CVS and SVN and all the tables are rebuilt.  While once-a-week updates are great for query cache, the first hits to any given page can take a long time to run.

The inexpensive solution was to incorporate a series of wget commands into the script that rebuilds the data, so that the queries on some of the frequently accessed pages can be cached immediately after the build, thus rendering instantly for everyone else.

Alternatively, wget's output pages could simply be saved to static html (such as is the case for the Bugzilla homepage, with its top-10 queries) for even greater gains.

MySQL's query cache is a simple and effective way to squeeze performance out of your database. It is by no means the solution to all your problems, so use accordingly.

Up next: simple text search

0 Comments:

Post a Comment

<< Home