You are here

To SQL_CALC_FOUND_ROWS or not to SQL_CALC_FOUND_ROWS?

MySQL Performance Blog - Tue, 28/08/2007 - 9:49pm

When we optimize clients' SQL queries I pretty often see a queries with SQL_CALC_FOUND_ROWS option used. Many people think, that it is faster to use this option than run two separate queries: one - to get a result set, another - to count total number of rows. In this post I'll try to check, is this true or not and when it is better to run two separate queries.

For my tests I've created following simple table:

PLAIN TEXT SQL:
  1. CREATE TABLE `count_test` (
  2.   `a` int(10) NOT NULL AUTO_INCREMENT,
  3.   `b` int(10) NOT NULL,
  4.   `c` int(10) NOT NULL,
  5.   `d` varchar(32) NOT NULL,
  6.   PRIMARY KEY  (`a`),
  7.   KEY `bc` (`b`,`c`)
  8. ) ENGINE=MyISAM

Test data has been created with following script (which creates 10M records):

PLAIN TEXT PHP:
  1. mysql_connect("127.0.0.1", "root");
  2. mysql_select_db("test");
  3.  
  4. for ($i = 0; $i <10000000; $i++) {
  5.     $b = $i % 1000;
  6.     mysql_query("INSERT INTO count_test SET b=$b, c=ROUND(RAND()*10), d=MD5($i)");
  7. }

First of all, let's try to perform some query on this table using indexed column b in where clause:

PLAIN TEXT SQL:
  1. mysql> SELECT SQL_NO_CACHE SQL_CALC_FOUND_ROWS * FROM count_test WHERE b = 555 ORDER BY c LIMIT 5;

Results with SQL_CALC_FOUND_ROWS are following: for each b value it takes 20-100 sec to execute uncached and 2-5 sec after warmup. Such difference could be explained by the I/O which required for this query - mysql accesses all 10k rows this query could produce without LIMIT clause.

Let's check, how long it'd take if we'll try to use two separate queries:

PLAIN TEXT SQL:
  1. mysql> SELECT SQL_NO_CACHE * FROM count_test WHERE b = 666 ORDER BY c LIMIT 5;

The results are following: it takes 0.01-0.11 sec to run this query first time and 0.00-0.02 sec for all consecutive runs.

And now - we need too check how long our COUNT query would take:

PLAIN TEXT SQL:
  1. mysql> SELECT SQL_NO_CACHE count(*) FROM count_test WHERE b = 666;

Result is really impressive here: 0.00-0.04 sec for all runs.

So, as we can see, total time for SELECT+COUNT (0.00-0.15 sec) is much less than execution time for original query (2-100 sec). Let's take a look at EXPLAINs:

PLAIN TEXT SQL:
  1. mysql> EXPLAIN SELECT SQL_CALC_FOUND_ROWS * FROM count_test WHERE b = 999 ORDER BY c LIMIT 5;
  2. +----+-------------+------------+------+---------------+------+---------+-------+-------+-------------+
  3. | id | select_type | TABLE      | type | possible_keys | KEY  | key_len | ref   | rows  | Extra       |
  4. +----+-------------+------------+------+---------------+------+---------+-------+-------+-------------+
  5. |  1 | SIMPLE      | count_test | ref  | bc            | bc   | 4       | const | 75327 | USING WHERE |
  6. +----+-------------+------------+------+---------------+------+---------+-------+-------+-------------+
  7. 1 row IN SET (0.00 sec)
  8.  
  9. mysql> EXPLAIN SELECT SQL_NO_CACHE count(*) FROM count_test WHERE b = 666;
  10. +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
  11. | id | select_type | TABLE      | type | possible_keys | KEY  | key_len | ref   | rows | Extra       |
  12. +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
  13. |  1 | SIMPLE      | count_test | ref  | bc            | bc   | 4       | const | 5479 | USING INDEX |
  14. +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
  15. 1 row IN SET (0.00 sec)

Here is why our count was much faster - MySQL accessed our table data when calculated result set size even when this was not needed (after the first 5 rows specified in LIMIT clause). With count(*) it used index scan inly which is much faster here.

Just to be objective I've tried to perform this test without indexes (full scan) and with index on b column. Results were following:

  1. Full-scan:
    • 7 seconds for SQL_CALC_FOUND_ROWS.
    • 7+7 seconds in case when two queries used.
  2. Filesort:
    • 1.8 seconds for SQL_CALC_FOUND_ROWS.
    • 1.8+0.05 seconds in case when two queries used.

So, obvious conclusion from this simple test is: when we have appropriate indexes for WHERE/ORDER clause in our query, it is much faster to use two separate queries instead of one with SQL_CALC_FOUND_ROWS.