You are here

How much overhead is caused by on disk temporary tables

MySQL Performance Blog - Thu, 16/08/2007 - 10:19pm

As you might know while running GROUP BY and some other kinds of queries MySQL needs to create temporary tables, which can be created in memory, using MEMORY storage engine or can be created on disk as MYISAM tables. Which one will be used depends on the allowed tmp_table_size and also by the data which needs to be put to temporary tables - BLOB/TEXT columns are not supported with MEMORY storage engine so must use on disk MyISAM temporary table.

What i decided to do is to see how much overhead do on disk temporary tables cause compared to MEMORY tables. To have things comparable I used medium size table and types which can be presented both in MEMORY and MyISAM storage engine.

The benchmarks are done with MySQL 5.0.45 on 32bit Fedora Core 6 on Pentium 4 box - something I had available for testing.

PLAIN TEXT SQL:
  1. CREATE TABLE `gt` (
  2.   `i` int(10) UNSIGNED NOT NULL,
  3.   `c` char(50) NOT NULL
  4. ) ENGINE=MyISAM DEFAULT CHARSET=latin1
  5.  
  6.  
  7. mysql> SELECT * FROM gt LIMIT 5;
  8. +-------+------------------------------------------+
  9. | i     | c                                        |
  10. +-------+------------------------------------------+
  11. | 25451 | 80c9a87595687ccb33fa525d396ee75658aec777 |
  12. | 48063 | 05bdcc2c1bb051cff861cee1ffcb7f680922b13e |
  13. | 84146 | afdf7dbdf7a71dbeac553a0cccce4ab3b1ca3bdb |
  14. | 66197 | a26f4ce41df9b79f8ce73f8a41c207c4b67f827d |
  15. |   432 | 5f912d13f2d415c2c7cb325e1ebbdf468b37bc91 |
  16. +-------+------------------------------------------+
  17. 5 rows IN SET (0.00 sec)

The table contains 1.000.000 rows with mostly unique "c" column.

PLAIN TEXT SQL:
  1. mysql> EXPLAIN SELECT count(*) cnt,c FROM gt GROUP BY c ORDER BY NULL LIMIT 10;
  2. +----+-------------+-------+------+---------------+------+---------+------+---------+-----------------+
  3. | id | select_type | TABLE | type | possible_keys | KEY  | key_len | ref  | rows    | Extra        |
  4. +----+-------------+-------+------+---------------+------+---------+------+---------+-----------------+
  5. |  1 | SIMPLE      | gt    | ALL  | NULL          | NULL | NULL    | NULL | 1000000 | USING TEMPORARY |
  6. +----+-------------+-------+------+---------------+------+---------+------+---------+-----------------+
  7. 1 row IN SET (0.01 sec)

As you can see I'm using ORDER BY NULL clause as otherwise MySQL will sort the data after performing group by which is overhead we're not looking to measure. This is actually good trick to use for cases when you do not need GROUP BY results sorted - it can significantly improve performance in some cases.

First I run the query with default key_buffer_size which is not large enough to fit all key blocks from temporary table index, so we get a lot of key writes which kill performance.

The problem with writes is OS can delay writes only for fixed amount of time and when it has to perform them anyway which can cause IO bound load even with small data which can fully fit in OS cache as in this case. In perfect world it should not happen and we should have seen 50MB sequentially written once per certain amount of time which takes fraction of second to do, but it is not how it happens in practice.

PLAIN TEXT SQL:
  1. mysql> SHOW global STATUS LIKE "key%";
  2. +------------------------+---------+
  3. | Variable_name          | Value   |
  4. +------------------------+---------+
  5. | Key_blocks_not_flushed | 6516    |
  6. | Key_blocks_unused      | 0       |
  7. | Key_blocks_used        | 7248    |
  8. | Key_read_requests      | 2051451 |
  9. | Key_reads              | 198208  |
  10. | Key_write_requests     | 469220  |
  11. | Key_writes             | 185808  |
  12. +------------------------+---------+
  13. 7 rows IN SET (0.00 sec)

So how long does it take:

PLAIN TEXT SQL:
  1. mysql> SELECT count(*) cnt,c FROM gt GROUP BY c ORDER BY NULL LIMIT 10;
  2. +-----+------------------------------------------+
  3. | cnt | c                                        |
  4. +-----+------------------------------------------+
  5. |   1 | 80c9a87595687ccb33fa525d396ee75658aec777 |
  6. |   1 | 05bdcc2c1bb051cff861cee1ffcb7f680922b13e |
  7. |   1 | afdf7dbdf7a71dbeac553a0cccce4ab3b1ca3bdb |
  8. |   1 | a26f4ce41df9b79f8ce73f8a41c207c4b67f827d |
  9. |   1 | 5f912d13f2d415c2c7cb325e1ebbdf468b37bc91 |
  10. |   1 | dc960ee027a9d741ccd22ae0ee32d0c8915f158b |
  11. |   1 | 68d91f476ff40116c9c9ba0b85037496ededa1ef |
  12. |   2 | 9bc66fb6426222e3a94c43a64ed701cf197f8cb8 |
  13. |   1 | 9c2a671310c810e1f4a9aa4763ce5e76d28e3af4 |
  14. |   1 | 2e282a3aeb832b8b5861f703041acba5a4285a35 |
  15. +-----+------------------------------------------+
  16. 10 rows IN SET (9 min 5.40 sec)

I repeated the run a few times with very similar results.

OK so what if we set key buffer to 128M ?

PLAIN TEXT SQL:
  1. mysql> SHOW global STATUS LIKE "key%";
  2. +------------------------+---------+
  3. | Variable_name          | Value   |
  4. +------------------------+---------+
  5. | Key_blocks_not_flushed | 49361   |
  6. | Key_blocks_unused      | 61201   |
  7. | Key_blocks_used        | 49361   |
  8. | Key_read_requests      | 4007329 |
  9. | Key_reads              | 49361   |
  10. | Key_write_requests     | 873177  |
  11. | Key_writes             | 0       |
  12. +------------------------+---------+
  13. 7 rows IN SET (0.00 sec)

As you can see we have no writes because data fits in key buffer. Note however there is still significant amount of reads (the stats were flushed between query runs) which does not make sense as there no data to be read from the disk as nothing was written. So I filled bug on this.

So what was performance

PLAIN TEXT SQL:
  1. mysql> SELECT count(*) cnt,c FROM gt GROUP BY c ORDER BY NULL LIMIT 10;
  2. +-----+------------------------------------------+
  3. | cnt | c                                        |
  4. +-----+------------------------------------------+
  5. |   1 | 80c9a87595687ccb33fa525d396ee75658aec777 |
  6. |   1 | 05bdcc2c1bb051cff861cee1ffcb7f680922b13e |
  7. |   1 | afdf7dbdf7a71dbeac553a0cccce4ab3b1ca3bdb |
  8. |   1 | a26f4ce41df9b79f8ce73f8a41c207c4b67f827d |
  9. |   1 | 5f912d13f2d415c2c7cb325e1ebbdf468b37bc91 |
  10. |   1 | dc960ee027a9d741ccd22ae0ee32d0c8915f158b |
  11. |   1 | 68d91f476ff40116c9c9ba0b85037496ededa1ef |
  12. |   2 | 9bc66fb6426222e3a94c43a64ed701cf197f8cb8 |
  13. |   1 | 9c2a671310c810e1f4a9aa4763ce5e76d28e3af4 |
  14. |   1 | 2e282a3aeb832b8b5861f703041acba5a4285a35 |
  15. +-----+------------------------------------------+
  16. 10 rows IN SET (34.47 sec)

Wow this is over 15 times better so you surely need to make sure your key_buffer_size is set to accommodate needs of your temporary table. Do not forget about it especially if running only Innodb tables so you may think there is no use for it.

Looking at this number I found it to be too good to be true as I've seen significantly worse performance for large data sets. So I decided to see what happens if I create a larger table - 4M rows. I sized key_buffer to 256M so it would fit everything needed and there was still enough memory for OS cache to keep temporary table. Results:

PLAIN TEXT SQL:
  1. mysql> SELECT count(*) cnt,c FROM gtest GROUP BY c ORDER BY NULL LIMIT 10;
  2. +-----+------------------------------------------+
  3. | cnt | c                                        |
  4. +-----+------------------------------------------+
  5. |   2 | 80c9a87595687ccb33fa525d396ee75658aec777 |
  6. |   1 | 05bdcc2c1bb051cff861cee1ffcb7f680922b13e |
  7. |   1 | afdf7dbdf7a71dbeac553a0cccce4ab3b1ca3bdb |
  8. |   2 | a26f4ce41df9b79f8ce73f8a41c207c4b67f827d |
  9. |   1 | 5f912d13f2d415c2c7cb325e1ebbdf468b37bc91 |
  10. |   1 | dc960ee027a9d741ccd22ae0ee32d0c8915f158b |
  11. |   1 | 68d91f476ff40116c9c9ba0b85037496ededa1ef |
  12. |   2 | 9bc66fb6426222e3a94c43a64ed701cf197f8cb8 |
  13. |   1 | 9c2a671310c810e1f4a9aa4763ce5e76d28e3af4 |
  14. |   1 | 2e282a3aeb832b8b5861f703041acba5a4285a35 |
  15. +-----+------------------------------------------+
  16. 10 rows IN SET (27 min 5.51 sec)

VMSTAT:

PLAIN TEXT SQL:
  1. procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
  2.  r  b   swpd   free   buff  cache   si   so    bi    bo   IN    cs us sy id wa st
  3.  0  4    116  60064   9084 547160    0    0     5    44   96   103  2  0 94  3  0
  4.  0  3    116  56248   9104 549212    0    0     2  2023 1518   498  5  2 10 83  0
  5.  1  3    116  56316   9104 549228    0    0     0  2121 1539   497  0  1 24 75  0

In this case as you can see there is a lot of dirty blocks flushing going on with pretty random IO. Clearly OS write caching is not effective for relatively large area (which still fits in memory) which is getting a lot of random writes.

OK. Now lets move to testing MEMORY tables for the same queries:

PLAIN TEXT SQL:
  1. mysql> SET global key_buffer_size=8000000;
  2. Query OK, 0 rows affected (0.03 sec)
  3.  
  4. mysql> SET global max_heap_table_size=1000000000;
  5. Query OK, 0 rows affected (0.00 sec)
  6.  
  7. mysql> SET global tmp_table_size=1000000000;
  8. Query OK, 0 rows affected (0.00 sec)

Ahh... I alway forget GLOBAL only sets global value but it does not affect current session. A little gotcha which still catches me.

For 1M rows we have

PLAIN TEXT SQL:
  1. mysql> SET max_heap_table_size=1000000000;
  2. Query OK, 0 rows affected (0.00 sec)
  3.  
  4. mysql> SET tmp_table_size=1000000000;
  5. Query OK, 0 rows affected (0.00 sec)
  6.  
  7.  
  8. mysql> SELECT count(*) cnt,c FROM gt GROUP BY c ORDER BY NULL LIMIT 10;
  9. +-----+------------------------------------------+
  10. | cnt | c                                        |
  11. +-----+------------------------------------------+
  12. |   1 | 80c9a87595687ccb33fa525d396ee75658aec777 |
  13. |   1 | 05bdcc2c1bb051cff861cee1ffcb7f680922b13e |
  14. |   1 | afdf7dbdf7a71dbeac553a0cccce4ab3b1ca3bdb |
  15. |   1 | a26f4ce41df9b79f8ce73f8a41c207c4b67f827d |
  16. |   1 | 5f912d13f2d415c2c7cb325e1ebbdf468b37bc91 |
  17. |   1 | dc960ee027a9d741ccd22ae0ee32d0c8915f158b |
  18. |   1 | 68d91f476ff40116c9c9ba0b85037496ededa1ef |
  19. |   2 | 9bc66fb6426222e3a94c43a64ed701cf197f8cb8 |
  20. |   1 | 9c2a671310c810e1f4a9aa4763ce5e76d28e3af4 |
  21. |   1 | 2e282a3aeb832b8b5861f703041acba5a4285a35 |
  22. +-----+------------------------------------------+
  23. 10 rows IN SET ( 3.88 sec)

Wow. This is great difference even from our best MyISAM results.

For 4M of rows

PLAIN TEXT SQL:
  1. mysql> SELECT count(*) cnt,c FROM gtest  GROUP BY c ORDER BY NULL LIMIT 10;
  2. +-----+------------------------------------------+
  3. | cnt | c                                        |
  4. +-----+------------------------------------------+
  5. |   2 | 80c9a87595687ccb33fa525d396ee75658aec777 |
  6. |   1 | 05bdcc2c1bb051cff861cee1ffcb7f680922b13e |
  7. |   1 | afdf7dbdf7a71dbeac553a0cccce4ab3b1ca3bdb |
  8. |   2 | a26f4ce41df9b79f8ce73f8a41c207c4b67f827d |
  9. |   1 | 5f912d13f2d415c2c7cb325e1ebbdf468b37bc91 |
  10. |   1 | dc960ee027a9d741ccd22ae0ee32d0c8915f158b |
  11. |   1 | 68d91f476ff40116c9c9ba0b85037496ededa1ef |
  12. |   2 | 9bc66fb6426222e3a94c43a64ed701cf197f8cb8 |
  13. |   1 | 9c2a671310c810e1f4a9aa4763ce5e76d28e3af4 |
  14. |   1 | 2e282a3aeb832b8b5861f703041acba5a4285a35 |
  15. +-----+------------------------------------------+
  16. 10 rows IN SET ( 16.41 sec)

So MEMORY table scales pretty well with query execution time being close to linear dependence of table size, and it can get 100 times faster than MyISAM on disk table even in case temporary table is small enough to fit in OS cache and key_buffer.

My next intension was to test placing table on tmpfs as this should avoid write overhead we've observing (and which is indeed very good production practice, if your temporary tables are moderately sized).

Unfortunately I ran into another bug which means I will have to postpone this part of test to another time.

Summary:
If this was too many details for you to read through here is the summary:

  • key_buffer_size is important for MyISAM temporary tables performance to avoid OS writes
  • OS Write cache is not as efficient as I would hope, at least on Linux
  • Performance of Disk MyISAM tables may not scale proportionally with table size, even when all data fits in memory
  • MEMORY temporary tables can be 10-100 times faster than disk based MyISAM tables

Hopefully MySQL will implement support of dynamic rows for MEMORY tables some time in the future and so we will be able to use MEMORY table for all cases when data set fits in memory as it is really worth it.