You are here

MySQL VIEW as performance troublemaker

MySQL Performance Blog - Sun, 12/08/2007 - 6:17pm

I start to see applications being built utilizing VIEWs functionality which appeared in MySQL 5.0 and quite frequently VIEWs are used to help in writing the queries - to keep queries simple without really thinking how it affects server performance.

Even worse than that - looking at the short table which just gets single row from the table by the key we think this is simple query, while can be real monster instead with complexity hidden away in VIEW definition.

Just another day I worked on optimizing application which uses VIEWs and was looking at the long running query which just joined 2 tables... I ran EXPLAIN for it and got 200 of rows in the result set just for explain due to several layers of cascaded views built on top of one another so it is easy to write the queries, some of them it turn used subqueries subselects and derived tables.

It is also very dangerous if you assume MySQL would optimize your VIEWs same way as more advanced database systems would. Same as with subqueries and derived tables MySQL 5.0 will fail and perform very inefficiently in many counts.

MySQL has two ways of handling the VIEWS - query merge, in which case VIEW is simply expanded as a macro or Temporary Table in which case VIEW is materialized to temporary tables (without indexes !) which is later used further in query execution.
There does not seems to be any optimizations applied to the query used for temporary table creation from the outer query and plus if you use more then one Temporary Tables views which you join together you may have serious issues because such tables do not get any indexes.

Let me now show couple of examples.

Assume we have the comments table which holds users comments to the blog, naturally containing user_id which left comment, comment_id and comment text:

PLAIN TEXT SQL:
  1. CREATE TABLE `comments` (
  2.   `user_id` int(10) UNSIGNED NOT NULL,
  3.   `comment_id` int(10) UNSIGNED NOT NULL,
  4.   `message` text NOT NULL,
  5.   PRIMARY KEY (`user_id`,`comment_id`)
  6. ) ENGINE=MyISAM DEFAULT CHARSET=latin1

So how would you get number of comments left by the given user ?

PLAIN TEXT SQL:
  1. mysql> SELECT count(*) FROM comments WHERE user_id=5;
  2. +----------+
  3. | count(*) |
  4. +----------+
  5. |     1818 |
  6. +----------+
  7. 1 row IN SET (0.00 sec)

So how would we solve the same problem having things more modular and using MySQL VIEWs ?

PLAIN TEXT SQL:
  1. mysql> CREATE VIEW user_counts AS SELECT user_id,count(*) cnt FROM comments GROUP BY user_id;
  2. Query OK, 0 rows affected (0.00 sec)
  3.  
  4. mysql> SELECT * FROM user_counts WHERE user_id=5;
  5. +---------+------+
  6. | user_id | cnt  |
  7. +---------+------+
  8. |       5 | 1818 |
  9. +---------+------+
  10. 1 row IN SET (0.95 sec)

So we create the view which gives us back counts for each user and can simply query from that table restricting by user_id.
If this would be handled properly inside MySQL there would be even good reason to do that - so later you can change your application and convert user_count to summary table avoid changing any queries directly. Unfortunately it does not work.

It is interesting to see EXPLAIN for such query and time for the query which fetches everything from the VIEW - it is almost the same as getting only one row, and note even EXPLAIN takes same amount of time:

PLAIN TEXT SQL:
  1. mysql> EXPLAIN SELECT * FROM user_counts WHERE user_id=5 \G
  2. *************************** 1. row ***************************
  3.            id: 1
  4.   select_type: PRIMARY
  5.         TABLE: <derived2>
  6.          type: ALL
  7. possible_keys: NULL
  8.           KEY: NULL
  9.       key_len: NULL
  10.           ref: NULL
  11.          rows: 1001
  12.         Extra: USING WHERE
  13. *************************** 2. row ***************************
  14.            id: 2
  15.   select_type: DERIVED
  16.         TABLE: comments
  17.          type: INDEX
  18. possible_keys: NULL
  19.           KEY: PRIMARY
  20.       key_len: 8
  21.           ref: NULL
  22.          rows: 1792695
  23.         Extra: USING INDEX
  24. 2 rows IN SET (0.96 sec)
  25.  
  26.  
  27. mysql> SELECT * FROM user_counts;
  28. +---------+------+
  29. | user_id | cnt  |
  30. +---------+------+
  31. |       0 |  850 |
  32. |       1 | 1790 |
  33. |       2 | 1777 |
  34. |       3 | 1762 |
  35. |       4 | 1784 |
  36. ....
  37.  
  38. |     999 | 1808 |
  39. |    1000 |  898 |
  40. +---------+------+
  41. 1001 rows IN SET (0.96 sec)

So now lets create a very artificial query which will JOIN 2 views just to see how indexes are used:

PLAIN TEXT SQL:
  1. mysql> EXPLAIN SELECT uc.cnt+uc2.cnt FROM user_counts uc, user_counts uc2 WHERE uc.user_id=uc2.user_id AND uc.user_id=5 \G
  2. *************************** 1. row ***************************
  3.            id: 1
  4.   select_type: PRIMARY
  5.         TABLE: <derived2>
  6.          type: ALL
  7. possible_keys: NULL
  8.           KEY: NULL
  9.       key_len: NULL
  10.           ref: NULL
  11.          rows: 1001
  12.         Extra: USING WHERE; USING JOIN cache
  13. *************************** 2. row ***************************
  14.            id: 1
  15.   select_type: PRIMARY
  16.         TABLE: <derived3>
  17.          type: ALL
  18. possible_keys: NULL
  19.           KEY: NULL
  20.       key_len: NULL
  21.           ref: NULL
  22.          rows: 1001
  23.         Extra: USING WHERE
  24. *************************** 3. row ***************************
  25.            id: 3
  26.   select_type: DERIVED
  27.         TABLE: comments
  28.          type: INDEX
  29. possible_keys: NULL
  30.           KEY: PRIMARY
  31.       key_len: 8
  32.           ref: NULL
  33.          rows: 1792695
  34.         Extra: USING INDEX
  35. *************************** 4. row ***************************
  36.            id: 2
  37.   select_type: DERIVED
  38.         TABLE: comments
  39.          type: INDEX
  40. possible_keys: NULL
  41.           KEY: PRIMARY
  42.       key_len: 8
  43.           ref: NULL
  44.          rows: 1792695
  45.         Extra: USING INDEX
  46. 4 rows IN SET (1.91 sec)

As you can see we get 2 derived tables in which case which are fully populated and "full join" used to to join between them.
In this particular case it is not that bad because "join cache" is used to perform it relatively efficient, however for large derived tables it will become nightmare.

So be very careful implementing MySQL VIEWs in your application, especially ones which require temporary table execution method. VIEWs can be used with very small performance overhead but only in case they are used with caution.

MySQL has long way to go getting queries with VIEWs properly optimized.