You are here

MySQL 6.0 vs 5.1 in TPC-H queries

MySQL Performance Blog - Tue, 25/03/2008 - 9:34pm

Last week I played with queries from TPC-H benchmarks, particularly comparing MySQL 6.0.4-alpha with 5.1. MySQL 6.0 is interesting here, as there is a lot of new changes in optimizer, which should affect execution plan of TPC-H queries. In reality only two queries (from 22) have significantly better execution time (about them in next posts), but I want to write about is queries that execute slower in new MySQL 6.0 version.

Query is pretty simple

PLAIN TEXT SQL:
  1. SELECT sum(l_extendedprice * l_discount) AS revenue
  2. FROM lineitem WHERE l_shipdate>= date '1995-01-01'
  3. AND l_shipdate <date '1995-01-01' + interval '1' year
  4. AND l_discount BETWEEN 0.09 - 0.01 AND 0.09 + 0.01
  5. AND l_quantity <24;

with execution plain (in 5.1)

PLAIN TEXT SQL:
  1. +----+-------------+----------+-------+---------------+---------------+---------+------+---------+-------------+
  2. | id | select_type | TABLE    | type  | possible_keys | KEY           | key_len | ref  | rows    | Extra       |
  3. +----+-------------+----------+-------+---------------+---------------+---------+------+---------+-------------+
  4. |  1 | SIMPLE      | lineitem | range | li_shp_dt_idx | li_shp_dt_idx | 3       | NULL | 8272067 | USING WHERE |
  5. +----+-------------+----------+-------+---------------+---------------+---------+------+---------+-------------+

against table

PLAIN TEXT SQL:
  1. CREATE TABLE `lineitem` (
  2.   `l_shipdate` date NOT NULL,
  3.   `l_orderkey` int(11) NOT NULL,
  4.   `l_partkey` int(11) NOT NULL,
  5.   `l_suppkey` int(11) NOT NULL,
  6.   `l_linenumber` int(11) NOT NULL,
  7.   `l_quantity` decimal(15,2) NOT NULL,
  8.   `l_extendedprice` decimal(15,2) NOT NULL,
  9.   `l_discount` decimal(15,2) NOT NULL,
  10.   `l_tax` decimal(15,2) NOT NULL,
  11.   `l_returnflag` char(1) NOT NULL,
  12.   `l_linestatus` char(1) NOT NULL,
  13.   `l_commitdate` date NOT NULL,
  14.   `l_receiptdate` date NOT NULL,
  15.   `l_shipinstruct` char(25) NOT NULL,
  16.   `l_comment` varchar(44) NOT NULL,
  17.   `l_shipmode` char(10) NOT NULL,
  18.   PRIMARY KEY (`l_orderkey`,`l_linenumber`),
  19.   KEY `lineitem_fk2` (`l_suppkey`),
  20.   KEY `lineitem_fk3` (`l_partkey`,`l_suppkey`),
  21.   KEY `li_shp_dt_idx` (`l_shipdate`),
  22.   KEY `li_com_dt_idx` (`l_commitdate`),
  23.   KEY `li_rcpt_dt_idx` (`l_receiptdate`)
  24. ) ENGINE=MyISAM DEFAULT CHARSET=latin1
  25.  
  26. WITH 59986052 rows

So this query takes 32.27 sec to execute in 5.1, and 39.39 sec in 6.0 (after warpm-ups and fully cached in key_cache)

What is difference in 6.0? Here is execution plain for 6.0:

PLAIN TEXT SQL:
  1. +----+-------------+----------+-------+---------------+---------------+---------+------+---------+-----------------------------------------------+
  2. | id | select_type | TABLE    | type  | possible_keys | KEY           | key_len | ref  | rows    | Extra                                         |
  3. +----+-------------+----------+-------+---------------+---------------+---------+------+---------+-----------------------------------------------+
  4. |  1 | SIMPLE      | lineitem | range | li_shp_dt_idx | li_shp_dt_idx | 3       | NULL | 8272067 | USING INDEX condition; USING WHERE; USING MRR |
  5. +----+-------------+----------+-------+---------------+---------------+---------+------+---------+-----------------------------------------------+

See there are new Extra notes
USING INDEX condition; USING MRR.

It would be interesting to know what is 'Using Index condition', what I was able to find is Baron's bug with no mention in MySQL 6.0 manual. No mentions about Using MRR also, but I guess it is Multi-Range-Read feature which allows to read data not row-by-row but in batches reads.

So let us see how perform 5.1 and 6.0 in cold mode (I dropped OS caches by `echo 1 > /proc/sys/vm/drop_caches` and restarting mysqld)

5.1 - time 1 hour 14 min 36.90 sec
6.0 - time 1 hour 6 min 2.21 sec

So difference is not so big we could speak about big improvements with using MRR here.

What is interesting is iostat -dx 5 output during query execution.

For 5.1 I had:

PLAIN TEXT SQL:
  1. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  2. sda          0.00   2.06 310.70  0.82 2493.83   23.05  1246.91    11.52     8.08     1.02    3.28   3.28 102.04
  3. dm-0         0.00   0.00 310.70  0.62 2493.83    4.94  1246.91     2.47     8.03     1.02    3.29   3.28 102.18
  4.  
  5. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  6. sda          0.00   0.21 301.23  0.41 2441.07    4.93  1220.53     2.46     8.11     1.02    3.39   3.39 102.14
  7. dm-0         0.00   0.00 301.23  0.62 2441.07    4.93  1220.53     2.46     8.10     1.02    3.39   3.39 102.18
  8.  
  9. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  10. sda          0.00   0.41 317.83  0.82 2570.49    9.84  1285.25     4.92     8.10     1.02    3.20   3.20 102.01
  11. dm-0         0.00   0.00 317.83  0.61 2570.49    4.92  1285.25     2.46     8.09     1.02    3.21   3.21 102.11

almost all time. As you see 1285 rkB/s is pretty small for RAID 10 system we have on this server. This is because of Random I/O we had during execution of query, as MySQL used index to scan rows and then did access to data to read other columns.

But for 6 picture is a bit different. At start I had:

PLAIN TEXT SQL:
  1. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  2. sda          0.00   0.21 1052.67  0.41 8455.97    4.94  4227.98     2.47     8.03     1.01    0.96   0.96 101.44
  3. dm-0         0.00   0.00 1052.67  0.62 8455.97    4.94  4227.98     2.47     8.03     1.02    0.96   0.96 101.50
  4.  
  5. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  6. sda          0.00   0.21 1091.94  3.10 8763.64   26.45  4381.82    13.22     8.03     1.02    0.94   0.93 102.13
  7. dm-0         0.00   0.00 1091.53  0.62 8760.33    4.96  4380.17     2.48     8.03     1.02    0.94   0.94 102.25
  8.  
  9. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  10. sda          0.00   0.62 944.97  0.82 7582.75   11.50  3791.38     5.75     8.03     1.01    1.07   1.07 101.40
  11. dm-0         0.00   0.00 944.97  0.62 7582.75    4.93  3791.38     2.46     8.02     1.01    1.07   1.07 101.40

and then rkB/s was continuously dropped to

PLAIN TEXT SQL:
  1. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  2. sda          0.00   0.00 341.80  0.00 2739.34    0.00  1369.67     0.00     8.01     1.02    2.97   2.97 101.56
  3. dm-0         0.00   0.00 341.80  0.00 2739.34    0.00  1369.67     0.00     8.01     1.02    2.97   2.97 101.56
  4.  
  5. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  6. sda          0.00   0.61 309.63  0.82 2488.52   11.48  1244.26     5.74     8.05     1.02    3.28   3.28 101.84
  7. dm-0         0.00   0.00 309.63  0.61 2488.52    4.92  1244.26     2.46     8.04     1.02    3.28   3.28 101.78
  8.  
  9. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  10. sda          0.00   0.21 318.76  0.41 2558.35    4.95  1279.18     2.47     8.03     1.02    3.21   3.21 102.43
  11. dm-0         0.00   0.00 318.76  0.62 2558.35    4.95  1279.18     2.47     8.03     1.02    3.21   3.21 102.49

at the end of query execution. I guess we had 4300 rkB/s because of MRR, but I have no idea why this number was dropped to 1200 finally. Probably we need to check how really MRR works.

What else worth to mention here is that MySQL is wrong choosing index scan here. If we try to execute query with IGNORE KEY (li_shp_dt_idx) (full table scan will be performed in this case) then we got:
1 min 39.14 sec to execute query (compare with 1 hour 14 min 36.90 sec).

and iostat here is:

PLAIN TEXT SQL:
  1. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  2. sda         16.22   1.23 534.29  1.85 132547.02   24.64 66273.51    12.32   247.27     0.30    0.55   0.55  29.67
  3. dm-0         0.00   0.00 550.10  0.62 132545.38    4.93 66272.69     2.46   240.69     0.30    0.54   0.53  29.30
  4.  
  5. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  6. sda         15.78   0.20 524.39  0.41 130224.59    4.92 65112.30     2.46   248.15     0.30    0.56   0.56  29.61
  7. dm-0         0.00   0.00 540.37  0.61 130224.59    4.92 65112.30     2.46   240.73     0.30    0.56   0.55  29.49
  8.  
  9. Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
  10. sda         16.02   0.20 526.77  1.01 130724.54    9.74 65362.27     4.87   247.70     0.27    0.52   0.52  27.22
  11. dm-0         0.00   0.00 542.60  0.61 130724.54    4.87 65362.27     2.43   240.66     0.28    0.51   0.50  27.08

As you see we have sequential I/O with 65362 rkB/s (compare 1200 rkB/s with random I/O), that allows to execute query ~40 times faster.

Entry posted by Vadim | One comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks