Efficient outer join in mysql

Suppose you have two table, one of which is Big, e.g tableA has +150.000 rows, tableB has 4.000 rows. You want to efficiently find which rows share some identifier. Which way should you query for it?

Way A, big first

explain select tableA.id from tableA left join tableB on tableA.id = tableB.id;
+------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+
| id   | select_type | table       | type  | possible_keys | key     | key_len | ref  | rows      | Extra       |
+------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+
|    1 | SIMPLE      | tableA      | index | NULL          | PRIMARY | 8       | NULL | 154867659 | Using index |
+------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+
create temporary table foo select tableA.id from tableA left join tableB on tableA.id = tableB.id;

Way A failed, it was so slow that I didn't have the time to let it finish.

Way B, small first

Turns out that only Way B is effficient.

explain select tableA.id from tableB left join tableA on tableA.id = tableB.id;
+------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+
| id   | select_type | table       | type   | possible_keys | key         | key_len | ref                    | rows | Extra                    |
+------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+
|    1 | SIMPLE      | tableB      | index  | NULL          | preventdupe | 9       | NULL                   | 4233 | Using index              |
|    1 | SIMPLE      | tableA      | eq_ref | PRIMARY       | PRIMARY     | 8       | <database>.tableB.id   |    1 | Using where; Using index |
+------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+
create temporary table foo select tableB.id from tableB left join tableB on tableA.id = tableB.id;
Query OK, 4233 rows affected (0.18 sec)
Records: 4233  Duplicates: 0  Warnings: 0

comments powered by Disqus


Back to the index

Blog roll

R-bloggers, Debian Weekly
Valid XHTML 1.0 Strict [Valid RSS] Valid CSS! Emacs Muse Last modified: juli 31, 2020