Efficient outer join in mysql

Suppose you have two table, one of which is Big, e.g tableA has +150.000 rows, tableB has 4.000 rows. You want to efficiently find which rows share some identifier. Which way should you query for it?

Way A, big first

explain select tableA.id from tableA left join tableB on tableA.id = tableB.id;
+------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+
| id   | select_type | table       | type  | possible_keys | key     | key_len | ref  | rows      | Extra       |
+------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+
|    1 | SIMPLE      | tableA      | index | NULL          | PRIMARY | 8       | NULL | 154867659 | Using index |
+------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+

create temporary table foo select tableA.id from tableA left join tableB on tableA.id = tableB.id;

Way A failed, it was so slow that I didn't have the time to let it finish.

Way B, small first

Turns out that only Way B is effficient.

explain select tableA.id from tableB left join tableA on tableA.id = tableB.id;
+------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+
| id   | select_type | table       | type   | possible_keys | key         | key_len | ref                    | rows | Extra                    |
+------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+
|    1 | SIMPLE      | tableB      | index  | NULL          | preventdupe | 9       | NULL                   | 4233 | Using index              |
|    1 | SIMPLE      | tableA      | eq_ref | PRIMARY       | PRIMARY     | 8       | <database>.tableB.id   |    1 | Using where; Using index |
+------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+

create temporary table foo select tableB.id from tableB left join tableB on tableA.id = tableB.id;
Query OK, 4233 rows affected (0.18 sec)
Records: 4233  Duplicates: 0  Warnings: 0

comments powered by Disqus

Back to the index

Blog roll

R-bloggers, Debian Weekly

Last modified: juli 31, 2020