Suppose you have two table, one of which is Big, e.g tableA has +150.000 rows, tableB has 4.000 rows. You want to efficiently find which rows share some identifier. Which way should you query for it?
explain select tableA.id from tableA left join tableB on tableA.id = tableB.id; +------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+ | 1 | SIMPLE | tableA | index | NULL | PRIMARY | 8 | NULL | 154867659 | Using index | +------+-------------+-------------+-------+---------------+---------+---------+------+-----------+-------------+
create temporary table foo select tableA.id from tableA left join tableB on tableA.id = tableB.id;
Way A failed, it was so slow that I didn't have the time to let it finish.
Turns out that only Way B is effficient.
explain select tableA.id from tableB left join tableA on tableA.id = tableB.id; +------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+ | 1 | SIMPLE | tableB | index | NULL | preventdupe | 9 | NULL | 4233 | Using index | | 1 | SIMPLE | tableA | eq_ref | PRIMARY | PRIMARY | 8 | <database>.tableB.id | 1 | Using where; Using index | +------+-------------+-------------+--------+---------------+-------------+---------+------------------------+------+--------------------------+
create temporary table foo select tableB.id from tableB left join tableB on tableA.id = tableB.id; Query OK, 4233 rows affected (0.18 sec) Records: 4233 Duplicates: 0 Warnings: 0