Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?
Solution 1
BETWEEN should outperform IN in this case (but do measure and check execution plans, too!), especially as n grows and as long as statistics are still accurate. Let's assume:
- m is the size of your table
- n is the size of your range
Index can be used (n is tiny compared to m)
In theory:
- BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, which then traverses at most n index leaf nodes. The complexity will be O(n + log m).
- IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity will always be O(n * log m), which is always worse (negligible for very small tables m or very small ranges n).
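To make the two complexity estimates concrete, here is a minimal Python sketch. It is a toy model, not how MySQL or Oracle actually work: the "index" is just a sorted list, a "range scan" is one binary search followed by a walk over neighbouring entries, and the function names (binary_search, between_lookup, in_lookup) are made up for illustration. It counts key comparisons so the O(n + log m) and O(n * log m) behaviour can be observed.

```python
def binary_search(keys, target, counter):
    """Return the leftmost position where target could sit in sorted keys.

    counter is a one-element list used to count key comparisons.
    """
    lo, hi = 0, len(keys)
    while lo < hi:
        mid = (lo + hi) // 2
        counter[0] += 1
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def between_lookup(keys, low, high):
    """Model of BETWEEN: one descent into the index, then walk the leaves."""
    counter = [0]
    pos = binary_search(keys, low, counter)
    rows = []
    while pos < len(keys) and keys[pos] <= high:
        counter[0] += 1  # one step per leaf entry visited
        rows.append(keys[pos])
        pos += 1
    return rows, counter[0]


def in_lookup(keys, values):
    """Model of IN: one separate descent into the index per list element."""
    counter = [0]
    rows = []
    for value in values:
        pos = binary_search(keys, value, counter)
        if pos < len(keys) and keys[pos] == value:
            rows.append(keys[pos])
    return rows, counter[0]


keys = list(range(1, 1_000_001))   # m = 1,000,000 primary key values, no holes
wanted = list(range(5000, 5100))   # n = 100 consecutive ids

between_rows, between_steps = between_lookup(keys, wanted[0], wanted[-1])
in_rows, in_steps = in_lookup(keys, wanted)

print("BETWEEN steps:", between_steps)
print("IN steps:     ", in_steps)
```

Both lookups return the same rows, but in this model the IN variant pays for a full descent per element. A real optimizer may do something smarter, so treat this only as an illustration of the reasoning, and still check the execution plan.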
Index cannot be used (n is a significant portion of m)
In any case, you'll get a full table scan and evaluate the predicate on each row:
- BETWEEN needs to evaluate two predicates: one for the lower and one for the upper bound. The complexity is O(m).
- IN needs to evaluate at most n predicates. The complexity is O(m * n), which is again always worse, or perhaps O(m) if the database can optimise the IN list to be a hashmap, rather than a list of predicates.
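The full-scan case can be sketched the same way. The Python below is again only a model under simple assumptions: the "table" is a plain list, every row is visited once, and the three function names are hypothetical. It counts predicate evaluations for BETWEEN, for IN evaluated as a list of equality checks, and for IN evaluated through a hash set.

```python
def scan_between(rows, low, high):
    """Full scan with two bound checks per row: O(m)."""
    evaluations = 0
    matches = []
    for value in rows:
        evaluations += 2  # lower and upper bound (no short-circuit, for simplicity)
        if low <= value <= high:
            matches.append(value)
    return matches, evaluations


def scan_in_list(rows, candidates):
    """Full scan with up to n equality checks per row: O(m * n)."""
    evaluations = 0
    matches = []
    for value in rows:
        for candidate in candidates:
            evaluations += 1
            if value == candidate:
                matches.append(value)
                break
    return matches, evaluations


def scan_in_hashed(rows, candidates):
    """Full scan with one hash probe per row: O(m) on average."""
    lookup = set(candidates)
    evaluations = 0
    matches = []
    for value in rows:
        evaluations += 1
        if value in lookup:
            matches.append(value)
    return matches, evaluations


rows = list(range(1, 2001))           # m = 2,000 rows
candidates = list(range(500, 1500))   # n = 1,000, a large share of m

between_matches, between_evals = scan_between(rows, 500, 1499)
list_matches, list_evals = scan_in_list(rows, candidates)
hashed_matches, hashed_evals = scan_in_hashed(rows, candidates)

print("BETWEEN:    ", between_evals)
print("IN (list):  ", list_evals)
print("IN (hashed):", hashed_evals)
```

Whether a given database actually hashes or otherwise preprocesses the IN list is an implementation detail that this sketch does not answer.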
Solution 2
a between b and c is a macro that expands to b <= a and a <= c.
a in (b,c,d) is a macro that expands to a=b or a=c or a=d.
Assuming your n and nk are integers, both should end up meaning the same. The between variant should be much faster because it's only two compares, versus nk - n + 1 compares for the in variant.
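The equivalence of the four forms is easy to check on a small table. The sketch below uses Python's built-in sqlite3 module rather than MySQL, purely so that it runs without a server. The table name theTable and the id column come from the question; the payload column and the fetch_ids helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theTable (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO theTable (id, payload) VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 101)],
)

n, nk = 10, 20
ids = list(range(n, nk + 1))  # nk - n + 1 values


def fetch_ids(sql, params):
    """Run a query and return the id column as a list."""
    return [row[0] for row in conn.execute(sql, params)]


# BETWEEN and its two-comparison expansion
between_ids = fetch_ids(
    "SELECT id FROM theTable WHERE id BETWEEN ? AND ? ORDER BY id", (n, nk)
)
expanded_between_ids = fetch_ids(
    "SELECT id FROM theTable WHERE ? <= id AND id <= ? ORDER BY id", (n, nk)
)

# IN and its chain-of-equalities expansion
placeholders = ", ".join("?" for _ in ids)
in_ids = fetch_ids(
    f"SELECT id FROM theTable WHERE id IN ({placeholders}) ORDER BY id", ids
)
or_chain = " OR ".join("id = ?" for _ in ids)
expanded_in_ids = fetch_ids(
    f"SELECT id FROM theTable WHERE {or_chain} ORDER BY id", ids
)

print(between_ids == expanded_between_ids == in_ids == expanded_in_ids)
```

This only shows that the results agree for integer keys. It says nothing about how any particular engine rewrites or executes the predicates internally.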
Solution 3
I did some research on this question. I have 11M rows in my table, and I executed two queries against it:
Query 1: SELECT * FROM PLAYERS WHERE SCORE BETWEEN 10 AND 20
Query 2: SELECT * FROM PLAYERS WHERE SCORE IN (10,11,...,20)
At execution time, both queries are translated as Andomar described above.
Of the two, Query 1 runs faster than Query 2.
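If you want to repeat this kind of comparison yourself, a small timing harness is enough. The following Python sketch is an assumption-laden stand-in for the test above: it uses an in-memory SQLite database instead of MySQL, 200,000 random rows instead of 11M, and an index name (IDX_PLAYERS_SCORE) and helper (timed) invented for the example. The numbers it prints will not carry over to MySQL; only the method does.

```python
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PLAYERS (ID INTEGER PRIMARY KEY, SCORE INTEGER)")

rng = random.Random(42)  # fixed seed so runs are repeatable
conn.executemany(
    "INSERT INTO PLAYERS (SCORE) VALUES (?)",
    ((rng.randint(0, 100),) for _ in range(200_000)),
)
conn.execute("CREATE INDEX IDX_PLAYERS_SCORE ON PLAYERS (SCORE)")


def timed(sql, repeats=5):
    """Run the query several times; return its rows and the best wall-clock time."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return rows, best


in_list = ", ".join(str(score) for score in range(10, 21))

rows_between, t_between = timed(
    "SELECT * FROM PLAYERS WHERE SCORE BETWEEN 10 AND 20"
)
rows_in, t_in = timed(f"SELECT * FROM PLAYERS WHERE SCORE IN ({in_list})")

print(f"BETWEEN: {len(rows_between)} rows, best of 5: {t_between:.4f}s")
print(f"IN:      {len(rows_in)} rows, best of 5: {t_in:.4f}s")
```

For a real answer, run the same two queries against your own MySQL table and compare them with EXPLAIN as well as wall-clock time.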
To know more, follow this link: Performance of BETWEEN VS IN() in MySQL
Thank you.
pr1001
Updated on July 09, 2022
Comments
-
pr1001 almost 2 years
I have a set of consecutive rows I want to get based upon their primary key, which is an auto-incrementing integer. Assuming that there are no holes, is there any performance difference between:
SELECT * FROM `theTable` WHERE `id` IN (n, ... nk);
and:
SELECT * FROM `theTable` WHERE `id` BETWEEN n AND nk;
-
Erick Robertson almost 14 years
The shorter string of the BETWEEN clause also parses more quickly.
-
pr1001 almost 14 years
Great, thanks. I'd give you the Answer right now but SO says I need to wait 7 minutes.
-
Code Commander over 11 years
@LukasEder is right. Depending on your indexes, IN can turn out to be much faster. The best way to know is to benchmark both options in your particular case.
-
Andomar over 11 years
I would expect a range scan to be better than a unique scan for scanning ranges. Otherwise, why would Oracle implement the range scan at all?
-
eci over 11 years
Actually, a in (b,c,d) is a macro which expands to a = any (b,c,d) (see SQL-92 standard).
-
John almost 4 years
On MySQL, IN can have serious performance impacts in comparison to BETWEEN. I've seen stalls of many seconds where IN contained a few thousand numbers. The same query with BETWEEN takes a few milliseconds. So when you have the choice: always use BETWEEN.
-
Lukas Eder almost 4 years
@John: Never say "always". If your IN list has 1-2 elements, I somewhat doubt that your claim is correct.
-
John almost 4 years
@LukasEder BETWEEN will be as fast as IN when using 2+ elements, so you lose nothing when using it. And when using 1 element you have no reason to use either of them. The "always" stays valid no matter if it's 2 elements or 2 million; the more elements you target, the more efficient it will be. There is no reason whatsoever to use IN when BETWEEN can achieve the same; both syntax variants have their purpose. When using IN as a replacement for BETWEEN, you abuse its purpose.
-
Aaron Francis over 2 years
I believe this is a direct quote from "High Performance MySQL: Optimization, Backups, and Replication", is it not? If so, it should be noted as such!