Purpose of using of IN() and Exist() is almost the same. Both MySQL functions are generally used with sub queries. To understand it better lets consider the following schema.
Table – users
Table – user_purchase
Suppose we need to find users who purchase something.
Select * from users where userid IN (select user_id from us_purchase group by user_id);
Select * FROM users as u where exists (select 1 from user_purchase as up where up.user_id = u.user_id);
Both query will return same results. So what is the difference and when to use whom. Lets compare it on basis of performance. Suppose we have 10000 of users and 10000 rows in user_purchase table. Whenever we use sub query we think that first sub query will be executed and saved as temporary table and this temporary table will be used to compare each rows. But mysql dosen’t work like that. In reality sub query will run for each row of outer query.
So have a look on our query with IN() . Our user_purchase has 10000 rows. So sub query will process 10000 rows for each user row. So total 10000*10000 rows will be processed to get result.
Now have a look on query with EXISTS(). Suppose each user has max 1 purchase. So our sub query will process one row per user row. So this query will process total 10000 to 20000 rows.
So it is clear that Exists() will return results faster than IN().
Consider the following cases –
If user_purchase table has index on user_id then execution time of exists() query is less because sub query will use this index. But with IN() sub query is not using this index so execution time is same.
If user_purchase table is empty. Our sub query will return null. The Exists keyword evaluates true or false, but IN() keyword compare all value in the corresponding sub query column. In our case IN() query will give syntax error. But Exist() give empty results.
Exists() is more faster then IN(). Always use Exists() whenever sub query result is large or null.