How can I perform a SQL 'NOT IN' query faster?

夕颜 2021-01-01 18:35

I have a table (EMAIL) of email addresses:

EmailAddress
------------
jack@aol.com
jill@aol.com
tom@aol.com
bill@aol.com

and a table (BLACK

3 Answers
  • 2021-01-01 19:00
    select E.EmailAddress
      from EMAIL E where not exists
             (select EmailAddress from BLACKLIST B where B.EmailAddress = E.EmailAddress)
    

    This is roughly equivalent to the following EXCEPT query (by the way, the tables probably have an owner, written here as mail):

    select EmailAddress from mail.EMAIL 
    EXCEPT
    select EmailAddress from mail.BLACKLIST 
    

    EXCEPT gives you the rows of EMAIL that do not appear in BLACKLIST, and it still works when an EmailAddress is NULL, because EXCEPT treats two NULLs as the same value.
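
    To see where the two queries agree and where they differ, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows (including the duplicate and the NULLs) are assumptions added for illustration, the mail. owner prefix is dropped because the in-memory database has no such schema, and other engines may differ in details such as row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMAIL (EmailAddress TEXT);
    CREATE TABLE BLACKLIST (EmailAddress TEXT);
    -- Illustrative data: one duplicate and one NULL in EMAIL, one NULL in BLACKLIST.
    INSERT INTO EMAIL VALUES ('jack@aol.com'), ('jack@aol.com'), ('jill@aol.com'), (NULL);
    INSERT INTO BLACKLIST VALUES ('jill@aol.com'), (NULL);
""")

# EXCEPT compares whole rows, treats NULLs as equal, and removes duplicates.
except_rows = conn.execute("""
    SELECT EmailAddress FROM EMAIL
    EXCEPT
    SELECT EmailAddress FROM BLACKLIST
""").fetchall()

# NOT EXISTS keeps duplicates, and NULL = NULL never matches,
# so the NULL row of EMAIL is kept as well.
not_exists_rows = conn.execute("""
    SELECT E.EmailAddress
      FROM EMAIL E
     WHERE NOT EXISTS (SELECT 1 FROM BLACKLIST B
                        WHERE B.EmailAddress = E.EmailAddress)
     ORDER BY E.EmailAddress
""").fetchall()

print(except_rows)
print(not_exists_rows)
```

    With no NULLs and no duplicate addresses in EMAIL, both queries return the same set of rows.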

  • 2021-01-01 19:04

    NOT IN differs from NOT EXISTS if the blacklist allows NULL as an EmailAddress. If BLACKLIST contains even a single NULL, the NOT IN query returns zero rows, because a comparison against NULL is unknown, so the NOT IN predicate is never true for any value. The query plans therefore differ slightly, but I don't think there would be any serious performance impact.
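
    The NULL behaviour can be reproduced with a minimal sketch using Python's built-in sqlite3 module. The table contents are assumptions for illustration, with a NULL deliberately placed in BLACKLIST.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMAIL (EmailAddress TEXT);
    CREATE TABLE BLACKLIST (EmailAddress TEXT);
    INSERT INTO EMAIL VALUES ('jack@aol.com'), ('jill@aol.com'), ('tom@aol.com');
    -- Illustrative data: one real blacklisted address plus a NULL.
    INSERT INTO BLACKLIST VALUES ('jill@aol.com'), (NULL);
""")

# NOT IN: the NULL in the subquery makes the predicate unknown
# for every address that is not an exact match, so no row qualifies.
not_in = conn.execute("""
    SELECT EmailAddress
      FROM EMAIL
     WHERE EmailAddress NOT IN (SELECT EmailAddress FROM BLACKLIST)
     ORDER BY EmailAddress
""").fetchall()

# NOT EXISTS: the NULL row simply never matches, so it is ignored.
not_exists = conn.execute("""
    SELECT E.EmailAddress
      FROM EMAIL E
     WHERE NOT EXISTS (SELECT 1 FROM BLACKLIST B
                        WHERE B.EmailAddress = E.EmailAddress)
     ORDER BY E.EmailAddress
""").fetchall()

print(not_in)      # []
print(not_exists)  # [('jack@aol.com',), ('tom@aol.com',)]
```

    Declaring BLACKLIST.EmailAddress as NOT NULL, where the schema allows it, removes the difference between the two forms.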

    One suggestion is to create a new table called VALIDEMAIL and add triggers to BLACKLIST: one that removes an address from VALIDEMAIL when it is inserted into BLACKLIST, and one that adds it back to VALIDEMAIL when it is removed from BLACKLIST. Then replace EMAIL with a view that is the union of VALIDEMAIL and BLACKLIST, so the non-blacklisted addresses can be read directly from VALIDEMAIL.
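
    A minimal sketch of that design, assuming SQLite trigger syntax and run through Python's built-in sqlite3 module. The trigger names and the primary keys are hypothetical choices for the example, and trigger syntax differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VALIDEMAIL (EmailAddress TEXT PRIMARY KEY);
    CREATE TABLE BLACKLIST  (EmailAddress TEXT PRIMARY KEY);

    -- Blacklisting an address removes it from VALIDEMAIL.
    CREATE TRIGGER blacklist_after_insert AFTER INSERT ON BLACKLIST
    BEGIN
        DELETE FROM VALIDEMAIL WHERE EmailAddress = NEW.EmailAddress;
    END;

    -- Removing an address from BLACKLIST puts it back into VALIDEMAIL.
    CREATE TRIGGER blacklist_after_delete AFTER DELETE ON BLACKLIST
    BEGIN
        INSERT OR IGNORE INTO VALIDEMAIL VALUES (OLD.EmailAddress);
    END;

    -- EMAIL is now a view over both tables.
    CREATE VIEW EMAIL AS
        SELECT EmailAddress FROM VALIDEMAIL
        UNION
        SELECT EmailAddress FROM BLACKLIST;

    INSERT INTO VALIDEMAIL VALUES ('jack@aol.com'), ('jill@aol.com');
""")

def valid_emails():
    """Return the current contents of VALIDEMAIL, sorted."""
    rows = conn.execute("SELECT EmailAddress FROM VALIDEMAIL ORDER BY EmailAddress")
    return [r[0] for r in rows]

conn.execute("INSERT INTO BLACKLIST VALUES ('jill@aol.com')")
valid_after_block = valid_emails()

conn.execute("DELETE FROM BLACKLIST WHERE EmailAddress = 'jill@aol.com'")
valid_after_unblock = valid_emails()

print(valid_after_block)    # ['jack@aol.com']
print(valid_after_unblock)  # ['jack@aol.com', 'jill@aol.com']
```

    The trade-off is extra work on every BLACKLIST change in exchange for a plain table scan when reading the valid addresses.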

  • 2021-01-01 19:12

    You can use a left outer join, or a not exists clause.

    Left outer join:

    select E.EmailAddress
      from EMAIL E left outer join BLACKLIST B on (E.EmailAddress = B.EmailAddress)
     where B.EmailAddress is null;
    

    Not Exists:

    select E.EmailAddress
      from EMAIL E where not exists
             (select EmailAddress from BLACKLIST B where B.EmailAddress = E.EmailAddress)
    

    Both are quite generic SQL solutions (they don't depend on a specific DB engine). I would expect the latter to be slightly more performant, though not by much, and both should be clearly more performant than the NOT IN version.

    As commenters stated, you can also try creating an index on BLACKLIST(EmailAddress). That should help speed up the execution of your query.
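
    A minimal sketch of the index plus both queries, run through Python's built-in sqlite3 module. The index name idx_blacklist_email and the sample rows are assumptions for the example. The query-plan output is engine specific, so treat it as a way to check whether the index is used rather than as a performance measurement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMAIL (EmailAddress TEXT);
    CREATE TABLE BLACKLIST (EmailAddress TEXT);
    INSERT INTO EMAIL VALUES ('jack@aol.com'), ('jill@aol.com'), ('tom@aol.com');
    INSERT INTO BLACKLIST VALUES ('jill@aol.com');
    -- Hypothetical index name; the indexed column is the one from the answer.
    CREATE INDEX idx_blacklist_email ON BLACKLIST (EmailAddress);
""")

left_join_sql = """
    SELECT E.EmailAddress
      FROM EMAIL E LEFT OUTER JOIN BLACKLIST B
        ON E.EmailAddress = B.EmailAddress
     WHERE B.EmailAddress IS NULL
"""

not_exists_sql = """
    SELECT E.EmailAddress
      FROM EMAIL E
     WHERE NOT EXISTS (SELECT 1 FROM BLACKLIST B
                        WHERE B.EmailAddress = E.EmailAddress)
"""

left_join_rows = sorted(conn.execute(left_join_sql).fetchall())
not_exists_rows = sorted(conn.execute(not_exists_sql).fetchall())
print(left_join_rows == not_exists_rows)

# Inspect how SQLite plans each query (output format varies by version).
for sql in (left_join_sql, not_exists_sql):
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print(step)
```

    On a real database, compare the plans and timings of the NOT IN, LEFT JOIN and NOT EXISTS versions with your own data volumes before choosing one.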
