SQL Sub-query or INNER-JOIN?

后端 未结 4 930
执笔经年
执笔经年 2021-01-13 05:38

I\'ve the two following queries:

declare @UserId as int
set @UserId = 1

-- Query #1: Sub-query
SELECT
    u.[Id] ,
    u.[Name] ,
    u.[OrgId] AS Organizat         


        
相关标签:
4条回答
  • 2021-01-13 06:19

    join is faster than subquery.

    subquery makes for busy disk access, think of hard disk's read-write needle(head?) that goes back and forth when it access: User, SearchExpression, PageSize, DrilldownPageSize, User, SearchExpression, PageSize, DrilldownPageSize, User... and so on.

    join works by concentrating the operation on the result of the first two tables, any subsequent joins would concentrate joining on the in-memory(or cached to disk) result of the first joined tables, and so on. less read-write needle movement, thus faster

    0 讨论(0)
  • 2021-01-13 06:34

    The best thing you can do is try both and compare what gives you the best performance. It's difficult to second guess what the query optimiser will do (you could write 2 different queries that actually end up being optimised to the same execution plan).

    To compare performance fairly, you should ensure you try them from a level playing field by clearing down the execution plan and data cache before trying each one. This can be done using the following commands, though only do this on a development/test db server:

    DBCC FREEPROCCACHE
    DBCC DROPCLEANBUFFERS
    

    The approach I usually take is to run each query 3 times, with SQL Profiler running so I can monitor the duration, reads, CPU and writes of the query which I then base my decision on.

    e.g.
    1) clear cache using above commands
    2) run query and record stats
    3) clear cache
    4) run query again
    5) run query again (this will use cached execution plan/data)

    Then repeat for the second query to compare.

    0 讨论(0)
  • 2021-01-13 06:34

    It will depend largely on the cardinality of your data: if your in-line lookups are minimal compared to the overhead of joining vast amounts of data (when you only need to extract a small subsection from that join result), then the inline option will be quicker. But if you are having substantial overhead with in-line selects (i.e. if your result has a large number of rows, and you are calling out an inline select for each and every row), then the join will be quicker.

    I can't see from yourquestion the numbers involved (i.e. how many rows) so it's hard to make qualitative comment.

    For example, if your result set has 10 rows, then the inline selects will be carried out for each of those ten rows only, whereas the join might be involving far more rows, which are then selectively reduced by WHERE clauses. But if you have a resultset of 10 million rows, the inline selects will most probably kill performance, since its row-by-row.

    EXAMPLE: imagine you have to collect a load of bricks (specified by size etc.) from all over a building yard and paint them blue.

    inline select = Selecting all the bricks you need and then painting them by hand.

    join = dump all the bricks into a huge bucket of paint, anf then choose the ones you need

    If you only want to end up with 10 bricks, it is far quicker to select and then paint by hand. If you want a million bricks, then mass-painting them in a tub first is the way to go.

    0 讨论(0)
  • 2021-01-13 06:37

    The relative cost of an execution plan isn't always a reliable indicator of performance.

    I assume from your sql that only 1 row should be returned. Providing that the UserId is a unique key on User, then the performance of your 2 approaches will be similar on most relational databases.

    Things to bear in mind would be:

    • if UserPreferences or SearchCriteria return more than 1 row, the first approach will raise an sql error, the second approach will return more than 1 row.
    • the apparent extra lookup in the first approach (UserPreferences selected twice) has no real effect because for the second lookup the record will already be in a buffer
    • if for some reason the User table is tablespace scanned, the first approach will be much faster
    0 讨论(0)
提交回复
热议问题