When should I use cross apply over inner join?

忘了有多久 2020-11-22 06:51

What is the main purpose of using CROSS APPLY?

I have read (vaguely, through posts on the Internet) that cross apply can be more efficient when selectin

14 answers
  • 2020-11-22 07:21

    While most queries that use CROSS APPLY can be rewritten with an INNER JOIN, CROSS APPLY can yield a better execution plan and better performance, because it can limit the set being joined before the join occurs.

    Stolen from Here

  • 2020-11-22 07:22

    Suppose you have two tables.

    MASTER TABLE

    x------x--------------------x
    | Id   |        Name        |
    x------x--------------------x
    |  1   |          A         |
    |  2   |          B         |
    |  3   |          C         |
    x------x--------------------x
    

    DETAILS TABLE

    x------x--------------------x-------x
    | Id   |      PERIOD        |   QTY |
    x------x--------------------x-------x
    |  1   |   2014-01-13       |   10  |
    |  1   |   2014-01-11       |   15  |
    |  1   |   2014-01-12       |   20  |
    |  2   |   2014-01-06       |   30  |
    |  2   |   2014-01-08       |   40  |
    x------x--------------------x-------x
    

    There are many situations where we need to replace INNER JOIN with CROSS APPLY.

    1. Join two tables based on TOP n results

    Suppose we need to select Id and Name from Master and the last two dates for each Id from the Details table.

    SELECT M.ID, M.NAME, D.PERIOD, D.QTY
    FROM MASTER M
    INNER JOIN
    (
        SELECT TOP 2 ID, PERIOD, QTY
        FROM DETAILS D
        ORDER BY CAST(PERIOD AS DATE) DESC
    ) D
    ON M.ID = D.ID
    

    The above query generates the following result.

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-12   |  20   |
    x------x---------x--------------x-------x
    

    Notice what happened: the derived table picked the two latest dates across the whole Details table, and only then were those rows joined to Master on Id, which is wrong. The query should return rows for both Ids 1 and 2, but it returned only Id 1, because both of the two latest dates belong to Id 1. To get the last two dates per Id, we need CROSS APPLY.

    SELECT M.ID, M.NAME, D.PERIOD, D.QTY
    FROM MASTER M
    CROSS APPLY
    (
        SELECT TOP 2 ID, PERIOD, QTY
        FROM DETAILS D
        WHERE M.ID = D.ID
        ORDER BY CAST(PERIOD AS DATE) DESC
    ) D
    

    This query produces the following result.

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-12   |  20   |
    |   2  |   B     | 2014-01-08   |  40   |
    |   2  |   B     | 2014-01-06   |  30   |
    x------x---------x--------------x-------x
    

    Here's how it works. The query inside CROSS APPLY can reference the outer table, whereas a derived table in an INNER JOIN cannot (the query fails to compile). The join condition therefore lives inside the CROSS APPLY subquery, i.e., WHERE M.ID=D.ID, so the last two dates are found separately for each Master row.
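
    To try this out, the sample tables can be recreated with a short script. This is a minimal sketch: the column types (INT, VARCHAR(10), DATE) are assumptions, since the answer only shows the data. It also runs an OUTER APPLY variant, which keeps Master rows that have no Details rows.

    ```sql
    -- Assumed schema: the column types are guesses based on the sample data.
    CREATE TABLE MASTER  (ID INT, NAME VARCHAR(10));
    CREATE TABLE DETAILS (ID INT, PERIOD DATE, QTY INT);

    INSERT INTO MASTER (ID, NAME) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO DETAILS (ID, PERIOD, QTY) VALUES
        (1, '2014-01-13', 10),
        (1, '2014-01-11', 15),
        (1, '2014-01-12', 20),
        (2, '2014-01-06', 30),
        (2, '2014-01-08', 40);

    -- CROSS APPLY: the subquery runs once per MASTER row, so TOP 2 applies per Id.
    -- Id 3 has no DETAILS rows and therefore does not appear in the output.
    SELECT M.ID, M.NAME, D.PERIOD, D.QTY
    FROM MASTER M
    CROSS APPLY
    (
        SELECT TOP 2 D2.PERIOD, D2.QTY
        FROM DETAILS D2
        WHERE D2.ID = M.ID
        ORDER BY D2.PERIOD DESC
    ) D;

    -- OUTER APPLY: same per-row evaluation, but Id 3 is kept with NULL PERIOD and QTY.
    SELECT M.ID, M.NAME, D.PERIOD, D.QTY
    FROM MASTER M
    OUTER APPLY
    (
        SELECT TOP 2 D2.PERIOD, D2.QTY
        FROM DETAILS D2
        WHERE D2.ID = M.ID
        ORDER BY D2.PERIOD DESC
    ) D;
    ```

    The choice between the two mirrors the choice between INNER JOIN and LEFT JOIN: use OUTER APPLY when rows from the left table must survive even if the applied subquery returns nothing.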

    2. Getting INNER JOIN behavior with table-valued functions

    CROSS APPLY can be used in place of INNER JOIN when we need to combine rows from the Master table with the result of a table-valued function.

    SELECT M.ID,M.NAME,C.PERIOD,C.QTY
    FROM MASTER M
    CROSS APPLY dbo.FnGetQty(M.ID) C
    

    And here is the function

    CREATE FUNCTION FnGetQty 
    (   
        @Id INT 
    )
    RETURNS TABLE 
    AS
    RETURN 
    (
        SELECT ID,PERIOD,QTY 
        FROM DETAILS
        WHERE ID=@Id
    )
    

    The query generates the following result.

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-11   |  15   |
    |   1  |   A     | 2014-01-12   |  20   |
    |   2  |   B     | 2014-01-06   |  30   |
    |   2  |   B     | 2014-01-08   |  40   |
    x------x---------x--------------x-------x
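
    The two use cases can be combined by moving the TOP n logic into the function. The sketch below is an illustration only: dbo.FnGetLastPeriods and its @N parameter are hypothetical names, not part of the answer above, and it assumes the MASTER and DETAILS tables already exist. GO is the batch separator used by SSMS and sqlcmd; it is needed because CREATE FUNCTION must be the first statement in its batch.

    ```sql
    -- Hypothetical helper: returns the last @N periods for a single Id.
    CREATE FUNCTION dbo.FnGetLastPeriods
    (
        @Id INT,
        @N  INT
    )
    RETURNS TABLE
    AS
    RETURN
    (
        -- ORDER BY is allowed inside an inline function because TOP is specified.
        SELECT TOP (@N) ID, PERIOD, QTY
        FROM DETAILS
        WHERE ID = @Id
        ORDER BY CAST(PERIOD AS DATE) DESC
    );
    GO

    -- The function is invoked once per MASTER row, with that row's Id as the argument.
    SELECT M.ID, M.NAME, C.PERIOD, C.QTY
    FROM MASTER M
    CROSS APPLY dbo.FnGetLastPeriods(M.ID, 2) C;
    ```

    With the sample data this should return the same four rows as the TOP 2 query in the first use case.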
    

    ADDITIONAL ADVANTAGE OF CROSS APPLY

    APPLY can be used as a replacement for UNPIVOT. Either CROSS APPLY or OUTER APPLY can be used here; the two are interchangeable in this case.

    Suppose you have the table below (named MYTABLE).

    x------x-------------x--------------x
    |  Id  |   FROMDATE  |   TODATE     |
    x------x-------------x--------------x
    |   1  |  2014-01-11 | 2014-01-13   | 
    |   1  |  2014-02-23 | 2014-02-27   | 
    |   2  |  2014-05-06 | 2014-05-30   | 
    |   3  |     NULL    |    NULL      |
    x------x-------------x--------------x
    

    The query is below.

    SELECT DISTINCT ID,DATES
    FROM MYTABLE 
    CROSS APPLY(VALUES (FROMDATE),(TODATE))
    COLUMNNAMES(DATES)
    

    which brings you the result

      x------x-------------x
      | Id   |    DATES    |
      x------x-------------x
      |  1   |  2014-01-11 |
      |  1   |  2014-01-13 |
      |  1   |  2014-02-23 |
      |  1   |  2014-02-27 |
      |  2   |  2014-05-06 |
      |  2   |  2014-05-30 | 
      |  3   |    NULL     | 
      x------x-------------x
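
    A common variation keeps track of which source column each value came from, which is closer to what UNPIVOT itself returns. The following is a sketch under assumptions: the column types are guessed from the sample data, and DATETYPE is a hypothetical label column added for illustration.

    ```sql
    -- Assumed schema: column types are guesses based on the sample data.
    CREATE TABLE MYTABLE (ID INT, FROMDATE DATE, TODATE DATE);
    INSERT INTO MYTABLE (ID, FROMDATE, TODATE) VALUES
        (1, '2014-01-11', '2014-01-13'),
        (1, '2014-02-23', '2014-02-27'),
        (2, '2014-05-06', '2014-05-30'),
        (3, NULL, NULL);

    -- Each source row is expanded into two rows, one per (label, value) pair.
    SELECT T.ID, V.DATETYPE, V.DATES
    FROM MYTABLE T
    CROSS APPLY (VALUES ('FROMDATE', T.FROMDATE),
                        ('TODATE',   T.TODATE)) AS V(DATETYPE, DATES)
    WHERE V.DATES IS NOT NULL;   -- UNPIVOT drops NULLs; this filter mimics that
    ```

    Removing the WHERE clause keeps the NULL rows for Id 3, which UNPIVOT cannot do.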
    
  • 2020-11-22 07:22

    Here's a brief tutorial I wrote for myself to quickly refresh my memory on how CROSS APPLY works and when to use it. It can be saved in a .sql file and executed in SSMS:

    -- Here's the key to understanding CROSS APPLY: despite the totally different name, think of it as being like an advanced 'basic join'.
    -- A 'basic join' gives the Cartesian product of the rows in the tables on both sides of the join: all rows on the left joined with all rows on the right.
    -- The formal name of this join in SQL is a CROSS JOIN.  You now start to understand why they named the operator CROSS APPLY.
    
    -- Given the following (very) simple tables and data:
    CREATE TABLE #TempStrings ([SomeString] [nvarchar](10) NOT NULL);
    CREATE TABLE #TempNumbers ([SomeNumber] [int] NOT NULL);
    CREATE TABLE #TempNumbers2 ([SomeNumber] [int] NOT NULL);
    INSERT INTO #TempStrings VALUES ('111'); INSERT INTO #TempStrings VALUES ('222');
    INSERT INTO #TempNumbers VALUES (111); INSERT INTO #TempNumbers VALUES (222);
    INSERT INTO #TempNumbers2 VALUES (111); INSERT INTO #TempNumbers2 VALUES (222); INSERT INTO #TempNumbers2 VALUES (222);
    
    -- Basic join is like CROSS APPLY; 2 rows on each side gives us an output of 4 rows, but 2 rows on the left and 0 on the right gives us an output of 0 rows:
    SELECT
        st.SomeString, nbr.SomeNumber
    FROM -- Basic join ('CROSS JOIN')
        #TempStrings st, #TempNumbers nbr
        -- Note: this also works:
        --#TempStrings st CROSS JOIN #TempNumbers nbr
    
    -- Basic join can be used to achieve the functionality of INNER JOIN by first generating all row combinations and then whittling them down with a WHERE clause:
    SELECT
        st.SomeString, nbr.SomeNumber
    FROM -- Basic join ('CROSS JOIN')
        #TempStrings st, #TempNumbers nbr
    WHERE
        st.SomeString = nbr.SomeNumber
    
    -- However, for increased readability, the SQL standard introduced the INNER JOIN ... ON syntax; it brings the columns that two tables are
    -- being joined on next to the JOIN clause, rather than having them later on in the WHERE clause.  When multiple tables are being joined together, this makes it
    -- much easier to read which columns are being joined on which tables; but make no mistake, the following syntax is *semantically identical* to the above syntax:
    SELECT
        st.SomeString, nbr.SomeNumber
    FROM -- Inner join
        #TempStrings st INNER JOIN #TempNumbers nbr ON st.SomeString = nbr.SomeNumber
    
    -- Because CROSS APPLY is generally used with a subquery, the subquery's WHERE clause will appear next to the join clause (CROSS APPLY), much like the aforementioned
    -- 'ON' keyword appears next to the INNER JOIN clause.  In this sense, then, CROSS APPLY combined with a subquery that has a WHERE clause is like an INNER JOIN with
    -- an ON keyword, but more powerful because it can be used with subqueries (or table-valued functions, where said WHERE clause can be hidden inside the function).
    SELECT
        st.SomeString, nbr.SomeNumber
    FROM
        #TempStrings st CROSS APPLY (SELECT * FROM #TempNumbers tempNbr WHERE st.SomeString = tempNbr.SomeNumber) nbr
    
    -- CROSS APPLY joins in the same way as a CROSS JOIN, but what is joined can be a subquery or table-valued function.  You'll still get 0 rows of output if
    -- there are 0 rows on either side, and in this sense it's like an INNER JOIN:
    SELECT
        st.SomeString, nbr.SomeNumber
    FROM
        #TempStrings st CROSS APPLY (SELECT * FROM #TempNumbers tempNbr WHERE 1 = 2) nbr
    
    -- OUTER APPLY is like CROSS APPLY, except that if the right side returns 0 rows for a left-side row, you still get that row's values, with NULL values for
    -- the right side's columns.  In this sense it's like a LEFT OUTER JOIN:
    SELECT
        st.SomeString, nbr.SomeNumber
    FROM
        #TempStrings st OUTER APPLY (SELECT * FROM #TempNumbers tempNbr WHERE 1 = 2) nbr
    
    -- One thing CROSS APPLY makes easy is using a subquery where you would usually have to use GROUP BY with aggregate functions in the SELECT list.
    -- In the following example, we aggregate string values from a second table by matching one of its columns with a value from the first
    -- table - a match that would otherwise go in the ON clause of a join - but because we're now using a correlated subquery thanks to CROSS APPLY, we
    -- don't need GROUP BY in the main query, and so we don't have to wrap the other SELECT values in an aggregate function like MIN().
    -- (STRING_AGG requires SQL Server 2017 or later.)
    SELECT
        st.SomeString, nbr.SomeNumbers
    FROM
        #TempStrings st CROSS APPLY (SELECT SomeNumbers = STRING_AGG(tempNbr.SomeNumber, ', ') FROM #TempNumbers2 tempNbr WHERE st.SomeString = tempNbr.SomeNumber) nbr
    -- ^ First the subquery is whittled down with the WHERE clause, then the aggregate function is applied with no GROUP BY clause; this means all rows are
    --   grouped into one, and the aggregate function aggregates them all, in this case building a comma-delimited string containing their values.
    
    DROP TABLE #TempStrings;
    DROP TABLE #TempNumbers;
    DROP TABLE #TempNumbers2;
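
    As a follow-up to the tutorial, this self-contained sketch puts CROSS APPLY, OUTER APPLY and LEFT JOIN side by side. The temp tables #L and #R are hypothetical and exist only for this illustration.

    ```sql
    -- Hypothetical tables for illustration only.
    CREATE TABLE #L (Id INT NOT NULL);
    CREATE TABLE #R (Id INT NOT NULL, Val NVARCHAR(10) NOT NULL);
    INSERT INTO #L VALUES (1), (2);
    INSERT INTO #R VALUES (1, N'one');

    -- CROSS APPLY: only Id 1 is returned, as with an INNER JOIN.
    SELECT l.Id, r.Val
    FROM #L l CROSS APPLY (SELECT x.Val FROM #R x WHERE x.Id = l.Id) r;

    -- OUTER APPLY: Id 1 and Id 2 are returned; Val is NULL for Id 2.
    SELECT l.Id, r.Val
    FROM #L l OUTER APPLY (SELECT x.Val FROM #R x WHERE x.Id = l.Id) r;

    -- The LEFT JOIN that returns the same rows as the OUTER APPLY above.
    SELECT l.Id, r.Val
    FROM #L l LEFT JOIN #R r ON r.Id = l.Id;

    DROP TABLE #L;
    DROP TABLE #R;
    ```

    Rows that exist only on the right side never appear with either APPLY form, because the right side is evaluated once per left row.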
    
  • 2020-11-22 07:23

    Cross apply can be used to replace subqueries where you need a column from the subquery's table.

    Subquery:

    select * from person p where
    p.companyId in(select c.companyId from company c where c.companyname like '%yyy%')
    

    Here I can't select the columns of the company table, so I use cross apply instead:

    select P.*,T.CompanyName
    from Person p
    cross apply (
        select *
        from Company C
        where p.companyid = c.companyId and c.CompanyName like '%yyy%'
    ) T
    
  • 2020-11-22 07:26

    This has already been answered very well technically, but let me give a concrete example of how it's extremely useful:

    Let's say you have two tables, Customer and Order. Customers have many Orders.

    I want to create a view that gives me details about customers and the most recent order they've made. With just JOINs, this would require some self-joins and aggregation, which isn't pretty. But with CROSS APPLY, it's super easy:

    SELECT *
    FROM Customer
    CROSS APPLY (
      SELECT TOP 1 *
      FROM [Order] -- ORDER is a reserved word, so the table name must be bracketed
      WHERE [Order].CustomerId = Customer.CustomerId
      ORDER BY OrderDate DESC
    ) T
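
    A runnable version of the same idea might look like the sketch below. The schema and sample data are hypothetical: only the table names, CustomerId and OrderDate come from the answer, and the other columns are made up for illustration.

    ```sql
    -- Hypothetical schema and data for illustration.
    CREATE TABLE Customer (CustomerId INT PRIMARY KEY, Name NVARCHAR(50));
    CREATE TABLE [Order]  (OrderId INT PRIMARY KEY, CustomerId INT, OrderDate DATE);

    INSERT INTO Customer VALUES (1, N'Alice'), (2, N'Bob'), (3, N'Carol');
    INSERT INTO [Order] VALUES
        (10, 1, '2020-01-05'),
        (11, 1, '2020-03-17'),
        (12, 2, '2020-02-01');

    -- Latest order per customer: order 11 for Alice and order 12 for Bob.
    -- Carol has no orders, so CROSS APPLY omits her;
    -- OUTER APPLY would keep her with NULL order columns.
    SELECT c.CustomerId, c.Name, o.OrderId, o.OrderDate
    FROM Customer c
    CROSS APPLY (
        SELECT TOP 1 o2.OrderId, o2.OrderDate
        FROM [Order] o2
        WHERE o2.CustomerId = c.CustomerId
        ORDER BY o2.OrderDate DESC
    ) o;
    ```

    Listing the columns explicitly instead of using SELECT * also avoids returning CustomerId twice, which matters if the query is wrapped in a view, since view column names must be unique.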
    
  • 2020-11-22 07:36

    It seems to me that CROSS APPLY can fill a certain gap when working with calculated fields in complex/nested queries, and make them simpler and more readable.

    Simple example: you have a DoB and you want to present multiple age-related fields that will also rely on other data sources (such as employment), like Age, AgeGroup, AgeAtHiring, MinimumRetirementDate, etc. for use in your end-user application (Excel PivotTables, for example).

    Options are limited and rarely elegant:

    • JOIN subqueries cannot introduce new values in the dataset based on data in the parent query (they must stand on their own).

    • UDFs are neat, but slow as they tend to prevent parallel operations. And being a separate entity can be a good (less code) or a bad (where is the code) thing.

    • Junction tables. Sometimes they can work, but soon enough you're joining subqueries with tons of UNIONs. Big mess.

    • Create yet another single-purpose view, assuming your calculations don't require data obtained mid-way through your main query.

    • Intermediary tables. Yes... that usually works, and is often a good option as they can be indexed and fast, but performance can also drop because UPDATE statements are not parallel and do not allow cascading formulas (reusing results) to update several fields within the same statement. And sometimes you'd just prefer to do things in one pass.

    • Nesting queries. Yes, at any point you can put parentheses around your entire query and use it as a subquery upon which you can manipulate source data and calculated fields alike. But you can only do this so much before it gets ugly. Very ugly.

    • Repeating code. What is the greatest value of 3 long (CASE...ELSE...END) statements? That's gonna be readable!

    • Tell your clients to calculate the damn things themselves.

    Did I miss something? Probably, so feel free to comment. But hey, CROSS APPLY is like a godsend in such situations: you just add a simple CROSS APPLY (select tbl.value + 1 as someFormula) as crossTbl and voilà! Your new field is now ready for use practically like it had always been there in your source data.

    Values introduced through CROSS APPLY can...

    • be used to create one or multiple calculated fields without adding performance, complexity or readability issues to the mix
    • be referenced by subsequent CROSS APPLY statements, as with JOINs: CROSS APPLY (select crossTbl.someFormula + 1 as someMoreFormula) as crossTbl2
    • be used in subsequent JOIN conditions
    • come from table-valued functions as well, as a bonus

    Dang, there's nothing they can't do!
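
    To make the calculated-field pattern concrete, here is a minimal sketch based on the DoB example. Everything in it is hypothetical: the #Employee table, its columns, the age bands and the fixed @AsOf date are assumptions chosen for illustration.

    ```sql
    -- Hypothetical table and data for illustration.
    CREATE TABLE #Employee (Name NVARCHAR(50), DoB DATE, HireDate DATE);
    INSERT INTO #Employee VALUES
        (N'Ann', '1980-12-01', '2005-06-15'),
        (N'Ben', '1995-03-10', '2018-09-01');

    DECLARE @AsOf DATE = '2020-11-22';  -- fixed reference date so results are repeatable

    SELECT e.Name, a.Age, g.AgeGroup, h.AgeAtHiring
    FROM #Employee e
    CROSS APPLY (
        -- Age in whole years: year difference, minus 1 if the birthday has not occurred yet
        SELECT DATEDIFF(YEAR, e.DoB, @AsOf)
             - CASE WHEN DATEADD(YEAR, DATEDIFF(YEAR, e.DoB, @AsOf), e.DoB) > @AsOf
                    THEN 1 ELSE 0 END AS Age
    ) a
    CROSS APPLY (
        -- Reuses a.Age from the previous CROSS APPLY instead of repeating the formula
        SELECT CASE WHEN a.Age < 30 THEN 'Under 30'
                    WHEN a.Age < 50 THEN '30-49'
                    ELSE '50+' END AS AgeGroup
    ) g
    CROSS APPLY (
        -- Same whole-year calculation, measured at the hire date
        SELECT DATEDIFF(YEAR, e.DoB, e.HireDate)
             - CASE WHEN DATEADD(YEAR, DATEDIFF(YEAR, e.DoB, e.HireDate), e.DoB) > e.HireDate
                    THEN 1 ELSE 0 END AS AgeAtHiring
    ) h;
    -- Ann: Age 39, AgeGroup '30-49', AgeAtHiring 24
    -- Ben: Age 25, AgeGroup 'Under 30', AgeAtHiring 25 - 2 = 23

    DROP TABLE #Employee;
    ```

    Each calculated field is defined once and then referenced by name, both in the SELECT list and in later CROSS APPLY blocks.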
