What is the difference between UNION and UNION ALL?

伪装坚强ぢ 2020-11-21 11:28

What is the difference between UNION and UNION ALL?

26 Answers
  • 2020-11-21 12:05

    Here is an example.

    UNION merges the result sets and removes duplicates, so it is slower because the rows have to be compared (in Oracle SQL Developer, select the query and press F10 to see the explain plan and its cost).

    UNION ALL merges the result sets without removing duplicates, so it is faster.

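    -- sysdate is already a DATE, so format it with TO_CHAR rather than TO_DATE
    -- UNION: the two identical rows collapse into one
    SELECT TO_CHAR(sysdate, 'yyyy-mm-dd') FROM dual
    UNION
    SELECT TO_CHAR(sysdate, 'yyyy-mm-dd') FROM dual;
    

    and

    -- UNION ALL: both rows are returned
    SELECT TO_CHAR(sysdate, 'yyyy-mm-dd') FROM dual
    UNION ALL
    SELECT TO_CHAR(sysdate, 'yyyy-mm-dd') FROM dual;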
    
  • 2020-11-21 12:06

    UNION - results in distinct records

    while

    UNION ALL - results in all the records including duplicates.

    Set operators such as UNION and INTERSECT are typically blocking operators, so I personally prefer JOINs over them whenever a join can produce the same result.

    To illustrate why UNION performs poorly compared with UNION ALL, consider the following example.

    CREATE TABLE #T1 (data VARCHAR(10))
    
    INSERT INTO #T1
    SELECT 'abc'
    UNION ALL
    SELECT 'bcd'
    UNION ALL
    SELECT 'cde'
    UNION ALL
    SELECT 'def'
    UNION ALL
    SELECT 'efg'
    
    
    CREATE TABLE #T2 (data VARCHAR(10))
    
    INSERT INTO #T2
    SELECT 'abc'
    UNION ALL
    SELECT 'cde'
    UNION ALL
    SELECT 'efg'
    

    Running UNION ALL over #T1 and #T2 returns all 8 rows, while UNION returns only the 5 distinct values.

    A UNION statement effectively does a SELECT DISTINCT on the result set. If you know that all the records returned by your union are unique, use UNION ALL instead; it gives faster results.

    Using UNION results in a Distinct Sort operation in the execution plan.
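
    As a sketch of the comparison, assuming SQL Server (which the # temp-table syntax suggests) and the #T1 and #T2 tables created above, the two operations could be run as follows. The SELECT DISTINCT form at the end is included only to illustrate the equivalence just described.

    ```sql
    -- UNION ALL: concatenates both inputs, duplicates kept (5 + 3 = 8 rows)
    SELECT data FROM #T1
    UNION ALL
    SELECT data FROM #T2;

    -- UNION: duplicates removed (5 distinct values: abc, bcd, cde, def, efg)
    SELECT data FROM #T1
    UNION
    SELECT data FROM #T2;

    -- Same rows as the UNION above, with the de-duplication step spelled out
    SELECT DISTINCT data
    FROM (
        SELECT data FROM #T1
        UNION ALL
        SELECT data FROM #T2
    ) AS combined;
    ```

    Running these with the actual execution plan enabled (or after SET STATISTICS PROFILE ON) should show an extra de-duplication step for the last two queries. Which physical operator the optimizer picks for it (sort, hash or merge) can vary with the data and the server version.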

  • UNION removes duplicate records; UNION ALL, on the other hand, does not. You should consider the volume of data to be processed, and both queries must return the same number of columns with compatible data types.

    Because UNION internally applies DISTINCT-like behavior when selecting the rows, it is more costly in terms of time and performance. For example:

    select project_id from t_project
    union
    select project_id from t_project_contact  
    

    This gives me 2020 records.

    On the other hand,

    select project_id from t_project
    union all
    select project_id from t_project_contact
    

    gives me more than 17402 rows.
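
    One quick way to see how many duplicates UNION removes is to compare the two row counts directly. A sketch, reusing the t_project and t_project_contact tables from above (the aliases u and ua and the column aliases are illustrative names only):

    ```sql
    -- Row count after de-duplication
    SELECT COUNT(*) AS union_rows
    FROM (
        SELECT project_id FROM t_project
        UNION
        SELECT project_id FROM t_project_contact
    ) u;

    -- Row count with duplicates kept
    SELECT COUNT(*) AS union_all_rows
    FROM (
        SELECT project_id FROM t_project
        UNION ALL
        SELECT project_id FROM t_project_contact
    ) ua;
    ```

    If the two counts are equal, the inputs contain no duplicates, and UNION ALL returns the same rows without the de-duplication cost.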

    In terms of precedence, both operators have the same precedence.
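
    Because the two operators share the same precedence, a chain that mixes them is evaluated left to right, so the position of the UNION matters. A small sketch with literal values, written without a FROM clause (this works in SQL Server, PostgreSQL, MySQL and SQLite; Oracle traditionally needs FROM dual on each SELECT):

    ```sql
    -- (1 UNION ALL 1) yields two rows, then UNION de-duplicates everything: 1 row
    SELECT 1 AS n
    UNION ALL
    SELECT 1
    UNION
    SELECT 1;

    -- (1 UNION 1) yields one row, then UNION ALL appends another: 2 rows
    SELECT 1 AS n
    UNION
    SELECT 1
    UNION ALL
    SELECT 1;
    ```

    Adding parentheses around the part that should be combined first makes the intended grouping explicit in the databases that support them.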

  • 2020-11-21 12:10

    As a habit, always use UNION ALL. Use UNION only in the special cases where you need to eliminate duplicates, which can be extremely messy; you can read all about that in the other answers here.

  • 2020-11-21 12:12

    UNION removes duplicates, whereas UNION ALL does not.

    To remove duplicates, the result set typically has to be sorted (or hashed), and this may affect the performance of the UNION, depending on the volume of data being sorted and on the settings of various RDBMS parameters (for Oracle, PGA_AGGREGATE_TARGET with WORKAREA_SIZE_POLICY=AUTO, or SORT_AREA_SIZE and SORT_AREA_RETAINED_SIZE if WORKAREA_SIZE_POLICY=MANUAL).

    Basically, the sort is faster if it can be carried out in memory, but the same caveat about the volume of data applies.
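
    To check which of these settings are in effect, the current values can be read from the V$PARAMETER view. A sketch, assuming an Oracle session that has the privilege to query that view:

    ```sql
    -- Current memory settings that influence whether a sort fits in memory
    SELECT name, value
    FROM   v$parameter
    WHERE  name IN ('pga_aggregate_target',
                    'workarea_size_policy',
                    'sort_area_size',
                    'sort_area_retained_size');
    ```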

    Of course, if you need the data returned without duplicates, then you must use UNION, unless the source of your data already guarantees uniqueness.

    I would have commented on the first post to qualify the "is much less performant" comment, but have insufficient reputation (points) to do so.

  • 2020-11-21 12:13

    If there is no ORDER BY, a UNION ALL may bring rows back as it goes, whereas a UNION would make you wait until the very end of the query before giving you the whole result set at once. This can make a difference in a time-out situation - a UNION ALL keeps the connection alive, as it were.

    So if you have a time-out issue, and there's no sorting, and duplicates aren't an issue, UNION ALL may be rather helpful.
