SQL - What is the performance impact of having multiple CASE statements in SELECT - Teradata

So I have a query that requires a bunch of CASE statements in the SELECT. This was not the original design but part of a compromise.

So the query looks something like thi

2 answers

夕颜

2021-02-07 02:46

Edit: Actually, you can refactor both of those subqueries into a JOIN, which would probably be faster anyway. It also gets rid of a lot of repetition.

This really isn't about the performance of the query (@Gordon has that covered pretty well), but that huge CASE statement looks like a maintenance nightmare. A better way to handle it might be to convert it to a lookup table:

CREATE TABLE ACCT_DISPLAY_NAME (
    FINC_ACCT_ID CHAR(10),
    BAL_TYPE_CD  CHAR(3),
    DISPLAY_NAME VARCHAR(100)
);

CREATE INDEX ACCT_DISPLAY_INDEX ON ACCT_DISPLAY_NAME (
    FINC_ACCT_ID,
    BAL_TYPE_CD
);

INSERT INTO ACCT_DISPLAY_NAME VALUES
('AC99800'  , 'EOP', '  Net Interest Income'               ),
('AC12993'  , 'EOP', '  Non Interest Income'               ),
('AC667999' , 'EOP', 'Non-Interest Expense'                ),
('AC996587' , 'EOP', '  Total Marketing Expense'           ),
('AC659986' , 'EOP', '  Total Operating Expense'           ),
('AC69678'  , 'EOP', 'Pre-Provision Earnings (before tax)' ),
('AC09994'  , 'EOP', '  Net Charge-offs'                   ),
('AC20977'  , 'EOP', '  Other'                             ),
('AC19979'  , 'EOP', '  Allowance Build (Release)'         ),
('AC7094'   , 'EOP', 'Provision Expense'                   ),
('AC6997'   , 'EOP', 'Pretax Income'                       ),
('AC0994'   , 'EOP', 'Tax Expense'                         ),
('AC9999'   , 'EOP', 'NIAT'                                ),
('AC7990'   , 'EOP', 'EPS'                                 ),
('AC9995'   , 'EOP', 'Ending Loans - HFI'                  ),
('AC9995'   , 'avg', 'Average Loans - HFI'                 ),
('AC2991'   , 'avg', 'Average Earning Assets'              ),
('AC2999'   , 'EOP', 'Ending Deposits'                     ),
('AC9999'   , 'avg', 'Average Deposits'                    ),
('AC0379'   , 'EOP', 'NIM on Loans'                        ),
('AC6999'   , 'EOP', 'Revenue Margin'                      ),
('AC579'    , 'EOP', 'Charge off rate'                     ),
('AC5899'   , 'EOP', 'Efficiency ratio'                    ),
('AC629'    , 'EOP', 'ROA'                                 ),
('AC359'    , 'EOP', 'ROE'                                 ),
('AC619'    , 'EOP', 'Return on Allocated Capital (ROAC)'  );

And do a LEFT JOIN on it (since you have that ELSE in the CASE), something like:

SELECT T.FINC_ACCT_NM,
       T.FINC_ACCT_ID,
       T.CURR_END_OF_PERD_ACTL_VAL,
       T.PREV_END_OF_PERD_ACTL_VAL,
       T.VARNC_PLAN_VAL,
       T.OUTLOOK_BDGT_PLAN_VAL,
       T.PERD_END_RPT_DT,
       T.PLAN_VERS_NM,
       T.FRMT_ACTL_CD,
       T.FRMT_PLAN_CD,
       T.RPT_PERD_TYPE_CD,
       COALESCE(N.DISPLAY_NAME, T.FINC_ACCT_NM) AS DISPLAY_NAME

FROM CONT.TABLE T
JOIN (
    SELECT RPT_PERD_TYPE_CD, DATA_VLDTN_IND, MAX(PERD_END_RPT_DT) AS PERD_END_RPT_DT
    FROM CONT.TABLE
    WHERE VERS_NM='Actual'
      AND DATA_VLDTN_IND='Y'
    GROUP BY RPT_PERD_TYPE_CD, DATA_VLDTN_IND
) AS MAX_DATES
  ON T.RPT_PERD_TYPE_CD = MAX_DATES.RPT_PERD_TYPE_CD
 AND T.DATA_VLDTN_IND   = MAX_DATES.DATA_VLDTN_IND 
 AND T.PERD_END_RPT_DT  = MAX_DATES.PERD_END_RPT_DT 

LEFT JOIN ACCT_DISPLAY_NAME N
  ON T.FINC_ACCT_ID = N.FINC_ACCT_ID
 AND T.BAL_TYPE_CD  = N.BAL_TYPE_CD

WHERE T.DEPT_ID = 'OR80637'
  AND T.RPT_PERD_TYPE_CD IN ('Q', 'M')
  AND T.FINC_ACCT_ID IN (
    'AC0006470',
    'AC8000199',
    'AC8002145',
    'AC0006586',
    'AC8000094'
  );
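One thing worth checking before relying on the LEFT JOIN: if ACCT_DISPLAY_NAME ever holds two rows for the same (FINC_ACCT_ID, BAL_TYPE_CD) pair, the join will duplicate rows from the main table, which a CASE expression could never do. A minimal sanity check, assuming the lookup table is defined as above (NAME_COUNT is just an illustrative alias):

```sql
-- Each (FINC_ACCT_ID, BAL_TYPE_CD) pair should map to exactly one display name.
-- Any row returned here would multiply rows in the LEFT JOIN result.
SELECT FINC_ACCT_ID,
       BAL_TYPE_CD,
       COUNT(*) AS NAME_COUNT
FROM ACCT_DISPLAY_NAME
GROUP BY FINC_ACCT_ID, BAL_TYPE_CD
HAVING COUNT(*) > 1;
```

An empty result means the join is safe in that respect. Declaring the index on those two columns as unique would enforce the same rule at insert time.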

不要未来只要你来

2021-02-07 03:00

The CASE statements are going to be much less of a factor than the joins in the WHERE clause.

The main driver of performance in SQL is I/O -- reading the data from disk. I think of it as two orders of magnitude more important than the processing going on in rows. This is just a heuristic, not based on specific tests on a database.

You are doing self-joins, which will require either lots of work reading the table or a fair amount of work dealing with indexes.
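One way to see how much reading the self-join implies is to ask the optimizer for its plan. As a rough sketch, this prefixes the max-date derived table from the other answer with Teradata's EXPLAIN modifier (CONT.TABLE is the placeholder table name used in this thread, not a real object):

```sql
-- EXPLAIN returns the optimizer's plan instead of running the query.
-- Look for whether the step reads the base table or an index.
EXPLAIN
SELECT RPT_PERD_TYPE_CD,
       DATA_VLDTN_IND,
       MAX(PERD_END_RPT_DT) AS PERD_END_RPT_DT
FROM CONT.TABLE
WHERE VERS_NM = 'Actual'
  AND DATA_VLDTN_IND = 'Y'
GROUP BY RPT_PERD_TYPE_CD, DATA_VLDTN_IND;
```

Comparing the plan before and after adding an index gives a more reliable picture than timing a single run.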

The CASE statement, on the other hand, gets turned into very primitive hardware commands: equality comparisons, jumps, and the like. The data resides in the memory closest to the processors, so it is going to zip along. You are doing nothing fancy in the CASE statement (such as a LIKE or a subquery). I would imagine that the query would be just as fast if you removed most of the lines in the statement.

If you are having issues with performance, put an index on (VERS_NM, RPT_PERD_TYPE_CD, DATA_VLDTN_IND, PERD_END_RPT_DT). This four-part index should allow you to get the max date without invoking I/O requests on the original table.
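A sketch of what that index could look like, written in Teradata's secondary-index form where the column list comes before ON. The index name VERS_PERD_IDX is made up for illustration, and CONT.TABLE is the placeholder table name from the question:

```sql
-- Four-column secondary index covering the filter columns (VERS_NM,
-- DATA_VLDTN_IND), the grouping column (RPT_PERD_TYPE_CD), and the
-- column being maximized (PERD_END_RPT_DT).
-- Other databases typically put the column list after the table name:
--   CREATE INDEX VERS_PERD_IDX ON CONT.TABLE (...);
CREATE INDEX VERS_PERD_IDX (
    VERS_NM,
    RPT_PERD_TYPE_CD,
    DATA_VLDTN_IND,
    PERD_END_RPT_DT
) ON CONT.TABLE;
```

Whether the optimizer actually uses the index depends on table size and statistics, so it is worth confirming with the query plan.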
