We are working to port existing applications to Azure SQL Data Warehouse. In order to better understand the performance/workload management characteristics/capabilities of
At DWU 1000 you get a maximum of 32 concurrent queries and 40 concurrency slots, so some of your queries are going to have to queue.
What indexing and distribution choices have you made? This table is small, so it sounds like a better candidate for a clustered index than for clustered columnstore (the default). Also make sure you have created your statistics.
Where are you calling sqlcmd from? For example, an Azure VM is "closer" to the DW, whereas from your laptop you may be waiting on network round trips.
Review the concurrency DMV: sys.dm_pdw_exec_requests
Review the waits DMV: sys.dm_pdw_waits
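To make the queueing at DWU 1000 concrete, here is a minimal Python sketch of the admission arithmetic. It assumes a simplified model in which every query in a burst consumes the same number of concurrency slots. The function name `admit` and the `slots_per_query` values are illustrative only; the real slot cost depends on the resource class of the user running the query.

```python
def admit(submitted, total_slots=40, max_concurrent=32, slots_per_query=1):
    """Split a burst of identical queries into (running, queued).

    The defaults are the DWU 1000 limits quoted above. slots_per_query is
    an assumed, illustrative cost; it is not read from the service.
    """
    if submitted < 0 or slots_per_query < 1:
        raise ValueError("submitted must be >= 0 and slots_per_query >= 1")
    # A query runs only while both limits allow it: enough free slots
    # and fewer than max_concurrent queries already running.
    limit_by_slots = total_slots // slots_per_query
    running = min(submitted, max_concurrent, limit_by_slots)
    return running, submitted - running


if __name__ == "__main__":
    print(admit(50))                     # one slot per query
    print(admit(50, slots_per_query=8))  # heavier queries exhaust slots first
```

With one slot per query the 32-query cap is reached before the 40 slots run out, so a burst of 50 leaves 18 queued. With a larger per-query slot cost, the slots become the limit instead.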
This recent answer looks useful too.
I have annotated your sample EXPLAIN plan. Turn on line numbers in SSMS, or view it in something like Sublime Text, for best effect:
RND_ID, or RandomIdOperation. This operation simply creates a unique name for temporary objects used in the query plan. The identifier is TEMP_ID_21523.
ON, or OnOperation. This performs an action on a database or object. This particular step creates a temp table [TEMP_ID_21523] on all nodes as specified in line 9. The DDL to create the temp table on all nodes is on line 11. This temp table only has one column, called 'col', of datatype DATE.
SHUFFLE_MOVE, or ShuffleMoveOperation. SHUFFLE_MOVE redistributes a distributed table.
ON, or OnOperation. This operation creates another temp table on the control node, with one BIGINT column. The DDL for this table is provided on line 25.
PARTITION_MOVE, or PartitionMoveOperation. This DMS operation moves data from a distributed table to a single table on the control node. This operation is used for aggregation operations on the control node. This particular step moves data from temp table [TEMP_ID_21523], which exists on all nodes, to destination temp table [QTable_3ff2...], which is on the control node.
ON, or OnOperation. This step drops the temp table [TEMP_ID_21523], which exists on all nodes.
RETURN, or ReturnOperation. This operation, which occurs on the control node, sends query results from the control node to the user who submitted the query. The SQL returned is shown in lines 61-67.
ON, or OnOperation. This particular step drops the temp table [QTable_3ff2...], which exists on the control node.
For your query, the PARTITION_MOVE or SHUFFLE_MOVE steps are the most likely causes of performance issues, and improving performance would involve removing or improving them.
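As a rough aid for spotting those steps, the sketch below lists the operations in an EXPLAIN plan and flags the data-movement ones. It is a minimal Python example under stated assumptions: the XML fragment is hand-written and heavily simplified (real plans carry many more elements and attributes), `[QTable_example]` is a placeholder name, and `list_steps` and `MOVE_TYPES` are hypothetical helpers, not part of any tool.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for EXPLAIN output (an assumption,
# not the plan from the question).
SAMPLE_PLAN = """<dsql_query>
  <dsql_operations>
    <dsql_operation operation_type="RND_ID">
      <identifier>TEMP_ID_21523</identifier>
    </dsql_operation>
    <dsql_operation operation_type="ON" />
    <dsql_operation operation_type="SHUFFLE_MOVE">
      <destination_table>[TEMP_ID_21523]</destination_table>
    </dsql_operation>
    <dsql_operation operation_type="PARTITION_MOVE">
      <destination_table>[QTable_example]</destination_table>
    </dsql_operation>
    <dsql_operation operation_type="RETURN" />
  </dsql_operations>
</dsql_query>"""

# Only the two movement steps discussed above; extend as needed.
MOVE_TYPES = {"SHUFFLE_MOVE", "PARTITION_MOVE"}


def list_steps(plan_xml):
    """Return (step_number, operation_type, is_data_movement) tuples."""
    root = ET.fromstring(plan_xml)
    steps = []
    for number, op in enumerate(root.iter("dsql_operation"), start=1):
        op_type = op.get("operation_type", "UNKNOWN")
        steps.append((number, op_type, op_type in MOVE_TYPES))
    return steps


if __name__ == "__main__":
    for number, op_type, is_move in list_steps(SAMPLE_PLAN):
        marker = "  <-- data movement" if is_move else ""
        print(f"{number}. {op_type}{marker}")
```

Flagging only the movement steps keeps attention on the operations that shift rows between nodes, which is where distribution choices matter most.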
To go any further I would need the DDL for the table [AR_CORE_DIM_TABLES].[calendar_dim] and the view [AR_WM_VM].[CALENDAR_DAY GROUP], so I can work out the distribution and whether any calculated columns are being used.
This annotation is based on a similar one in the APS help file sections on EXPLAIN plans and Understanding Query Plans, from which some of the text is copied. I have adapted it for your plan.