I am trying to understand a potential performance issue with our database (SQL Server 2008), and in particular one performance counter, SQLServer:Latches\Total Latch Wait Time Total La
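-- 'max degree of parallelism' is an advanced option, so it has to be made visible first
exec sp_configure 'show advanced options', 1
reconfigure
go
exec sp_configure 'max degree of parallelism', 8
reconfigure
go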
This may be a really basic mistake to a professional DBA, but this is what we found with our high latch problem. This thread ranks very high in search results, so I thought I'd share our findings in case they help someone else.
On newer dual or multi-processor servers using the NUMA memory architecture, the max degree of parallelism should be set to the actual number of cores per processor. In our case we had dual Xeons with 4 cores each, and with hyper-threading they appear to SQL Server as 16 logical processors.
Changing this value from the default of 0 to 4 immediately cut the high latch wait times on some queries.
Our latch waits ran 1,000 ms or more, up to 30,000 ms on some occasions.
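Before changing the setting, it can help to check what SQL Server actually sees. The following is only a sketch, assuming the documented columns of sys.dm_os_sys_info and sys.configurations as they exist in SQL Server 2008. Note that hyperthread_ratio counts the cores (logical or physical) exposed by one processor package, so it does not by itself tell you the physical core count when hyper-threading is on.

```sql
-- Logical CPUs visible to SQL Server and an estimate of the socket count
SELECT cpu_count,                                   -- logical processors
       hyperthread_ratio,                           -- cores exposed per processor package
       cpu_count / hyperthread_ratio AS socket_count
FROM sys.dm_os_sys_info;

-- Configured value versus the value currently in effect (0 = use all available CPUs)
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';
```

If value and value_in_use differ, the change has been configured but RECONFIGURE has not been run yet.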
Reference taken from this blog:
Using sys.dm_db_index_operational_stats:
SELECT
OBJECT_NAME(object_id) AS table_name
,index_id
,page_latch_wait_count
,page_latch_wait_in_ms
,tree_page_latch_wait_count
,tree_page_latch_wait_in_ms
,page_io_latch_wait_count
,page_io_latch_wait_in_ms
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL)
Using sys.dm_os_latch_stats:
SELECT * FROM sys.dm_os_latch_stats
WHERE latch_class = 'BUFFER'
I recommend you look into sys.dm_os_latch_stats and see which latch classes show increased contention and wait time compared to a previous baseline.
If you see a spike in the BUFFER latch class, it means the waits are driven by updates contending to modify the same page. The other latch classes also have short explanations on MSDN that can guide you toward the root cause. For those marked 'internal use only', you're going to have to open a support case with MS, as a detailed explanation of what they mean is on the verge of NDA.
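The counters in sys.dm_os_latch_stats are cumulative since the last restart (or since they were last cleared), so comparing against a baseline means taking two snapshots and subtracting. A minimal sketch of that idea follows; the temp table name #latch_baseline and the one-minute interval are arbitrary choices for illustration, not something from the original post.

```sql
-- Snapshot the cumulative latch counters
SELECT latch_class, waiting_requests_count, wait_time_ms
INTO #latch_baseline
FROM sys.dm_os_latch_stats;

-- Let the workload run for the sampling interval (assumed: one minute)
WAITFOR DELAY '00:01:00';

-- Show only the latch classes that accumulated wait time during the interval
SELECT s.latch_class,
       s.waiting_requests_count - b.waiting_requests_count AS waits_delta,
       s.wait_time_ms - b.wait_time_ms AS wait_ms_delta
FROM sys.dm_os_latch_stats AS s
JOIN #latch_baseline AS b
  ON b.latch_class = s.latch_class
WHERE s.wait_time_ms - b.wait_time_ms > 0
ORDER BY wait_ms_delta DESC;

DROP TABLE #latch_baseline;
```

Running this during a quiet period and again during the slow period gives two sets of deltas that can be compared directly.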
You should also look into sys.dm_os_wait_stats. If you see an increase in PAGELATCH_* waits, it is the same problem as the BUFFER latch class above: contention from trying to modify the same page, a.k.a. an update hot spot. If you see an increase in PAGEIOLATCH_* waits, then your problem is the I/O subsystem: it takes too long to load pages into memory when they are needed.
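To separate the two cases described above, something along these lines could be used. It is a sketch that relies only on the documented columns of sys.dm_os_wait_stats. The [_] in the LIKE patterns makes the underscore match literally rather than as a single-character wildcard.

```sql
-- PAGELATCH_*   : in-memory page contention (update hot spot)
-- PAGEIOLATCH_* : waiting for pages to be read from disk (I/O subsystem)
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'PAGELATCH[_]%'
   OR wait_type LIKE 'PAGEIOLATCH[_]%'
ORDER BY wait_time_ms DESC;
```

As with the latch statistics, these values are cumulative, so they are most useful when compared between two points in time.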