Total Pageviews

Tuesday, June 11, 2013

MAXDOP - What triggers Parallelism?

Charles Darwin was an English naturalist. He established that all species of life have descended over time from common ancestors and proposed the scientific theory that this branching pattern of evolution resulted from a process that he called natural selection. I am sure that everybody knows this theory and it is well accepted.

How does the above analogy works with SQL Server Query Optimizer? Query Optimizer of SQL Server obeys Charles Darwin theory while creating an execution plan before executing the SQL Query. SQL Server query optimizer is a cost based optimizer, what it means is that before creating an execution plan it considers a number of facts and factors to produce a trivial and “good enough” plan. In the following short list, we can identify those:

1.      Construction of the query.
2.      Number of records, data length, and size of the table.
3.      Appropriate indexes and up-to-date statistics.
4.      I/O, CPU and Memory.

In an OLTP environment, it is expectable that all queries and transactions are efficient and quick enough to finish its execution within 5 seconds. If it does not then SQL Server will take advantage of parallelism based on the query cost and MAXDOP setting.

There are a vast number of considerations, recommendations and concerns as well as what will be the settings of MAXDOP in OLTP environment? In OLTP implementation, it is expected that all queries have been written with performance in mind while adhering to the best practices. But in the real world this is not the case. Some queries are written poorly or are performing poorly because of the lack of appropriate indexes, out dated statistics, memory pressure, CPU bottleneck, slow I/O response, and so on.  

How MAXDOP works?
MAXDOP is the maximum number of worker threads SQL Server Query Optimizer can use to execute a query. Each thread will go to each processor’s core during an execution of a parallel query. The MAXDOP = 0 (zero) means that the Query Optimizer is flexible to use the required number of threads to execute the query based on a set of predefined rules and a mechanism built-in in SQL Server.  

Besides the server wide settings, Query hint (OPTION (MAXDOP n)) can be used to control parallel execution of a query. “Cost threshold for parallelism” is another server wide setting that can be utilized to control parallelism behavior. 

"At execution time, if parallel execution is warranted, the Database Engine determines the optimal number of threads and spreads the execution of the parallel plan across those threads in its each execution. When a query or index operation starts executing on multiple threads for parallel execution, the same number of threads is used until the operation is completed. The Database Engine re-examines the optimal number of thread decisions every time an execution plan is retrieved from the plan cache. For example, one execution of a query can result in the use of a serial plan, a later execution of the same query can result in a parallel plan using three threads, and a third execution can result in a parallel plan using four threads".

What triggers Parallelism?
There are a couple different and specific reasons in OLTP system that triggers SQL Server to choose parallel execution of query to speedup data retrieval process. The following are a couple of important key factors for which SQL Server database engine chooses parallel query execution. 

1.      The query contains ORDER BY or GROUP BY clause. This means excessive sort operation. There are no appropriate indexes to support the sort operation.
2.      Skewed data; meaning a column contains substantial number of duplicate records.
3.      Memory grant is in-sufficient to execute the query. All sort operation requires extra memory and thus causes a “spill to tempdb”.
4.      Not updated distribution statistics.
5.      Processing huge number of records.

Symptoms and detecting Parallelism Issue:
Usually CXPACKET waittype can be used to monitor parallel query execution behavior in OLTP systems. But keep in mind that CXPACKET does not necessarily indicate that parallelism is an issue. This wait means that the parent thread is waiting to synchronize all output from the child threads. However, if you see SQL blocking on CXPACKET, it indicates that the SQL Server is facing resource contention such as lack of indexes, out-dated statistics, I/O and CPU bottleneck, parameter sniffing issue, excessive sort operation and so on.

Generally and as per SSMS implementation, the combined waits from EXCHANGE, EXECSYNC and CXPACKET can be used to measure and identify whether parallelism is an issue or not.  

By increasing MAXDOP, if you see that the CPU usages goes high and the number of waiting tasks increases, this generally indicates that there is a parallelism issue. “Avg waiting tasks” in “Activity Monitor” can be used to observe the behavior quickly. Following simple queries are also good to observe the parallel threading behavior

SELECT  SUM(s.runnable_tasks_count)
FROM    sys.dm_os_schedulers s
WHERE   s.[scheduler_id] < 255

SELECT  wait_type ,
        waiting_tasks_count ,
        ( wait_time_ms - signal_wait_time_ms ) AS resource_wait_time_ms
FROM    sys.dm_os_wait_stats
ORDER BY resource_wait_time_ms DESC

You can also use my simple monitoring tool to detect and visualize parallelism issues. Please note that the excerpted scripts from SSMS were used to build this section.

In OLTP systems using MAXDOP, value 1 is recommended by Microsoft and all the industry experts. However some queries will be benefited from higher value if you are not able to tune your queries, unable to create/update appropriate indexes or statistics. If you notice a fewer number of “Worker Threads” then MAXDOP =1 is more suitable and based on the workload it can be increased slowly.  

Degree of Parallelism

Understanding and Controlling Parallel Query Processing in SQL Server

1 comment: