SQL Server – performance and other stories: June 2013

Sunday, June 23, 2013

SQL Server Port - How to open in Windows Firewall

In the IT security world, it is often said that approximately 80% of all security breaches are caused by insiders, directly or indirectly, willingly or unwillingly. Here our focus is the local Windows Firewall and whether it is correctly configured. It is essential to make sure that every Windows server and SQL Server instance is secured and protected.

Although SQL Server resides inside a DMZ, it is still important to secure the SQL box against anything unforeseen, and there is no reason to disable Windows Firewall. Opening only the few TCP or UDP ports that SQL Server needs, while leaving Windows Firewall enabled, keeps the SQL box far better protected than turning the firewall off.

In the following steps, we will configure Windows Firewall on Windows Server 2012 to allow inbound traffic on the default SQL Server 2012 port, TCP 1433.

Step by step guide:

1. In “Server Manager” select “Local Server”. (figure # 1)

2. From the “Tools” menu select “Windows Firewall with Advanced Security”.

3. Select “Inbound Rules” on the left-hand side. From the Action pane select “New Rule”. (figure # 2)

4. On the rule type dialog box, select “Port” and click “Next”. (figure # 3)

5. Select “TCP” and enter 1433 in the “Specific Local Ports” box. (figure # 4)

6. On the next dialog box, select “Allow the connection”. (figure # 5)

7. On the profile dialog box, select all three options: “Domain”, “Private” and “Public”. (figure # 6)

8. On the last dialog box provide a name, e.g. “SQL Server Database Engine”, and a description. Then click the “Finish” button. (figure # 7)

9. In the “Windows Firewall with Advanced Security” window, select the new rule and click “Enable Rule”. (figure # 8)

Figure # 1: Windows Firewall configuration for SQL Server Port 1433

Figure # 2: Windows Firewall configuration for SQL Server Port 1433

Figure # 3: Windows Firewall configuration for SQL Server Port 1433

Figure # 4: Windows Firewall configuration for SQL Server Port 1433

Figure # 5: Windows Firewall configuration for SQL Server Port 1433

Figure # 6: Windows Firewall configuration for SQL Server Port 1433

Figure # 7: Windows Firewall configuration for SQL Server Port 1433

Figure # 8: Windows Firewall configuration for SQL Server Port 1433
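Before opening port 1433 it is worth confirming that the instance really listens on that port, since named instances and custom configurations often use a different one. The query below is a minimal sketch, assuming you are connected to the instance over TCP and have VIEW SERVER STATE permission; it reports the transport and port of your own connection.

```sql
-- Show the transport, local address and TCP port used by the current connection.
-- local_tcp_port is NULL when the connection uses a non-TCP transport
-- such as shared memory, so connect over TCP (e.g. tcp:servername) first.
SELECT  c.session_id ,
        c.net_transport ,
        c.local_net_address ,
        c.local_tcp_port
FROM    sys.dm_exec_connections AS c
WHERE   c.session_id = @@SPID;
```

If the reported port is not 1433, that is the port to enter in step 5 instead.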

Sunday, June 16, 2013

Execution Plan and “SQL Sentry Plan Explorer” – it's like a hand of god!

Numerous articles have been written on how out-dated statistics cause severe performance issues, and I am hesitant to add one more to that long list. However, this time I’ll be using “SQL Sentry Plan Explorer” to present the issue. Please note that I am not a sales agent or representative of “SQL Sentry”; I simply want to share its simple interface and usability. The tool is free and extremely powerful, and it offers everything a SQL Server expert needs to analyze an execution plan, especially those who work on query performance tuning.

Some time ago, I received a request to help with a query performance issue. The duration of a business-critical query had been degrading, and it was now taking up to 5 minutes to complete!

I asked them to send me the XML output of the actual execution plan for that query. I opened the received query plan in both “SQL Sentry Plan Explorer” and “SSMS”, and found the root cause almost immediately: a cardinality estimation issue due to “out-dated statistics”. So the ultimate recommendation was to update statistics. After updating, the query took less than 4 seconds.
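For readers who want to check for the same problem, the following is a minimal sketch of how statistics age can be inspected and refreshed. The table name dbo.Orders is hypothetical and stands in for whichever table the slow query reads; it is not the table from the case described above.

```sql
-- dbo.Orders is a hypothetical table name used only for illustration.
-- List every statistics object on the table with the date it was last updated.
SELECT  s.name AS stats_name ,
        STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM    sys.stats AS s
WHERE   s.object_id = OBJECT_ID(N'dbo.Orders')
ORDER BY last_updated;

-- Refresh all statistics on the table. FULLSCAN reads every row, which gives
-- the most accurate histogram but can be expensive on a large table.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
```

A large gap between estimated and actual row counts in the actual execution plan, together with an old last_updated date, is the usual hint that statistics are stale.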

Here, I am sharing the outputs from “SQL Sentry Plan Explorer” to show you how intuitive the tool is to understand an execution plan and identify any plan quality problems immediately. Following are a couple of screenshots you may find interesting.

Visit: SQL Sentry Plan Explorer

http://www.sqlsentry.com/plan-explorer/sql-server-query-view.asp

Some Screenshots:

Thursday, June 13, 2013

Performance issues from ORDER BY/GROUP BY - spills in tempdb

It is very common, and expected, to see a query containing an ORDER BY or GROUP BY clause for display or grouping purposes. It is also common for developers to use an ORDER BY clause out of habit without considering whether it is necessary. As a result, queries become slower over time as the number of records increases.

The Last Supper - by Leonardo da Vinci
Grouping example in Art

When a sort operation cannot acquire a sufficient memory grant, it cannot be completed in memory and must spill to tempdb. The heavier processing load inside tempdb degrades overall SQL Server performance significantly. This situation is usually known as “spill to tempdb” or “spills in tempdb”. It is crucial to identify these sort warnings and avoid them whenever possible.

In my experience, I have seen ORDER BY/GROUP BY used on a VARCHAR(8000) column while retrieving data, and even unwisely used in a JOIN clause! Tweaking these queries is tricky, and most of the time it is impossible because the front-end application or business logic has already been built around this criterion. Creating an index on such a column is not practical due to the 900-byte limit on index key size. So, other than crossing fingers, there is not much that can be done to resolve the performance issue immediately.

Common Issues:

Following are some common issues that occur due to the misuse of ORDER BY/GROUP BY clause:

1. Rapid tempdb data file growth.

2. Increases disk I/O activities on tempdb and tempdb drive.

3. Introduces lock contention and escalation.

4. Increases memory grant for sort/hash operation.

5. Introduces parallel query plan.

Detecting the Issue:

Detecting performance issues that arise from sort operations is quite simple and straightforward. Following are some tips to identify issues:

1. Review the query and identify columns that are used in ORDER BY/GROUP BY clauses.

2. Review the execution plan and identify “sort” operators.

3. Identify parallelism operators that perform “distribute streams”, “gather streams” and “repartition streams” in a parallel execution plan.

4. Use the SQL Profiler Trace event “sort warnings”.

5. Use the Extended Event “sort_warning”.

6. Use PerfMon or sys.dm_os_performance_counters to track “worktables created/sec” and “workfiles created/sec”.
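The last tip can be sketched in T-SQL as follows. The values exposed by sys.dm_os_performance_counters for these two counters are cumulative, so the sketch takes two samples and divides the difference by the interval. The 10-second interval is an arbitrary choice for illustration, and the counter names are written as they appear under the Access Methods object.

```sql
-- First sample of the two cumulative counters.
DECLARE @before TABLE
    (
      counter_name NVARCHAR(128) ,
      cntr_value BIGINT
    );

INSERT  INTO @before
        SELECT  RTRIM(counter_name) ,
                cntr_value
        FROM    sys.dm_os_performance_counters
        WHERE   object_name LIKE N'%Access Methods%'
                AND counter_name IN ( N'Worktables Created/sec',
                                      N'Workfiles Created/sec' );

-- Sample interval (illustrative); keep it in sync with the divisor below.
WAITFOR DELAY '00:00:10';

-- Second sample: the difference divided by the interval is the per-second rate.
SELECT  b.counter_name ,
        ( p.cntr_value - b.cntr_value ) / 10.0 AS created_per_sec
FROM    sys.dm_os_performance_counters AS p
        JOIN @before AS b ON b.counter_name = RTRIM(p.counter_name)
WHERE   p.object_name LIKE N'%Access Methods%';
```

Running the ORDER BY/GROUP BY workload during the wait makes any jump in worktable or workfile creation easy to see.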

To resolve the performance Issue:

To resolve performance issues that occur from a sort operation, a couple of actions can be taken as follows:

1. Review the necessity of a sort operation in the query.

2. Try to perform a sort operation in the front-end.

3. Normalize the database schema.

4. Create single or multi-column indexes.

5. Apply filters on indexes.

6. Use TOP (n) when there is an “ORDER BY”, if possible.

7. Put more filters in the query to touch less data.

8. Update distribution statistics.
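A few of these actions can be sketched against the tblLarge test table that is created in the next section. This is only an illustration: item 5 is read here as a filtered index, and the index name, the date predicate and the TOP value are all made up for the example.

```sql
-- Item 6: with TOP (n) the sort only needs to keep the first n rows,
-- which typically requires a much smaller memory grant than a full sort.
SELECT TOP ( 100 )
        xID ,
        sName1
FROM    tblLarge
ORDER BY sName1;

-- Item 5 (interpreted as a filtered index; name and predicate are illustrative).
-- The index only contains rows matching the WHERE clause, so it stays small.
CREATE NONCLUSTERED INDEX IX_dDOB_Recent
ON tblLarge ( dDOB )
WHERE dDOB >= '20000101';

-- Item 8: refresh the distribution statistics on the table.
UPDATE STATISTICS tblLarge;
```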

Observing the behavior:

To observe the common issues with ORDER BY/GROUP BY operations, let’s create a database, table and a simple select statement against 500,000 records.

CREATE DATABASE testDB
GO

USE testDB
GO

SET NOCOUNT ON

IF OBJECT_ID('tblLarge') IS NOT NULL
    DROP TABLE tblLarge

CREATE TABLE tblLarge
    (
      xID INT IDENTITY(1, 1) ,
      sName1 VARCHAR(100) ,
      sName2 VARCHAR(1000) ,
      sName3 VARCHAR(400) ,
      sIdentifier CHAR(100) ,
      dDOB DATETIME NULL ,
      nWage NUMERIC(20, 2) ,
      sLicense VARCHAR(25)
    )
GO

/*********************************
Add 500000 records
**********************************/
-- GO 500000 repeats only the batch that precedes it, so the INSERT
-- must be in its own batch, separate from the CREATE TABLE above.
SET NOCOUNT ON

INSERT  INTO tblLarge
        ( sName1 ,
          sName2 ,
          sName3 ,
          sIdentifier ,
          dDOB ,
          nWage ,
          sLicense
        )
VALUES  ( LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 50) , -- sName1
          LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 60) , -- sName2
          LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 70) , -- sName3
          LEFT(CAST(NEWID() AS VARCHAR(36)), 2) , -- sIdentifier
          DATEADD(dd, -RAND() * 20000, GETDATE()) , -- dDOB
          ( RAND() * 1000 ) , -- nWage
          SUBSTRING(CAST(NEWID() AS VARCHAR(36)), 6, 7) -- sLicense
        )
GO 500000

/******************************************************
** Create a clustered index
******************************************************/
ALTER TABLE [tblLarge]
ADD CONSTRAINT [PK_tblLarge]
PRIMARY KEY CLUSTERED ([xID] ASC)
GO

/***************************************************************
** To resolve the sort warning, create a non-clustered index.
** Run this only after reproducing the warning with the select
** statements below; once the index exists, the ORDER BY on
** sName1 can be satisfied without a sort.
***************************************************************/
CREATE NONCLUSTERED INDEX [IX_sName1]
ON [tblLarge] ([sName1] ASC)
GO

Simple SELECT Statement:

Following are some simple select statements to reproduce the behavior.

/******************************************************
** Simple select statement
******************************************************/

-- First query
SELECT  xID ,
        sName1
FROM    tblLarge

-- Second query with - ORDER BY
SELECT  xID ,
        sName1
FROM    tblLarge
ORDER BY sName1

-- Third query - GROUP BY/ORDER BY
SELECT  sName1 ,
        COUNT(sName1) AS nCount
FROM    tblLarge a
GROUP BY sName1
ORDER BY sName1

Using Extended Events (SQL 2012), SQL Profiler Trace and the execution plan, sort warnings are easily detectable. Following are some outputs.

Figure#1: sort warning using Extended Events in SQL 2012

Figure#2A: sort warning detection using Execution Plan

Figure#2B: sort warning detection using Execution Plan

Figure#2: sort warning detection using SQL Profiler Trace
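For reference, an Extended Events session along the following lines can capture the sort_warning event on SQL Server 2012. This is a minimal sketch rather than the exact session behind the screenshots: the session name SortWarnings, the collected actions and the ring_buffer target are choices made for the example.

```sql
-- Session name is illustrative.
CREATE EVENT SESSION [SortWarnings] ON SERVER
ADD EVENT sqlserver.sort_warning
    ( ACTION ( sqlserver.session_id, sqlserver.sql_text ) )
ADD TARGET package0.ring_buffer
WITH ( STARTUP_STATE = OFF );
GO

ALTER EVENT SESSION [SortWarnings] ON SERVER STATE = START;
GO

-- Run the ORDER BY/GROUP BY queries in another window,
-- then read the captured events as XML from the ring buffer.
SELECT  CAST(t.target_data AS XML) AS captured_events
FROM    sys.dm_xe_sessions AS s
        JOIN sys.dm_xe_session_targets AS t
            ON t.event_session_address = s.address
WHERE   s.name = N'SortWarnings'
        AND t.target_name = N'ring_buffer';
GO

-- Clean up when finished.
ALTER EVENT SESSION [SortWarnings] ON SERVER STATE = STOP;
DROP EVENT SESSION [SortWarnings] ON SERVER;
GO
```

The ring_buffer target keeps events in memory only, which suits a short test like this one; a file target would be a better fit for longer monitoring.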

Learn More:

Understanding data vs. log usage for spills in tempdb

http://www.sqlskills.com/blogs/paul/understanding-data-vs-log-usage-for-spills-in-tempdb/

Query Memory Spills

http://www.sqlpassion.at/archive/2011/10/19/query-memory-spills/

Identifying and Solving Sort Warnings Problems in SQL Server

http://www.solidq.com/sqj/Pages/Relational/Identifying-and-Solving-Sort-Warnings-Problems-in-SQL-Server.aspx

Tuesday, June 11, 2013

MAXDOP - What triggers Parallelism?

Charles Darwin was an English naturalist. He established that all species of life have descended over time from common ancestors and proposed the scientific theory that this branching pattern of evolution resulted from a process that he called natural selection. I am sure that everybody knows this theory and it is well accepted.

How does the above analogy work with the SQL Server Query Optimizer? The Query Optimizer follows Charles Darwin's theory when it creates an execution plan before executing a SQL query. It is a cost-based optimizer, which means that before creating an execution plan it weighs a number of facts and factors to produce a trivial or “good enough” plan. The following short list identifies them:

1. Construction of the query.

2. Number of records, data length, and size of the table.

3. Appropriate indexes and up-to-date statistics.

4. I/O, CPU and Memory.

MAXDOP in OLTP:

In an OLTP environment, all queries and transactions are expected to be efficient and to finish quickly. When the optimizer estimates that the serial plan for a query costs more than the “cost threshold for parallelism” setting (5 by default), SQL Server may take advantage of parallelism, subject to the MAXDOP setting.

There are a vast number of considerations, recommendations and concerns about what the MAXDOP setting should be in an OLTP environment. In an OLTP implementation, it is expected that all queries have been written with performance in mind while adhering to best practices. But in the real world this is not the case. Some queries are poorly written or perform poorly because of a lack of appropriate indexes, out-dated statistics, memory pressure, CPU bottlenecks, slow I/O response, and so on.

How does MAXDOP work?

MAXDOP is the maximum number of processors that SQL Server can use to execute a single statement in a parallel plan, which in turn limits how many worker threads run each parallel operator at the same time. MAXDOP = 0 (zero) means that SQL Server is free to use as many of the available processors as it decides the query needs, based on a set of predefined rules built into the engine.

Besides the server wide settings, Query hint (OPTION (MAXDOP n)) can be used to control parallel execution of a query. “Cost threshold for parallelism” is another server wide setting that can be utilized to control parallelism behavior.
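The settings mentioned above can be inspected and changed as sketched below. The values 4 and 25 are placeholders for illustration, not recommendations, and the final query assumes the tblLarge test table from the ORDER BY/GROUP BY post on this page.

```sql
-- Inspect the two server-wide settings.
SELECT  name ,
        value ,
        value_in_use
FROM    sys.configurations
WHERE   name IN ( N'max degree of parallelism',
                  N'cost threshold for parallelism' );

-- Both are advanced options, so they must be made visible before changing them.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'max degree of parallelism', 4;        -- illustrative value
EXEC sp_configure 'cost threshold for parallelism', 25;  -- illustrative value
RECONFIGURE;

-- Per-query override: force a serial plan for this statement only.
SELECT  sName1 ,
        COUNT(sName1) AS nCount
FROM    tblLarge
GROUP BY sName1
ORDER BY sName1
OPTION  ( MAXDOP 1 );
```

The query hint takes precedence over the server-wide MAXDOP value for that statement, which makes it a low-risk way to test the effect of a different setting on one query.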

"At execution time, if parallel execution is warranted, the Database Engine determines the optimal number of threads and spreads the execution of the parallel plan across those threads in its each execution. When a query or index operation starts executing on multiple threads for parallel execution, the same number of threads is used until the operation is completed. The Database Engine re-examines the optimal number of thread decisions every time an execution plan is retrieved from the plan cache. For example, one execution of a query can result in the use of a serial plan, a later execution of the same query can result in a parallel plan using three threads, and a third execution can result in a parallel plan using four threads".

What triggers Parallelism?

There are a few specific reasons why SQL Server chooses parallel execution of a query in an OLTP system to speed up data retrieval. The following are some of the key factors:

1. The query contains an ORDER BY or GROUP BY clause, which means an expensive sort operation when there are no appropriate indexes to support it.

2. Skewed data, meaning a column contains a substantial number of duplicate values.

3. The memory grant is insufficient to execute the query. Sort operations require extra memory, and when the grant is too small they cause a “spill to tempdb”.

4. Out-of-date distribution statistics.

5. Processing a huge number of records.

Symptoms and detecting Parallelism Issue:

Usually the CXPACKET wait type can be used to monitor parallel query execution behavior in OLTP systems. But keep in mind that CXPACKET does not necessarily indicate that parallelism is an issue. This wait means that the parent thread is waiting to synchronize all output from the child threads. However, if you see SQL blocking on CXPACKET, it indicates that SQL Server is struggling with an underlying problem such as a lack of indexes, out-dated statistics, an I/O or CPU bottleneck, parameter sniffing, excessive sort operations and so on.

Generally and as per SSMS implementation, the combined waits from EXCHANGE, EXECSYNC and CXPACKET can be used to measure and identify whether parallelism is an issue or not.

If, after increasing MAXDOP, you see that CPU usage goes up and the number of waiting tasks increases, this generally indicates that there is a parallelism issue. “Avg waiting tasks” in “Activity Monitor” can be used to observe the behavior quickly. The following simple queries are also good for observing the parallel threading behavior:

SELECT  SUM(s.runnable_tasks_count)
FROM    sys.dm_os_schedulers s
WHERE   s.[scheduler_id] < 255

SELECT  wait_type ,
        waiting_tasks_count ,
        ( wait_time_ms - signal_wait_time_ms ) AS resource_wait_time_ms
FROM    sys.dm_os_wait_stats
WHERE   wait_type IN ( 'EXCHANGE', 'EXECSYNC', 'CXPACKET' )
ORDER BY resource_wait_time_ms DESC
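The wait statistics above are cumulative for the whole instance. To see which requests are running in parallel right now, something like the following can be used. It is a sketch that counts the tasks attached to each active request, on the assumption that a request with more than one task is executing a parallel plan.

```sql
-- Active requests that currently have more than one task (i.e. are running in parallel).
SELECT  r.session_id ,
        r.request_id ,
        r.status ,
        r.wait_type ,
        COUNT(t.task_address) AS task_count
FROM    sys.dm_exec_requests AS r
        JOIN sys.dm_os_tasks AS t
            ON  t.session_id = r.session_id
            AND t.request_id = r.request_id
WHERE   r.session_id <> @@SPID      -- ignore this monitoring query itself
GROUP BY r.session_id ,
        r.request_id ,
        r.status ,
        r.wait_type
HAVING  COUNT(t.task_address) > 1
ORDER BY task_count DESC;
```

A session that repeatedly shows a high task_count together with a CXPACKET wait is a good candidate for a closer look at its execution plan.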

You can also use my simple monitoring tool to detect and visualize parallelism issues. Please note that scripts excerpted from SSMS were used to build this section.

Recommendations:

For OLTP systems, a MAXDOP value of 1 is a widely cited recommendation. However, some queries will benefit from a higher value if you are not able to tune your queries or to create or update appropriate indexes and statistics. If you notice that few “Worker Threads” are available, then MAXDOP = 1 is more suitable, and it can be increased slowly based on the workload.

Reference:

Degree of Parallelism
http://msdn.microsoft.com/en-us/library/ms188611(v=sql.105).aspx

Understanding and Controlling Parallel Query Processing in SQL Server

http://msdn.microsoft.com/en-us/library/gg415714.aspx
