Self-join incurs more I/O activities and increases locking overhead!

Wednesday, July 3, 2013

Self-join incurs more I/O activities and increases locking overhead!

The Self-Join technique is commonly used to create a hierarchical tree set or finding duplicate records. It is also used to find previous or next values from a table by utilizing the inline table technique. Generally a self-join is a join in which a table is joined with itself. For example, when we need details about an employee and his manager where both employee and manager ID resides in the same table.

When we use self-join, it usually indicates that the table is not normalized. This may not be a problem for a small table but for a large and busy OLTP table with higher concurrency, this may lead to poor performance and degrade query response time.

The key point is that each self-joined table will be treated as a separate table while retrieving data from the disk, thus self-join incurs more I/O activities and increases locking overhead as it touches the same table twice or more. By adding appropriate indexes on JOIN and WHERE clause columns, it will reduce I/O activities and improve performance. It will be a good approach to avoid self-join whenever possible for heavy duty OLTP tables and queries.

Let’s perform a simple self-join test to see how the execution plan looks like. We can observe that as the retrieval number of records increases, each query behaves differently, such as requiring specific index and introducing parallel query.

Queries:

-- self-join one

select top (10) a.xid, a.sIdentifier

from tbllarge a

inner join tbllarge b on a.xid = b.xid

where a.sIdentifier ='A1'

-- self-join two

select top (100) a.xid, a.sIdentifier

from tbllarge a

inner join tbllarge b on a.xid = b.xid

where a.sIdentifier ='A1'

-- self-join three

select top (1000) a.xid, a.sIdentifier

from tbllarge a

inner join tbllarge b on a.xid = b.xid

where a.sIdentifier ='A1'

Execution Plan:

2 comments:

AnonymousAugust 2, 2013 at 11:44 AM
Short and Informative..Excellent Job...!!!!!!
ReplyDelete
Replies
AnonymousSeptember 14, 2013 at 4:34 PM
Very interesting, haven't thought much of the self join from a performance perspective, so good insight! A nice reference here also: sql self join
ReplyDelete
Replies

SQL Server – performance and other stories

Total Pageviews

Wednesday, July 3, 2013

Self-join incurs more I/O activities and increases locking overhead!

2 comments: