Home > Grade of the Steel, Oracle, SQL Server > Thread Synchronization in SQL Server

Thread Synchronization in SQL Server

November 9, 2011 Leave a comment Go to comments

Any code optimized for highly concurrent workloads must worry about thread synchronization. SQL Server is no exception, because in a database system, synchronization is one of the core functionalities you rely on the engine to provide (the noSQL crowd may ponder that a bit). In this blog post, I will describe the synchronization primitives, like locks, latches and spinlocks, used by SQL Server to coordinate access to memory regions between threads.

imageBefore I go on, an update: am currently sitting in the train from Stockholm after a hugely successful PASS SQL Rally Nordic conference.

At the conference, I presented my new “Grade of the Steel” deck, which I hope will be available online soon. The deck led to a lot of questions about spinlocks during the party that same evening.

We had Swedish legend E-Type playing and putting on an impressive stage show (including dancing Vikings, as seen in picture to the right). The show left me wondering how to improve my presentations skills to get THAT level of reach to the audience. The conference arrangers had  also done an amazing job at keeping everything running smoothly and I return with many good memories. Thank you guys, you showed the crowd that the Nordic community is very strong.

Well, “here I go again”, being distracted by all the fun. My point was, the questions at the conference led me to write this blog entry, as your (always not so) humble servant. Keep that feedback coming.

Locks and Lock Graphs

Most DBAs and developers are familiar with locking and blocking: Locks in SQL Server are the semaphores that guard the consistency of the database and enforce the high level ACID properties and serialization properties in the database.

You can measure locking activity in several DMV, and there are many articles written on how to diagnose and troubleshoot locking issues. For this blog, suffice to say that the total time the entire engine spends on waiting for locks is measured in sys.dm_os_wait_stats. I would like to make the point that EVERY performance investigation I do on SQL Server stats with querying this view… Learn to use it.

In sys.dm_os_wait_stats, you will find rows that have wait_type beginning with LCK. These are the lock waits, and they have been extensively described elsewhere.

The one thing I would like to add, is that locks have a directed acyclic graph relationship (DAG) with each other and are generally acquired in a certain order. Let us take a hypothetical example of a query locking five rows in a table with two partitions:

image

Before the query can even start, it needs a database (DB) lock to even run, this lock is acquired by the connection itself and held as long as you are connected to the database.

The first locks acquired by the query itself, are the two schema locks (SCH in shared mode S): one of the table itself, and one for the schema the table lives in. Note that the term “schema” is a bit overloaded here: I mean to say that the lock is acquired both on the table (the table object in the database schema) and on the schema (the schema object in the database schema). The SCH locks prevent the table structure itself from being modified while the query runs. The requirement to hold this SCH lock is why, for example a SWITCH operation, will block a person from reading the table while the switch happens.

When SCH locks are acquired, the query must hold a lock on the table. This prevents other conflicting locks, for example a bulk inserter or a greedy, table escalated X-lock, from conflicting with the query.

Traversing onwards from the table level, we have a special lock level: HOBT, representing a partition (HOBT = Heap Or BTree, pronounced: “Hobbit”). Recall that even an unpartitioned table has one partition: the table itself. The HOBT lock is only acquired if the table has been configured to support this lock level. This can be done with this statement:

ALTER TABLE <table> SET (LOCK_ESCALATION = AUTO)

 

… note that the default for tables is to have this lock type disabled and to jump directly to…

Row and page locks: Queries will generally lock the pages it needs access to, though this is mostly a performance optimization used for table scans. Row locks fall into two different categories, RID or Key, depending on whether it is a heap or a B-tree you are locking. Note that you can disable row and page locks by altering the table like this for a B-tree:

ALTER INDEX <index> ON <table>
SET (
   
ALLOW_ROW_LOCKS = OFF
  , ALLOW_PAGE_LOCKS = OFF)

 

There are cases where disabling the row and page lock levels make queries go faster, but it can also have a negative effect on concurrency. I hope to blog about those cases, and how to diagnose them, in a later installment.

I said before that locks are acquired in a the order of the directed acyclic graph by the SQL engine. This is not something you need to worry about, the engine handles this for you. But, can you guess why it matters which order locks get acquired?

(Hint: What would happen if someone started at the top at the same time that someone else started from the bottom of the graph?)

Locks in SQL Server are stored in hash tables in the part of the engine called the lock manager. These hash tables consume some memory (but so little we don’t really care about it, so Oracle people, stop bickering this point) and are optimized for highly concurrent access. The lock manager even partitions locks over multiple NUMA nodes to make accessing lock information faster. The mechanism used to partition locks are similar to the SuperLatches described in: Hot It Works: SQL Server SuperLatch’ing / Sub-latches.

Latches

Latches are a special type of lock that is used inside the SQL Server engine to protect shared data structures. They enforce integrity at the physical layer of the database and will be used automatically by SQL Server, irrespective of the transactional semantics you request.

The wait time for latches shows up in sys.dm_os_wait_stats, there are three broad categories: PAGEIOLATCH, PAGELATCH and LATCH

    The first two are latches used by the database to protect the integrity of the buffer pool. The PAGEIOLATCH variant is used when data is brought in from disk, while the PAGELATCH is used to coordinate things that are already in memory.
    The last latch type, LATCH, is a different beast that is used several places inside the SQL code path to coordinate other memory structures – for example during query execution. The good news is that you can get more details about this latch from the DMV: sys.dm_os_latch_stats. You can think of this view as a way to “zoom in” on latches.

The view contains a latch_class called BUFFER – this is just the aggregate of all the PAGEIOLATCH and PAGELATCH waits, you already have the detail of those, so just ignore it. The rest of the rows in sys.dm_os_latch_stats add up to the LATCH waits.

Spinlocks

Locks and latches have something in common: When a thread in SQL server needs to wait for them, the thread will block and yield the scheduler. In general, it is a good idea to play nice this way and yield the scheduler to another thread if you cannot get any more work done until you acquire access to the structures you want to use. SQL Server also implements user mode scheduling, which takes down the price of a context switch significantly – further boosting scale.

Think about locks and latches with this analogy: You are about to leave a plane, but your luggage is in an overhead compartment in the seats behind you. The best strategy for optimizing throughput, is for you to sit down, wait for the people in the seats behind you to pass and then go open the overhead compartment. This optimizes for the best system throughput and lets you turn off your brain just a little longer.

HOWEVER, sitting down and patiently waiting for other people to get their work done first, is not always the smartest idea. As you leave the plane and move to passport control, you don’t sit down on the floor while you wait for the person in front of you to fumble for his passport in his pocket. The reason you don’t do this, is that you expect him to quickly be done with his fetching; and you remain alert while you wait (and curse). It would be a waste of time and leg muscles to sit down, and the people behind you would get annoyed by you too.

Spinlocks are optimized to protect data structure where you expect the thread (or person) in front of you to get their work done fast! They are typically implemented as busy waiting loops that are optimized for the CPU architecture they run on (we want the loop to be as cheap as possible and we need some low level tricks to get it working). Wikipedia has an excellent article with x86 assembly examples.

SQL Server uses spinlocks in many parts of the code where the assumption of “shortly held locks” can be made, and typically this gives you more throughput and is very efficient.

You can see how many spins are done by querying this view:

SELECT * FROM sys.dm_os_spinlock_stats
ORDER BY backoffs DESC

 

In older version of SQL Server (before 2005) – this view was not available. However, there is another way to get the spinlock information, namely by executing:

DBCC SQLPERF ( spinlockstats )

 

The problem with spinlocks is of course that the assumption about shortly held locks is not always true. Sometimes, the guy in front of you just take too long to get stuff done, and you are stuck with all your attention pinned to him (spinning) and getting more and more annoyed. Interestingly, NUMA systems are typically the machines where this assumption breaks down, because accessing memory on a foreign node can take a significant amount of time. If access to that memory is protected by a spinlock, this means you can burn a lot of CPU in the busy waiting loop and you can get some very interesting side effects.

In general, you should not worry about high values in the spins column of sys.dm_os_spinlock_stats. Spinlocks are after all designed to spin a lot in busy waiting loops. The condition that may cause you to worry that the person in front of you is not clearing the queue fast enough, is when you see high values in the backoffs column. Backoffs are what happen when the thread spinning for a resource gives up and decided to fall asleep. If the backups go up dramatically as you increase throughput, you may have spinning on a memory region that is heavily contended and have to design around it.

There is an excellent article on how to diagnose spinlocks written by SQLCAT members Mike Ruthruff and Ewan Fairweather: Diagnosing and Resolving Spinlock Contention on SQL Server. Among other good stuff, it details how to use Trace Flag 3656 (we all love trace flags) and XEvents (and we love XEvents even more) to troubleshooting spinlock problems (see page 14-16).

Interlocked Operations

I am including this last synchronization primitive for completeness (read on to understand why).

Modern x86/x64 CPUs expose some very interesting instructions that allow you to build what is known as interlock operations. An interlock is a CPU special instruction, or series of CPU instructions, that are guaranteed to perform an atomic operation on an integer. For example, using an interlock you can safely add one to an integer and not have to worry that someone else is trying to add one at the same time. As another example, spinlocks (see above) are typically implemented using the compare and swap operation (CMPXCH in the x86 instruction set) that allow you to compare two values and swap them if they are equal, in a single, atomic instruction.

The reason I am only including this section for completeness, is that you have no way of telling if SQL Server used an interlock to synchronize access to a memory region (though it sometimes does) and no way to influence it. And the reason is that an interlock neither blocks, nor spins the thread – it simply allows the thread to execute as if no other threads were touching the data, because the interlock instruction “knows” that the underlying hardware handles the synchronization. It is the ultimate, lightweight synchronization trick.

You may wonder why SQL Server does not use interlocks everywhere. The big reason is that interlocks are severely limited. They can typically only operate on a a single integer at a time, and only support a small subset of operations on that integer. This is not enough to fully implement all the thread synchronization you need.

If you are interesting in reading more about interlock operations, there is a very nice introduction at this site for computer game programmers: Multithreaded Programming Part 3: Going Lockless. And by the way, I recommend studying the work of computer game programmers – they are the true wizards of computer science.

Related:

Advertisements
  1. May 7, 2013 at 16:49

    Thomas,

    My understanding that the reason for being for a spin lock is that its more efficient for a thread to ‘Spin’ and burn up CPU rather than cause a context switch. However, I also understand that threads should not cause context switches as they relinquish the scheduler which SQLOS uses to abstract away the CPU architecture. Therefore, given this, within the context of SQL Server, what level in the stack do the spin lock spins take place at ?

    Chris

  2. Thomas Kejser
    May 8, 2013 at 08:40

    Hi Chris

    Correct, spins “burn” the CPU through spinning on a memory region. There are a few different ways this can be done, but the basic idea is the same: You are in some form of busy waiting loop.

    You are correct that a “context switch” in SQL Server is not the usual “empty the stack and registers and load a new thread context” operation of the OS. It is MUCH cheaper than that. A thread that is blocked can simply go and do something else until the blocking condition is gone (compare this with switching to another thread if you use an OS level context switch). One of the reasons the original User Mode Scheduler was created was to bring down cost of thread coordination like this. However, a SQLOS context switch is still not free: You will blast away L2 cache lines and your register will also have to be reinitialised. Because of this, simply spinning can often be more efficient.

    Not sure I understand your question about the level of the stack.

  3. May 8, 2013 at 09:03

    My question regarding the level of the stack was regarding whether the thread yield takes place at the SQLOS scheduler or below this, I now know this is at the scheduler level. Paul Randal has a good post on the very subject here http://www.sqlskills.com/blogs/paul/sos_scheduler_yield-waits-and-the-lock_hash-spinlock/

    • Thomas Kejser
      May 8, 2013 at 09:07

      The yield is at the SQLOS level. But if a thread needs to “park” due to no more work, we are back to good old Windows signalling. But in that situation, no need to worry about scale.

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s