Tuesday, July 16, 2013

0 THREADPOOL and SQL Server threads

Just like most of the Windows software, SQL Server is operating under thread model. This means the SQL Server process would spawn child threads to execute the work.

 

View the SQL Server thread

We can use sysinternal tools pstools or procexp to see the current SQL Server threads.

This is thread tab under the procexp

image

pslist.exe  sqlservr -d

image

Above is the screenshot of the output from pslist, it is about 49 threads under the SQL Server processes. However not all threads are “SQL Server worker threads” which are the actual threads that can work on the user requests.  We can use below query to check the internal sql server threads.

select COUNT(*) from sys.dm_os_workers

image

SQL Server would add the number of the threads when the workload is increasing .

The max number of the worker threads can be configured through “max worker threads” , the max_workers_count is the automatic max value base on the OS and CPU count for the SQL Servers.

In my test env, my current worker threads is 41 ( from dm_os_worker)  and max work threads is 512 . The minimum value from sp_configure does not really mean the minimum of the current value, it just mean the minimum value we can set for it. The current worker threads would increase over time till the max_workers_counts.

sp_configure "max worker threads"
select max_workers_count from sys.dm_os_sys_info;

image

Microsoft guild line for how to configure the max workthread can be found here : http://msdn.microsoft.com/en-us/library/ms190219.aspx

 

THREADPOOL exhaustion

Every user request would be assign to a work thread to carry out the work. If the request is in the pending state or waiting for other resources, the work thread still need to tied to the session until the resource become available. Since there are only limited number of the work threads, if all work threads has been used up, the SQL server would basically hang and can not accept new user connection. We will also see the error message like below:

2013-07-15 21:17:39.59 Server      New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 60 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 0%. System Idle: 75%.

Here are the steps to reproduce the issue:

Create the test env

use master
CREATE DATABASE [THREADPOOL_DEMO] ON  PRIMARY 
( NAME = N'THREADPOOL_DATA', FILENAME = N'C:\DATA\THREADPOOL_DATA.mdf' , SIZE = 4096KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
 LOG ON 
( NAME = N'THREADPOOL_LOG', FILENAME = N'C:\DATA\THREADPOOL_LOG.ldf' , SIZE = 2048KB  , FILEGROWTH = 10%);
 
use [THREADPOOL_DEMO];
create table t1 (c1 int,c2 varchar(50));
insert into  t1 values ( 1,'summer');
insert into  t1 values ( 2,'summer');

 

Run the DML to update the record but keep the transaction open

begin transaction ;
use [THREADPOOL_DEMO];
update t1 set c2='sunny' where c1=2;

Simulate 600 users connection to query the same table

ostress.exe -E -S. -Q"select * from THREADPOOL_DEMO.dbo.t1" -n600

SQL Server needs to hold exclusive lock on the t1 table and select would need to acquire the share lock. But share lock is not compatible with exclusive lock therefore all the query would be block and all the worker threads has been used out. No new connections can be accept to the sql server . The sql server becomes hang .  The existing connection would also hang because once the connection is idle, the worker threads would be assigned to other sessions hangs no worker threads would be available to execute the requests. 

The plists would also show 500+ threads

image

 

Diagnostic  

The only way to get into the SQL Server to do some troubleshooting is via DAC. ( sqlcmd –E –A )

Since SQL Server 2005, SQL Server reserve the dedicate scheduler for DAC .

select * from sys.dm_os_schedulers

image

-- show number of worker threads
select COUNT(*) from sys.dm_os_workers
-- show the THREADPOOL wait time
select * from sys.dm_os_wait_stats where wait_type = 'THREADPOOL'
-- most of the sessions are waiting for LCK_M_S
select * from sys.dm_os_waiting_tasks 

image

identify the head block process

select top 1 session_id,blocking_session_id from sys.dm_os_waiting_tasks where session_id > 50
order by session_id 

image

Find the detail of the processes

select dec.session_id,des.login_name,des.login_time,des.host_name,des.program_name  ,TEXT from sys.dm_exec_connections dec
CROSS APPLY sys.dm_exec_sql_text(most_recent_sql_handle) inner join sys.dm_exec_sessions des on
dec.session_id = des.session_id
where dec.session_id > 50 and dec.session_id = 52 
order by dec.session_id

image

Solution 1 : Kill the blocking spid

We can kill the head blocking spid from the DAC. Once the spid is gone. the other processes can acquire the share locks.

Solution 2 : Increase the max worker threads

“max worker threads”  is RR in 2008R2, so increase it would require SQL server bounce to make it effective. Base on Book on line, it seems to be dynamic in 2012. so it may be useful as the alternative than kill.

 

SOLUTION 3: set the database to READ_COMMITTED_SNAPSHOT

If you already know your application is unavoidable to have long transaction, then turn on the RCSI is the good alternative.  This needs to be done prior the transaction as it also require to obtain the lock on the database.

ALTER DATABASE THREADPOOL_DEMO SET READ_COMMITTED_SNAPSHOT ON

 

Reference

http://msdn.microsoft.com/en-us/library/ms190219.aspx

0 comments:

Post a Comment

 

SQL Panda Copyright © 2011 - |- Template created by O Pregador - |- Powered by Blogger Templates