SQL Server deadlock on the same table

sql-server tsql deadlock

10,232

Solution 1

After a little bit more searching and testing I am pretty confident I can give the correct answer to my own question.

I have to thank Martin Smith who put me in the right direction by pointing out that the wait resources were different.

As Martin wrote in his comment, the wait resources are 11:290100074:0 and 11:290100074:5. After searching for this, it turns out that if you run SQL Server R2 on a machine with 16 or more CPUs, SQL Server is able to use a feature called lock partitioning.
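The numbers in an OBJECT wait resource such as 11:290100074:5 are the database id, the object id and the lock partition id. A minimal sketch for decoding them, using the ids from the deadlock log in the question (the variable names and column aliases are only illustrative):

```sql
-- Ids copied from "waitresource=OBJECT: 11:290100074:5" in the deadlock log.
DECLARE @db_id int = 11;
DECLARE @object_id int = 290100074;

-- Resolve the ids to names.
SELECT
    DB_NAME(@db_id)                        AS database_name,
    OBJECT_SCHEMA_NAME(@object_id, @db_id) AS table_schema,
    OBJECT_NAME(@object_id, @db_id)        AS table_name;

-- Lock partitioning depends on how many CPUs SQL Server sees.
SELECT cpu_count
FROM sys.dm_os_sys_info;
```

The third number is the part that differs between the two wait resources: partition 0 for the readers and partition 5 for the insert.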

This article says among other things:

Only NL, SCH-S, IS, IU, and IX lock modes are acquired on a single partition.

What happens in my case is that spid 155 takes a shared lock on a row or page and therefore puts an intent shared (IS) lock on the object. With the lock partitioning feature, this IS lock happens to be on partition id 5.

At the same time, spid 124 needs to lock the full object with an exclusive lock and therefore needs to put an X lock on all partitions.

Shared (S), exclusive (X), and other locks in modes other than NL, SCH-S, IS, IU, and IX must be acquired on all partitions starting with partition ID 0 and following in partition ID order.

When spid 124 arrives at partition id 5, it is told that spid 155 holds an IS lock there, and it needs to wait until that lock is released.

Now, while spid 124 is waiting for the IS lock to be released, lock escalation occurs on spid 155 and it requests a shared lock on the table. This means it needs to put an S lock on all partitions, starting at id 0. But on id 0 it immediately hits the wall, because spid 124 already holds an exclusive lock on that partition. And there you have the cause of the deadlock.
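To watch this happen while the sessions are blocked, the object-level locks can be listed per lock partition. This is a sketch that assumes the database and object ids from the deadlock log and permission to read the dynamic management views:

```sql
-- Object-level locks on one table, one row per session and lock partition.
-- The ids are the ones from the deadlock log; adjust them for another table.
SELECT
    request_session_id,        -- the spid
    resource_lock_partition,   -- 0 when the lock is not partitioned
    request_mode,              -- IS, IX, S, X, ...
    request_status             -- GRANT, WAIT or CONVERT
FROM sys.dm_tran_locks
WHERE resource_type = N'OBJECT'
  AND resource_database_id = 11
  AND resource_associated_entity_id = 290100074
ORDER BY resource_lock_partition, request_session_id;
```

In the scenario described above, one would expect to see the X request granted on the low partition ids and waiting on partition 5, where the IS lock is held.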

I cannot guarantee that this is exactly what happens, but I am pretty sure it is at least close to the answer.

The solution? Well, the lock partitioning feature cannot be turned off, but you can control lock escalation with different transaction isolation levels and with different options in the ALTER TABLE statement.
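As a sketch of the ALTER TABLE route, assuming SQL Server 2008 or later and the renamed table dbo.OurTable from the question:

```sql
-- Show the current escalation setting (TABLE is the default).
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE object_id = OBJECT_ID(N'dbo.OurTable');

-- Stop escalation to a table lock in most cases.
ALTER TABLE dbo.OurTable SET (LOCK_ESCALATION = DISABLE);

-- Revert to the default behaviour if the change does not help.
ALTER TABLE dbo.OurTable SET (LOCK_ESCALATION = TABLE);
```

Disabling escalation trades one risk for another: the session keeps its row or page locks instead of a single table lock, so lock memory use can grow on large scans. Treat it as something to test, not a guaranteed fix.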

I will continue to investigate why the query forces lock escalation because I believe the solution in my particular case is to tune the query somehow to not escalate. At least I will try this before using the tools mentioned above.
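One way to check whether escalation really happens on the table is to look at the lock promotion counters. This is a sketch that assumes it is run in the database that contains dbo.OurTable:

```sql
-- Lock escalation attempts and successes per index of one table.
-- The counters are cumulative, so compare two snapshots taken some time apart.
SELECT
    i.name AS index_name,
    s.index_lock_promotion_attempt_count,  -- escalation attempts
    s.index_lock_promotion_count           -- escalations that succeeded
FROM sys.dm_db_index_operational_stats(
         DB_ID(), OBJECT_ID(N'dbo.OurTable'), NULL, NULL) AS s
JOIN sys.indexes AS i
    ON i.object_id = s.object_id
   AND i.index_id  = s.index_id;
```

If the promotion count rises while the select from the deadlock log is running, that points at the query (or a missing index) reading enough rows to trigger escalation.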

Hope this answer helps others with similar problems.

Solution 2

It is not always true that "usual deadlock is when two or more sessions hold locks on different resources and wait for each other": there are also conversion deadlocks. Even if two processes compete for only one resource, they can still embrace in a conversion deadlock, which I described here.

Also, although the best known deadlock scenario involves two connections modifying two tables in different order, there are also other deadlock scenarios involving only one table. Besides, in some scenarios each connection needs to issue only one statement, and it is enough to get a deadlock. Also in some scenarios only one connection needs to modify or acquire exclusive locks – the other one may only read data and only acquire shared locks and still embrace in a deadlock.

One more thing: answering this "none of the queries runs in a transaction" comment - every DML statement always runs in a transaction, and DML means selects too. All the commands involved in your deadlock run in the context of a transaction. Follow the second link, and run the repro scripts - you will see for yourself.
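The deadlock log in the question already hints at this: the entry for spid 155 shows trancount=0 but also transactionname=SELECT and its own xactid. A small sketch to see the same thing from a session, assuming permission to read the dynamic management views:

```sql
-- Run this batch twice, without any BEGIN TRANSACTION.
-- @@TRANCOUNT counts explicit (user) transactions only, while
-- transaction_id identifies the transaction the statement itself runs in.
SELECT
    @@TRANCOUNT AS open_user_transactions,
    transaction_id
FROM sys.dm_tran_current_transaction;
```

In autocommit mode the first column should stay at 0, while the transaction id should differ between the two runs, because each statement gets its own short transaction.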

Anyway, I would just run the select under snapshot isolation - that would prevent this particular deadlock (when one connection only reads) from happening.
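A minimal sketch of that suggestion. OurDatabase is a placeholder name, and the column list is only illustrative:

```sql
-- One-time setup per database; OurDatabase is a hypothetical name.
ALTER DATABASE OurDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In the reading session, before the select:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

-- The reader now works from row versions instead of taking shared locks,
-- so it holds no IS lock on the table that an X request would wait for.
SELECT Col1, Col2
FROM dbo.OurTable;
```

Row versions are kept in tempdb, so it is worth watching tempdb usage after enabling this on a busy database.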

Author by

John

Updated on July 20, 2022

Comments

John almost 2 years

We have problems with deadlock situations in our application. I have read a lot about blocking, locking and deadlocks the last few days to try to get an understanding about the problem in order to solve it.

Now when I read the error log information about the deadlocks I can't understand how this situation can exist. Look at this (I have renamed the table names but the important one is the one called OurTable in the log message):

deadlock-list
deadlock victim=process1e2ac02c8
process-list
    process id=process1e2ac02c8 taskpriority=0 logused=0 waitresource=OBJECT: 11:290100074:0  waittime=704 ownerId=3144354890 transactionname=SELECT lasttranstarted=2011-12-01T14:43:20.577 XDES=0x80017920 lockMode=S schedulerid=6 kpid=7508 status=suspended spid=155 sbid=0 ecid=0 priority=0 trancount=0 lastbatchstarted=2011-12-01T14:43:20.577 lastbatchcompleted=2011-12-01T14:43:20.577 clientapp=.Net SqlClient Data Provider hostname=DE-1809 hostpid=5856 loginname=2Ezy isolationlevel=read committed (2) xactid=3144354890 currentdb=11 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056
     executionStack
      frame procname=.dbo.RetrieveSomething line=23 stmtstart=1398 stmtend=3724 sqlhandle=0x03000b0030d42d645a63e6006a9f00000100000000000000
         select
            Col1
            ,Col2
            ,(
                SELECT TOP(1)
                    Col1
                FROM
                    OurTable2 AS C
                        JOIN OurTable AS ETC ON C.Id = ETC.FKId
                            AND E.Id = C.FKId
                ORDER BY ETC.Col2
            ) AS Col3
        from OurTable3 AS E
    process id=process2df4894c8 taskpriority=0 logused=0 waitresource=OBJECT: 11:290100074:0  waittime=9713 ownerId=3144330250 transactionname=INSERT EXEC lasttranstarted=2011-12-01T14:43:11.573 XDES=0x370764930 lockMode=S schedulerid=13 kpid=4408 status=suspended spid=153 sbid=0 ecid=0 priority=0 trancount=1 lastbatchstarted=2011-12-01T14:43:11.573 lastbatchcompleted=2011-12-01T14:43:11.573 clientapp=.Net SqlClient Data Provider hostname=DE-1809 hostpid=5856 loginname=2Ezy isolationlevel=read committed (2) xactid=3144330250 currentdb=11 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056
     executionStack
      frame procname=adhoc line=1 sqlhandle=0x02000000ba6cb42612240bdb19f7303e279a714276c04344
         select
            Col1
            , Col2
            , Col3
            , ISNULL(
                (select top(1)
                    E_SUB.Col1 + ' ' + E_SUB.Col2
                    from OurTable3 as E_SUB 
                        inner join OurTable2 as C on E_SUB.Id = C.FKId
                        inner join OurTable as ETC on C.Id = ETC.FKId
                as Col3
        from OurTable4
            inner join dbo.OurTable as ETC on Id = ETC.FKId  
    process id=process8674c8 taskpriority=0 logused=0 waitresource=OBJECT: 11:290100074:5  waittime=338 ownerId=3143936820 transactionname=INSERT lasttranstarted=2011-12-01T14:38:24.423 XDES=0x1ecd229f0 lockMode=X schedulerid=7 kpid=12092 status=suspended spid=124 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2011-12-01T14:38:23.027 lastbatchcompleted=2011-12-01T14:38:23.013 clientapp=.Net SqlClient Data Provider hostname=DE-1809 hostpid=5856 loginname=2Ezy isolationlevel=read committed (2) xactid=3143936820 currentdb=11 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056
     executionStack
      frame procname=.dbo.UpsertSomething line=332 stmtstart=27712 stmtend=31692 sqlhandle=0x03000b00bbf2a93c0f63a700759f00000100000000000000
            insert into dbo.OurTable
            (
                Col1
                ,Col2
                ,Col3
            )
            values
            (
                @Col1
                ,@Col2
                ,@Col3
            )
       resource-list
        objectlock lockPartition=0 objid=290100074 subresource=FULL dbid=11 objectname=dbo.OurTable id=lock16a1fde80 mode=X associatedObjectId=290100074
         owner-list
         waiter-list
          waiter id=process1e2ac02c8 mode=S requestType=wait
        objectlock lockPartition=0 objid=290100074 subresource=FULL dbid=11 objectname=dbo.OurTable id=lock16a1fde80 mode=X associatedObjectId=290100074
         owner-list
          owner id=process8674c8 mode=X
         waiter-list
          waiter id=process2df4894c8 mode=S requestType=wait
        objectlock lockPartition=5 objid=290100074 subresource=FULL dbid=11 objectname=dbo.OurTable id=lock212f0f300 mode=IS associatedObjectId=290100074
         owner-list
          owner id=process1e2ac02c8 mode=IS
         waiter-list
          waiter id=process8674c8 mode=X requestType=wait

The way I read this is:

spid 155 is waiting for a Shared table lock on OurTable (spid 124 holds a conflicting X lock)

spid 153 is waiting for a Shared table lock on OurTable (spid 124 holds a conflicting X lock)

spid 124 is waiting for an Exclusive table lock on OurTable (spid 155 holds a conflicting IS lock)

My question is how this can happen. Two sessions hold a lock on the whole table at the same time. I thought that a usual deadlock is when two or more sessions hold locks on different resources and wait for each other. But here the lock is on the same resource. It is not a lock on an index but on the table. This error is frequent in our application. Some lock has to be the first one to be requested, so why is the second lock granted if there already is a lock on the entire table?

Can anyone give a hint about what might be wrong, or has anyone experienced a similar deadlock?

John over 12 years

Ok. But there is one thing I don't understand. When doing the insert (it is just a simple insert statement), the query requests an exclusive lock immediately, right? Or at least an update lock which it will convert to an exclusive one? In my case the insert has requested an exclusive lock and has had it granted. It should never have been granted if there was a granted shared lock at the same time. Whether your lock design is bad or not, SQL Server should not grant two conflicting locks on the exact same resource. Why request the write lock earlier? I don't need it until the insert.

TomTom over 12 years

No, that totally depends on your isolation level. This is very finely tunable and someone on your end DID tune it (default is serializable transactions). Obviously whoever tuned it was not in sync with the developers and did it badly. It can also be the lock propagation that fails. An upsert is not an insert - it is an insert or update, which requires a select for the decision, and that may have left the read lock in place.

John over 12 years

I am not sure you are right here. The default isolation level is read committed; that is what I have read everywhere. Also, the statement is an insert statement. The procedure decides whether to update or insert much earlier in the logic flow, and it does not read from any table in order to decide that. It depends on a parameter of the procedure. Don't focus on the name of the procedure. Therefore I still don't understand. Also, why does it lock the whole table for inserting one row? This also seems odd to me.

John over 12 years

Is it possible to reproduce your example with isolation level read committed? The reason you can have this issue is that the select statements in your example hold locks until the end of the transaction because of the SERIALIZABLE isolation level. By the way, only one of my statements is a modification statement (insert) and none of the queries runs in a transaction.