SQL Server doesn't use index in stored procedure

12,150

Solution 1

[EDIT]

The PERSISTED-not-being-used issue below occurs only with actieGroep/actieId on my system (SQL 2008). But it's possible that the same problem could be happening on your SQL 2005 system with the dagnummer/datum columns as well. If indeed that's happening, it would explain the behavior you're seeing, since a clustered index scan would be required to filter for values of datum. To diagnose whether this is indeed happening, simply add the datum column as an INCLUDE-d column to your index, like this:

CREATE NONCLUSTERED INDEX [IX_STAT_Statistieken_1] ON [dbo].[STAT_Statistieken]  
(  
    [foreignId] DESC,  
    [dagnummer] DESC,  
    [actieId] ASC,   
    [aantal] ASC    
) INCLUDE (datum)  ON [PRIMARY]

If the problem goes away with this index revision, then you know that dagnummer is the issue-- you can probably even remove dagnummer from the index since SQL isn't using it anyways.

Also, revising your index to add actieId is a good idea since it evades the issue noted below. But in the process you also need to leave the aantal column in the index, so that your index will be a covering index for this query. Otherwise SQL will have to read your clustered index to get the value of that column. This will slow down your query since lookups into the clustered index are quite slow.

[END EDIT]

Here's a bunch of ideas which may help you fix this, with most likely/easiest things first:

  • When I tried to repro your using schema and queries (with fake generated data), I see that your PERSISTED computed column actieGroep is re-copmputed at runtime instead of the persisted value being used. This looks like a bug in the SQL Server optimizer. Since the underlying column value actieGroep is not present in your covering index IX_STAT_Statistieken_1 index (only the computed column is there), if SQL Server decides that it needs to fetch that additional column, SQL may consider a clustered index to be cheaper than using your non-clustered index and then looking up actieId for each matching row in the cluster index. This is because clustered index lookups are very expensive relative to sequential I/O, so any plan which requires more than a few percent of rows to be looked up is probably cheaper to do with a scan. In any case, if this is indeed the problem you're seeing, then adding actieGroep as an INCLUDE-d column of your IX_STAT_Statistieken_1 index should work around the issue. Like this:

    CREATE NONCLUSTERED INDEX [IX_STAT_Statistieken_1] ON [dbo].[STAT_Statistieken]
    (
    [foreignId] DESC,
    [secondaryId] ASC,
    [actieGroep] ASC,
    [dagnummer] DESC,
    [aantal] ASC
    ) INCLUDE (actieId) ON [PRIMARY]

  • the data type of the computed column actieGroep is a string but you're comparing it to integers (e.g. IN (1,2,3)) in your WHERE clause and CASE statements. If SQL decides to convert the column instead of the constant, it will hurt query perf and may make the computed-column-expansion problem (described above) more likely. I'd strongly suggest changing your computed column definition to an integral type, e.g.

    CASE WHEN actieId BETWEEN 0 AND 9 THEN actieId
    WHEN actieId BETWEEN 10 AND 99 THEN actieId/10
    WHEN actieId BETWEEN 100 AND 999 THEN actieId/100
    WHEN actieId BETWEEN 1000 AND 9999 THEN actieId/1000
    WHEN actieId BETWEEN 10000 AND 99999 THEN actieId/10000
    WHEN actieId BETWEEN 100000 AND 999999 THEN actieId/100000
    WHEN actieId BETWEEN 1000000 AND 9999999 THEN actieId/1000000
    ELSE actieId/10000000 END

  • you're doing a GROUP BY a column which only has one possible value. Therefore, the GROUP BY is unnecessary. Hopefully the optimizer would be smart enough to know this, but you can never be sure.

  • Try using an OPTIMIZE FOR hint instead of directly forcing indexes, that may work around the error you get with your hint

  • Craig Freedman's post http://blogs.msdn.com/craigfr/archive/2009/04/28/implied-predicates-and-query-hints.aspx which describes common causes of the hint-related error message that you're getting when RECOMPILE is used. You may want to review that post and make sure you're running the latest updates to SQL Server.

  • I'm sure you've already done this, but you may want to build a "clean room" version of your data, by doing what we're doing: creating a new DB, use the DDL in your question to create the tables, and then populating the tables with data. If the results you get are different, look closley at the schema in your real table and indexes, and see if they're different.

If none of this works, comment and I can suggest some more wild ideas. :-)

Also, please add the exact version and update level of SQL Server to your question!

Solution 2

What data type is foreignId in the table? If it's int then you're likely getting an implicit conversion which prevents index seeks. If the data type in the table is int then redefine the parameter to be int as well and you should get an index seek (not an index scan) for this query.

Solution 3

it could be parameter sniffing, so try something like this:

ALTER PROCEDURE MyProcedure (@fid BIGINT)
AS BEGIN
    DECLARE @fid_sniff BIGINT
    SET @fid_sniff=@fid
    SELECT foreignId
    FROM STAT_Statistieken
    WHERE foreignId = @fid_sniff
END

read more anout parameter sniffing: http://omnibuzz-sql.blogspot.com/2006/11/parameter-sniffing-stored-procedures.html

Solution 4

First, I should say that the indexes you have created are not optimal, since they can only be used to filter on foreignId.

SQL Server is not able of doing SKIP SCAN and you have a secondaryId in your index which is not being filtered with a range condition.

Therefore your condition on foreignId, actieGroep, dagNummer does not yield a limited number of ranges and is not completely sargable. It can filter only on foreignID, not on the whole set.

Now, to your current index.

I just created your tables and filled them with the random data using this script:

DROP TABLE STAT_Statistieken

CREATE TABLE [dbo].[STAT_Statistieken](
    [statistiekId] [bigint] IDENTITY(1,1) NOT NULL,
    [foreignId] [bigint] NOT NULL,
    [datum] [datetime] NOT NULL, --date
    [websiteId] [int] NOT NULL,
    [actieId] [int] NOT NULL, --actionId
    [objectSoortId] [int] NOT NULL, --kindOfObjectId
    [aantal] [bigint] NOT NULL, --count
    [secondaryId] [int] NOT NULL DEFAULT ((0)),
    [dagnummer]  AS (datediff(day,CONVERT([datetime],'2009-01-01 00:00:00.000',(121)),[datum])) PERSISTED, --daynumber
    [actieGroep]  AS (substring(CONVERT([varchar](4),[actieId],0),(1),(1))) PERSISTED,
    CONSTRAINT [STAT_Statistieken_PK] PRIMARY KEY CLUSTERED --actionGroup
    (
        [statistiekId] ASC
    )WITH (PAD_INDEX  = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
    ) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_STAT_Statistieken_1] ON [dbo].[STAT_Statistieken] 
(
    [foreignId] DESC,
    [secondaryId] ASC,
    [actieGroep] ASC,
    [dagnummer] DESC, 
    [aantal] ASC --count
)WITH (PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF,  ONLINE = OFF) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_STAT_Statistieken_2] ON [dbo].[STAT_Statistieken] 
(
    [foreignId] DESC,
    [secondaryId] ASC,
    [actieId] ASC,
    [dagnummer] DESC,
    [aantal] ASC -- count
)WITH (PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

;WITH    nums AS
        (
        SELECT  1 AS num
        UNION ALL
        SELECT  num + 1
        FROM    nums
        )
INSERT
INTO    STAT_Statistieken (
        [foreignId], [datum], [websiteId], [actieId],
        [objectSoortId], [aantal])
SELECT  TOP 100000
        500, GETDATE(), num, num, num, num % 5
FROM    nums
UNION ALL
SELECT  TOP 100000
        num % 1000, GETDATE(), num, num, num, num % 5
FROM    nums
OPTION (MAXRECURSION 0)

UPDATE STATISTICS STAT_Statistieken

, and it uses INDEX SEEK no matter what, which most probably means that the problem is with your data distribution.

I'd recommend you to create an additional index with secondaryID removed, like this:

CREATE NONCLUSTERED INDEX [IX_STAT_Statistieken_3] ON [dbo].[STAT_Statistieken] 
(
    [foreignId] DESC,
    [actieGroep] ASC,
    [dagnummer] DESC, 
    [aantal] ASC --count
)

IF you still want to use your current index, could you please run these commands:

DBCC SHOW_STATISTICS ('STAT_Statistieken', 'IX_STAT_Statistieken_1')
DBCC SHOW_STATISTICS ('STAT_Statistieken', 'IX_STAT_Statistieken_2')

Each command will output three resultsets.

Could you please post resultsets 1 and 2 from each command, and three rows from resultset 3 with value of RANGE_HI just above, just below and equal to 873926?

Solution 5

I've seen similar behavior before, where it would actually take the index hint and do something worse with it (unfiltered index scan with bookmark lookup).

One of these four should help:

1) Append ;-T4102;-T4118 to SQL Server 2005 startup parameters (might apply to SQL 2008). Note: this brings back the SQL 2000 bad handling of IN and NOT IN queries in SQL 2005.

2) UPDATE STATISTICS [dbo].[STAT_Statistieken] WITH FULLSCAN

3) OPTION (MAXDOP 1) -- sometimes the parallelism causes really stupid queries to be generated

4) Ensure the index is online.

Also note that if you are creating an index on a table created in a stored procedure, that index does not exist when compiling the stored procedure queries so it will not be used. Since your table is created globally in dbo I assume that is NOT the case here.

EDIT: sometimes I wish there was a true forceplan where you could key in the plan directly and any possible plan will be executed: sort of an assembly-like language for the DB.

Share:
12,150
Jan Jongboom
Author by

Jan Jongboom

Updated on June 03, 2022

Comments

  • Jan Jongboom
    Jan Jongboom about 2 years

    I haven't solved this issue by using the stored procedure, yet we've decided to surpass the SP and just execute the plain ol' SQL

    Please see the extended table scheme below
    Edit 2: Updated the index (to not use actieGroep anymore)
    NB. SQL Server 2005 Enterprise 9.00.4035.00
    NB2. Seems related to http://www.sqlservercentral.com/Forums/Topic781451-338-1.aspx

    I've got two indices on a table:

    • A clustered PK index on statistiekId
    • An non-clustered index on foreignId

    And I have the following piece of code:

    DECLARE @fid BIGINT
    SET @fid = 873926
    
    SELECT foreignId
    FROM STAT_Statistieken
    WHERE foreignId = @fid
    

    This executes just the way it should; it points to the correct index, and all it does is scanning the index.

    Now I am creating a stored procedure:

    ALTER PROCEDURE MyProcedure (@fid BIGINT)
    AS BEGIN
        SELECT foreignId
        FROM STAT_Statistieken
        WHERE foreignId = @fid
    END
    

    Running the thing:

    EXEC MyProcedure @fid = 873926
    

    Now it's running a clustered index scan on my PK index! Wtf is going on?

    So I changed the SP to

    SELECT foreignId
    FROM STAT_Statistieken
        WITH (INDEX(IX_STAT_Statistieken_2))
    WHERE foreignId = @fid
    

    And now it gives: Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN. While the same function is running just like it should when executing this directly.


    Extra info: full scheme which can reproduce this behaviour (English names in commentary)

    Table

    CREATE TABLE [dbo].[STAT_Statistieken](
        [statistiekId] [bigint] IDENTITY(1,1) NOT NULL,
        [foreignId] [bigint] NOT NULL,
        [datum] [datetime] NOT NULL, --date
        [websiteId] [int] NOT NULL,
        [actieId] [int] NOT NULL, --actionId
        [objectSoortId] [int] NOT NULL, --kindOfObjectId
        [aantal] [bigint] NOT NULL, --count
        [secondaryId] [int] NOT NULL DEFAULT ((0)),
        [dagnummer]  AS (datediff(day,CONVERT([datetime],'2009-01-01 00:00:00.000',(121)),[datum])) PERSISTED, --daynumber
        [actieGroep]  AS (substring(CONVERT([varchar](4),[actieId],0),(1),(1))) PERSISTED,
        CONSTRAINT [STAT_Statistieken_PK] PRIMARY KEY CLUSTERED --actionGroup
        (
            [statistiekId] ASC
        )WITH (PAD_INDEX  = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
        ) ON [PRIMARY]
    

    Index

    CREATE NONCLUSTERED INDEX [IX_STAT_Statistieken_foreignId_dagnummer_actieId_secondaryId] ON [dbo].[STAT_Statistieken] 
    (
        [foreignId] ASC,
        [dagnummer] ASC,
        [actieId] ASC,
        [secondaryId] ASC
    )WITH (PAD_INDEX  = ON, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, FILLFACTOR = 80, ONLINE = OFF) ON [PRIMARY]
    

    Execution

    SET NOCOUNT ON;
    
        DECLARE @maand INT, @jaar INT, @foreignId BIGINT
        SET @maand = 9
        SET @jaar = 2009
        SET @foreignId = 828319
    
    
    DECLARE @startDate datetime, @endDate datetime
    SET @startDate = DATEADD(month, -1, CONVERT(datetime,CAST(@maand AS varchar(3))+'-01-'+CAST(@jaar AS varchar(5))))
    SET @endDate = DATEADD(month, 1, CONVERT(datetime,CAST(@maand AS varchar(3))+'-01-'+CAST(@jaar AS varchar(5))))
    
    DECLARE @firstDayDezeMaand datetime
    SET @firstDayDezeMaand = CONVERT(datetime, CAST(@jaar AS VARCHAR(4)) + '/' + CAST(@maand AS VARCHAR(2)) + '/1')
    
    DECLARE @daynumberFirst int
    set @daynumberFirst = DATEDIFF(day, '2009/01/01', @firstDayDezeMaand)
    
    DECLARE @startDiff int
    SET @startDiff = DATEDIFF(day, '2009/01/01', @startDate)
    
    DECLARE @endDiff int
    SET @endDiff = DATEDIFF(day, '2009/01/01', @endDate)
    
    SELECT @foreignId AS foreignId,
        SUM(CASE WHEN dagnummer >= @daynumberFirst THEN (CASE WHEN actieId BETWEEN 100 AND 199 THEN aantal ELSE 0 END) ELSE 0 END) as aantalGevonden, 
        SUM(CASE WHEN dagnummer >= @daynumberFirst THEN (CASE WHEN actieId BETWEEN 200 AND 299 THEN aantal ELSE 0 END) ELSE 0 END) as aantalBekeken, 
        SUM(CASE WHEN dagnummer >= @daynumberFirst THEN (CASE WHEN actieId BETWEEN 300 AND 399 THEN aantal ELSE 0 END) ELSE 0 END) as aantalContact,
        SUM(CASE WHEN dagnummer < @daynumberFirst THEN (CASE WHEN actieId BETWEEN 100 AND 199 THEN aantal ELSE 0 END) ELSE 0 END) as aantalGevondenVorige, 
        SUM(CASE WHEN dagnummer < @daynumberFirst THEN (CASE WHEN actieId BETWEEN 200 AND 299 THEN aantal ELSE 0 END) ELSE 0 END) as aantalBekekenVorige, 
        SUM(CASE WHEN dagnummer < @daynumberFirst THEN (CASE WHEN actieId BETWEEN 300 AND 399 THEN aantal ELSE 0 END) ELSE 0 END) as aantalContactVorige
    FROM STAT_Statistieken
    WHERE
        dagnummer >= @startDiff
        AND dagnummer < @endDiff
        AND foreignId = @foreignId 
    OPTION(OPTIMIZE FOR (@foreignId = 837334, @startDiff = 200, @endDiff = 300))
    

    DBCC Statistics

    Name                                                          | Updated               | Rows      | Rows smpl | Steps | Density | Avg. key | String index
    IX_STAT_Statistieken_foreignId_dagnummer_actieId_secondaryId    Oct  6 2009  3:46PM 1245058    1245058    92    0,2492834    28    NO
    
    All Density  | Avg. Length | Columns
    3,227035E-06    8    foreignId
    2,905271E-06    12    foreignId, dagnummer
    2,623274E-06    16    foreignId, dagnummer, actieId
    2,623205E-06    20    foreignId, dagnummer, actieId, secondaryId
    8,031755E-07    28    foreignId, dagnummer, actieId, secondaryId, statistiekId
    
    RANGE HI | RANGE_ROWS | EQ_ROWS | DISTINCT_RANGE_ROWS | AVG_RANGE ROWS
    -1         0            2         0                     1
    1356       3563         38        1297                  2,747109
    8455       14300        29        6761                  2,115072
    

    And the index is used as shown in the execution plan. When I wrap this up in a procedure with this params:

    @foreignId bigint,
    @maand int, --month
    @jaar int --year
    

    And run it with _SP_TEMP @foreignId = 873924, @maand = 9, @jaar = 2009

    It does a clustered index scan!

  • Jan Jongboom
    Jan Jongboom almost 15 years
    Yes I can actually run the query, with the hint, outside of the stored procedure. And even with 'WITH RECOMPILE' I cannot get the hint to work. Same error.
  • John Sansom
    John Sansom almost 15 years
    RECOMPILE will not address the issue with your query hint. Recompiling a stored procedure, on each execution, will ensure that a new execution plan is generated that is tuned to the specific parameters provided for the given execution.
  • João Paladini
    João Paladini almost 15 years
    Great idea Gail, I hadn't thought of that. :-)
  • GilaMonster
    GilaMonster almost 15 years
    It's the only thing I can think of that would make SQL scan an index when the index is covering and with an equality predicate directly on the index key. Shouldn't be parameter sniffing, the exec plan for a query that simple is the same regardless of the number of rows as there's no need for bookmark lookups.
  • João Paladini
    João Paladini almost 15 years
    Yeah, it didn't seem like parameter-sniffing to me either.
  • Jan Jongboom
    Jan Jongboom over 14 years
    foreignId is a bigint, I saw this in an earlier version of the procedure, and changed it back then.
  • Jan Jongboom
    Jan Jongboom over 14 years
    Didn't help, tho I think RECOMPILE would do the same.
  • GilaMonster
    GilaMonster over 14 years
    You know, the full schema is a hell of a lot more complex than you initially described it...
  • vpalmu
    vpalmu over 14 years
    A lookup on the non-clustered would still beat a scan of the clustered.
  • Gabriel Guimarães
    Gabriel Guimarães over 14 years
    hmmm nope, non-clustered index have the clustered index key on their leaf level, that means if you do a select with a index that doesn't cover the select (called cover index) the select will go for the non clustered index and after that will look for other fields in the clustered index. (you index doesn't cover your select since the actieGroep field is not included in the index)
  • Jan Jongboom
    Jan Jongboom over 14 years
    Updated the statistics, added the option to the stored procedure, but still a full table scan; index is online. Outside SP the query is performed well (and way faster).
  • Jan Jongboom
    Jan Jongboom over 14 years
    Ok. I removed the actieGroep statements from the queries, using only actieId, which is a true int field. I removed the GROUP BY, which was indeed useless, added OPTIMIZE FOR hints for the foreignId, and daynumbers. No help. SQL Server version is SQL Server 2005 Enterprise 9.00.4035.00
  • Jan Jongboom
    Jan Jongboom over 14 years
    I don't get any result from the above query (on master I assume?); just no rows whatsoever.
  • Jan Jongboom
    Jan Jongboom over 14 years
    Well both the non-SP and the SP use the same values, and I added an OPTIMIZE FOR hint in the stored procedure with an additional hint to use that foreignId.
  • Jan Jongboom
    Jan Jongboom over 14 years
    And the number of rows returned is about 10 out of 1,500,000
  • Jan Jongboom
    Jan Jongboom over 14 years
    I changed my index and Stored Procedure to not use the actieGroep anymore, but the problem still stays.
  • vpalmu
    vpalmu over 14 years
    It appears that your WITH INDEX doesn't match the name of the index. I'm starting to think that wasn't a copy-paste error into stackoverflow.
  • Gabriel Guimarães
    Gabriel Guimarães over 14 years
    The table doesn't exist on master, use the query on your database
  • Jan Jongboom
    Jan Jongboom over 14 years
    Nah, changed the index and the name to solve another problem we had. Yet this doesn't change a thing. I copied the whole contents of the stored procedure to a new window and I can execute with using the index and having exactly the same information.
  • Justin Grant
    Justin Grant over 14 years
    Thanks for the additional info. I edited my answer to propose another possible solution which matches exactly the behavior you're seeing. See top of the answer above. Good luck solving this... it's an interesting issue!
  • Jan Jongboom
    Jan Jongboom over 14 years
    Ok. I couldn't solve this issue, yet granting you the bounty for coming up with the most viable sources :-) I ended up just surpassing the stored procedure, and execute the sql myself.
  • Justin Grant
    Justin Grant over 14 years
    Thanks! This one was truly a stumper. I'm still curious about what was actually going on. Did you have a chance to try INCLUDE-ing the datum column in your index? OK if not, just wondering. Anyway, glad you were able to find a workaround.
  • vpalmu
    vpalmu over 14 years
    Parameter sniffing can be effectively turned off by WITH RECOMPILE at the end of the calling SQL.
  • usr
    usr over 12 years
    The Microsoft Connect item about the skip scan: connect.microsoft.com/SQLServer/feedback/details/695044/…