Count Number of Consecutive Occurrence of values in Table

sql sql-server tsql sql-server-2012 aggregation

40,455

Solution 1

One approach is the difference of row numbers:

select name, count(*) 
from (select t.*,
             (row_number() over (order by id) -
              row_number() over (partition by name order by id)
             ) as grp
      from t
     ) t
group by grp, name;

The logic is easiest to understand if you run the subquery and look at the values of each row number separately and then look at the difference.

Solution 2

You could use windowed functions like LAG and running total:

WITH cte AS (
 SELECT Id, Name, grp = SUM(CASE WHEN Name = prev THEN 0 ELSE 1 END) OVER(ORDER BY id)
 FROM (SELECT *, prev = LAG(Name) OVER(ORDER BY id) FROM t) s
)
SELECT name, cnt = COUNT(*)
FROM cte
GROUP BY grp,name
ORDER BY grp;

db<>fiddle demo

The first cte returns group number:

+-----+-------+-----+
| Id  | Name  | grp |
+-----+-------+-----+
|  1  | A     |   1 |
|  2  | A     |   1 |
|  3  | B     |   2 |
|  4  | B     |   2 |
|  5  | B     |   2 |
|  6  | B     |   2 |
|  7  | C     |   3 |
|  8  | B     |   4 |
|  9  | B     |   4 |
+-----+-------+-----+

And main query groups it based on grp column calculated earlier:

+-------+-----+
| name  | cnt |
+-------+-----+
| A     |   2 |
| B     |   4 |
| C     |   1 |
| B     |   2 |
+-------+-----+

Solution 3

I have use Recursive CTE and minimise the use of row_number,also avoid count(*).

I think it will perform better,but in real world it depend what else filter you put to minimise number of rows affected.

If ID is having discreet values then One extra CTE will be use to generate continuous id.

 ;With CTE2 as
(
select ROW_NUMBER()over(order by id) id, name,1 Repetition ,1 Marker  from @t
)
, CTE as
(
select top 1 cast(id as int) id, name,1 Repetition ,1 Marker  from CTE2 order by id

union all

select a.id, a.name
, case when a.name=c.name then Repetition +1 else 1 end  
, case when a.name=c.name then c.Marker else  Marker+1 end
from @t a
inner join CTE c on a.id=c.id+1

)
,CTE1 as
(select *,ROW_NUMBER()over(partition by marker order by id desc)rn from cte c
)
select Name,Repetition from cte1 where rn=1

40,455

Author by

FLICKER

Software Engineer Since 2000 OOP/OOD, C#, Delphi, MVC, CSS, HTML, JavaScript SQL Server, Microsoft BI (SSIS, SSAS, SSRS), Web Development, SharePoint, Cognos

Updated on July 09, 2022

Comments

FLICKER almost 2 years

I have below table

create table #t (Id int, Name char)

insert into #t values
(1, 'A'),
(2, 'A'),
(3, 'B'),
(4, 'B'),
(5, 'B'),
(6, 'B'),
(7, 'C'),
(8, 'B'),
(9, 'B')

I want to count consecutive values in name column

+------+------------+
| Name | Repetition |
+------+------------+
| A    |          2 |
| B    |          4 |
| C    |          1 |
| B    |          2 |
+------+------------+

The best thing I tried is:

select Name
, COUNT(*) over (partition by Name order by Id) AS Repetition
from #t
order by Id

but it doesn't give me expected result

King King about 8 years

With order by max(id) added, the result would look exactly like in the OP's post (involving the order of the records).
FLICKER about 8 years

@KingKing, thanks for your hint. Yes, it makes it exactly like what I need.
FLICKER about 8 years

Why my query doesn't give me the result, I expect when I partition by Name, the count(*) should reset when the values of Name changes. I appreciate if you explain that. Thanks again.
Gordon Linoff about 8 years

@FLICKER . . . I think if you run your query, you should understand. It enumerates the names regardless of gaps between rows. That is, it ignores gaps.
AhmadReza almost 4 years

This solution doesn't work for this example: A, A, B, B, B, A, B
Gordon Linoff almost 4 years

@AhmadRezaSaboor . . . It certainly should. If you have a question, though, you should ask it as a new question.
AhmadReza almost 4 years

@GordonLinoff I don't think a new question is required. Just calculate the grp values for my example and you'll understand where the issue is. The grp values are: 0, 0, 2, 2, 2, 3, 3.
Alex M about 3 years

@AhmadRezaSaboor Would wrapping the query in another one and then partitioning by id and grp not fix this issue? Something like select *,count(1) over (partition by id,grp) as correct_consec from ...
Raj G over 2 years

@GordonLinoff Thanks
vijayraj34 over 2 years

This is the best solution. How can we do this using self join? Someone asked me this question in an interview.
cautionbug over 2 years

@vijayraj34 i don't think self-join would give you this particular result (consecutive occurrences of any value). Self-join CAN give you the extremity (max/min, first/last, etc.) record of each record group: stackoverflow.com/a/8749095/258598
dz902 about 2 years

@AhmadReza I was confused at some point as well. Your group values calculations were correct, but note that the final, outermost query would group by both group value and name. So it would work.