Create a Cumulative Sum Column in MySQL

125,252

Solution 1

If performance is an issue, you could use a MySQL variable:

set @csum := 0;
update YourTable
set cumulative_sum = (@csum := @csum + count)
order by id;

Alternatively, you could remove the cumulative_sum column and calculate it on each query:

set @csum := 0;
select id, count, (@csum := @csum + count) as cumulative_sum
from YourTable
order by id;

This calculates the running sum in a running way :)

Solution 2

Using a correlated query:


  SELECT t.id,
         t.count,
         (SELECT SUM(x.count)
            FROM TABLE x
           WHERE x.id <= t.id) AS cumulative_sum
    FROM TABLE t
ORDER BY t.id

Using MySQL variables:


  SELECT t.id,
         t.count,
         @running_total := @running_total + t.count AS cumulative_sum
    FROM TABLE t
    JOIN (SELECT @running_total := 0) r
ORDER BY t.id

Note:

  • The JOIN (SELECT @running_total := 0) r is a cross join, and allows for variable declaration without requiring a separate SET command.
  • The table alias, r, is required by MySQL for any subquery/derived table/inline view

Caveats:

  • MySQL specific; not portable to other databases
  • The ORDER BY is important; it ensures the order matches the OP and can have larger implications for more complicated variable usage (IE: psuedo ROW_NUMBER/RANK functionality, which MySQL lacks)

Solution 3

MySQL 8.0/MariaDB supports windowed SUM(col) OVER():

SELECT *, SUM(cnt) OVER(ORDER BY id) AS cumulative_sum
FROM tab;

Output:

┌─────┬──────┬────────────────┐
│ id  │ cnt  │ cumulative_sum │
├─────┼──────┼────────────────┤
│  1  │ 100  │            100 │
│  2  │  50  │            150 │
│  3  │  10  │            160 │
└─────┴──────┴────────────────┘

db<>fiddle

Solution 4

select Id, Count, @total := @total + Count as cumulative_sum
from YourTable, (Select @total := 0) as total ;

Solution 5

UPDATE t
SET cumulative_sum = (
 SELECT SUM(x.count)
 FROM t x
 WHERE x.id <= t.id
)
Share:
125,252

Related videos on Youtube

Kirk Ouimet
Author by

Kirk Ouimet

The internet is a miracle.

Updated on May 21, 2020

Comments

  • Kirk Ouimet
    Kirk Ouimet about 4 years

    I have a table that looks like this:

    id   count
    1    100
    2    50
    3    10
    

    I want to add a new column called cumulative_sum, so the table would look like this:

    id   count  cumulative_sum
    1    100    100
    2    50     150
    3    10     160
    

    Is there a MySQL update statement that can do this easily? What's the best way to accomplish this?

  • Matthew Flaschen
    Matthew Flaschen about 14 years
    Although the OP did ask for an update, this is denormalized and will probably be inconvenient to maintain correctly.
  • OMG Ponies
    OMG Ponies about 14 years
    Use a cross join to define the variable without needing to use SET.
  • Wacek
    Wacek about 14 years
    I would add "ORDER BY t.id ASC" to the main query, just to make sure it'll always work
  • Kirk Ouimet
    Kirk Ouimet about 14 years
    My table has 36 million records, so this was really helpful to speed things up!
  • Dercsár
    Dercsár about 14 years
    My first thought also was to add ORDER BY. But it does not matter. Until addition turns into non-associative, at least :)
  • Daniel Vassallo
    Daniel Vassallo about 14 years
    @OMG Poines: I think you need to use a SELECT in the JOIN (SELECT @running_total := 0) part of the variables example.
  • matt
    matt almost 12 years
    Note that ordering by cumulative_sum might force full table scan.
  • Rohit Gupta
    Rohit Gupta over 8 years
    Please explain your answer
  • allan.simon
    allan.simon almost 8 years
    for "using a correlated query" where does your table x come from ?
  • Marc L.
    Marc L. over 7 years
    Unless there is optimization happening internally, the correlated subquery is the equivalent of a triangular join performing in O(N^2) time--which will not scale.
  • Meglio
    Meglio over 7 years
    @MarcL. may you please explain what you mean?
  • Marc L.
    Marc L. over 7 years
    For every row within the table, a full query of the same table is being performed. The number of rows in the subquery increases linearly, up to the full count of the table (maybe -1). The total scope of this forms a big "triangle". Because of the repetition involved (it doesn't add to the prior result iteratively, but instead recreates for every row) this scales along with the area of the triangle. Scale doesn't respect linear multipliers, so the (N^2)/2 is just O(N^2).
  • raisercostin
    raisercostin over 7 years
    The answer works and is one liner. It also initializes/resets the variable to zero at the begining of select.
  • Ivan Peevski
    Ivan Peevski about 7 years
    Doesn't seem to be working, @running_total seems to be 0 for every row. Could that be a version thing? (running MariaDB)
  • zaitsman
    zaitsman almost 7 years
    This does work and seems quite fast; any suggestions how this can be extended to do a cumulative sum in a group? e.g. group by Name or similar, and then do a cumulative sum only for records with the same name
  • Pascal
    Pascal almost 6 years
    @zaitsman You can use it as a subquery; on the outer query, group by anything you want, and then use the MAX() MySQL function to get the correct cumulative summary (the last summary) that was calculated for the records inside the group.
  • Brick
    Brick almost 6 years
    This does not seem to work with an order clause that differs from the order in which the records are encountered by MySQL. It seems that the cumulative sum is computed in the original order and then scrambled when the ordering is done, making it no longer "cumulative" in the desired sense.
  • Yuki Inoue
    Yuki Inoue almost 6 years
    Prefer answer of OLAP function in MySQL 8.0+, as stated in stackoverflow.com/a/52278657/3090068
  • Tiw
    Tiw over 5 years
    While this might solve the problem, it's better to explain it a bit so it will benefit others :)
  • Raymond Nijland
    Raymond Nijland over 5 years
    this isn't a co-related subquery or a subquery for that matter... co-related subquery follows SELECT ...., (SELECT .... FROM table2 WHERE table2.id = table1.id ) FROM table1 what you have is a window query..
  • DatabaseCoder
    DatabaseCoder almost 5 years
    I am looking for Cumulative sum using windows function.Thank you.
  • Flavio_cava
    Flavio_cava about 4 years
    Added some step by step to understand the final query
  • Evhz
    Evhz over 3 years
    if you want your comulative count from a table, then table x is the same as table t (you count the total number of elements previous to the current, so you iterate the source table per each row in the source table)