How to write SQL to query a data warehouse fact table

10,587

You have a poor fact table design, if you cannot get something as basic as the sum or the orders. Your so-called "OrderFact" is really a "LineItemFact". If it were an "order fact", a given order would only be on one row.

If this were designed as a LineItemFact, then the tax would be on a separate row and you could just do:

select sum(LineItemTotal)
from OrderFact

Absent that, you could have a "LineItemNumber to enumerate the rows for each order. Then you could do:

select sum(OrderTotal)
from OrderFact
where LineItemNumber = 1;

With your structure, you can do:

select sum(OrderTotal)
from (select OrderId, max(OrderTotal) as OrderTotal
      from OrderFact
      group by OrderId
     ) o
Share:
10,587
davidjhp
Author by

davidjhp

Updated on June 04, 2022

Comments

  • davidjhp
    davidjhp about 2 years

    A database model for an online seller:

    enter image description here

    To create the data warehouse, I flatten the tables per http://bit.ly/1bOuOXQ

    enter image description here

    Note data in fact table becomes repetitive (The buyers buy two items in their order.)

    enter image description here

    To get a sum of all revenue, I would normally write sql like this:

    select sum(OrderTotal) from OrderFact

    The sql would work correctly on the original OLTP tables, but now I am querying an OLAP data warehouse, which has denormalized data, and now that type of sql will produce incorrect results.

    The query for total revenue should return $152.60.

    How do I fix the sql to correctly query the fact table?