SQL: grouping 2 tables as 1 with join, union, and then?

21,587

Solution 1

You should probably create views over the two tables:

CREATE VIEW All_Orders
AS
     SELECT
          id,
          id_customer,
          code,
          date,
          'H' AS order_type
     FROM
          h_orders
     UNION ALL
     SELECT
          id,
          id_customer,
          code,
          date,
          'P' AS order_type
     FROM
          p_orders

CREATE VIEW All_Order_Items  -- A table name of "items" is pretty bad in my opinion
AS
     SELECT
          id,
          id_order,
          description,
          price,
          'H' AS order_item_type
     FROM
          h_items
     UNION ALL
     SELECT
          id,
          id_order,
          description,
          price,
          'P' AS order_item_type
     FROM
          p_items

Now you can just join to those views. I included the types (P & H) so that you know what the "id" column now refers to. If the ids in your two tables ("h" and "p" can have duplicates then you will have to join the Orders table right in the All_Order_Items view. Otherwise you will have a lot of trouble joining between the two views. Hopefully your id columns are intelligently designed and not just auto-incrmenting or identity columns.

Solution 2

You could try this:

SELECT tbl.ID, 
       tbl.Name, 
       sum(tbl.num_orders) num_orders, 
       sum(tbl.total_money) total_money
FROM (    
      SELECT customer.id, 
             customer.name,        
             count(DISTINCT p_orders.id) AS num_orders,        
             sum(p_items.price) AS total_money    
      FROM customer        
            INNER JOIN p_orders 
                ON p_orders.id_customer = customer.id        
            INNER JOIN p_items 
                ON p_items.id_order = p_orders.id    
      GROUP BY customer.id, customer.name, p_orders.id_customer

      UNION

      SELECT customer.id, 
             customer.name,        
             count(DISTINCT h_orders.id) AS num_orders,
             sum(h_items.price) AS total_money    
      FROM  customer        
             INNER JOIN h_orders 
                 ON h_orders.id_customer = customer.id
             INNER JOIN h_items 
                 ON h_items.id_order = h_orders.id    
      GROUP BY customer.id, customer.name, h_orders.id_customer
    ) tbl
 GROUB BY tbl.id, tbl.name
 ORDER BY tbl.id ASC

Solution 3

Create a view with the union of the two queries but without the aggregate functions. USe Union All as the same record is not in both tables and you don't need the server to waste time looking to see that.You will probaly have other times you want to access both tables in a query.

Then write your query using the view.

view code would be something like (you may want other fields for other purposes as well:

Create view customerOrders
AS
SELECT      customer.id as CustomerID,  customer.name, p_orders.id as OrderID,  p_items.price  as price
FROM        customer        
INNER JOIN  p_orders ON p_orders.id_customer = customer.id        
INNER JOIN  p_items ON p_items.id_order = p_orders.id
union all
SELECT      customer.id,  customer.name,  h_orders.id as id, H_items.price           
FROM        customer        
INNER JOIN  h_orders ON h_orders.id_customer = customer.id        
INNER JOIN  h_items ON h_items.id_order = h_orders.id

then the call for your query would be something like (none of this is tested may need adjustment)

SELECT    CustomerID,    customer.name,    count(DISTINCT OrderID) AS num_orders,    
sum(price) AS total_money
FROM    customerOrders
GROUP BY     CustomerID,    customer.name
ORDER BY    CustomerID
Share:
21,587

Related videos on Youtube

Strae
Author by

Strae

I can accept failure, everyone fails at something - But I can't accept not trying. -- You HAVE to assume your visitor is a maniac serial killer, out to destroy your application. And you have to prevent it. Hire me

Updated on August 11, 2020

Comments

  • Strae
    Strae over 3 years

    I have 5 tables:

    customers id - name

    p_orders id - id_customer - code - date

    p_items id - id_order - description - price

    and h_orders and h_items, that are exactly the copy of p_orders and p_items.

    When the p_ tables reach a big amount of rows, i move the oldest to the h_ tables.. they due as history.

    So, my problem is: how to retrieve the data from both the p_ tables and h_ considering them as one unique table?

    For example, i want to retrieve the number of orders for each customer, and the total price (of all the customer's orders), and i use that query:

    SELECT
        customer.id,
        customer.name,
        count(DISTINCT p_orders.id) AS num_orders,
        sum(p_items.price) AS total_money
    FROM
        customer
        INNER JOIN p_orders ON p_orders.id_customer = customer.id
        INNER JOIN p_items ON p_items.id_order = p_orders.id
    GROUP BY
        customer.id,
        customer.name,
        p_orders.id_customer
    ORDER BY
        customer.id
    

    it works just for one 'set' of tables (p_ or h_)..but i want them both.

    I've tryed to use an UNION:

    (
        SELECT
            customer.id,
            customer.name,
            count(DISTINCT p_orders.id) AS num_orders,
            sum(p_items.price) AS total_money
        FROM
            customer
            INNER JOIN p_orders ON p_orders.id_customer = customer.id
            INNER JOIN p_items ON p_items.id_order = p_orders.id
        GROUP BY
            customer.id,
            customer.name,
            p_orders.id_customer
    )
    UNION
    (
        SELECT
            customer.id,
            customer.name,
            count(DISTINCT h_orders.id) AS num_orders,
            sum(h_items.price) AS total_money
        FROM
            customer
            INNER JOIN h_orders ON h_orders.id_customer = customer.id
            INNER JOIN h_items ON h_items.id_order = h_orders.id
        GROUP BY
            customer.id,
            customer.name,
            h_orders.id_customer
    )
    ORDER BY id ASC
    

    This one works, but if a customer have orders both in the p_ tables and in the h_ tables, i'll have 2 rows for that customer with 2 different num_orders and total_money (respectively coming from p_ tables and h_ tables)

    I've tryed to add a GROUP BY id outside the union:

    (
        --SELECT 2
    )
    UNION
    (
        --SELECT 1
    )
    GROUP BY id
    ORDER BY id ASC
    

    but the query fail with ERROR: syntax error at or near "GROUP" at character 948, seem like GROUP BY cannot be used in that way.

    Any suggestion?

    EDIT:

    For uriDium, yes, all the tables have the id column as primary key, and the referred fields (aka p_orders.id_customer) are foreign keys too. Here the test db structure dump (i added some indexes and foreign keys after the table creation, but i dont think that this mean something):

    CREATE TABLE customer (
        id serial NOT NULL,
        name character(50)
    );
    CREATE TABLE p_orders (
        id serial NOT NULL,
        id_customer integer NOT NULL,
        date date DEFAULT now(),
        code character(5)
    );
    CREATE TABLE p_items (
        id serial NOT NULL,
        id_order integer NOT NULL,
        descr character(250),
        price money
    );
    CREATE TABLE h_orders (
        id integer NOT NULL,
        id_customer integer NOT NULL,
        date date,
        code character(5)
    );
    CREATE TABLE h_items (
        id integer NOT NULL,
        id_order integer NOT NULL,
        descr character(250),
        price money
    );
    CREATE UNIQUE INDEX id_h_orders ON h_orders USING btree (id);
    CREATE INDEX id_h_o_c ON h_orders USING btree (id_customer);
    CREATE UNIQUE INDEX id_items_h ON h_items USING btree (id);
    CREATE INDEX id_ordinr_dsve ON h_items USING btree (id_order);
    
    ALTER TABLE ONLY customer
        ADD CONSTRAINT customer_pkey  (id);
    ALTER TABLE ONLY p_orders
        ADD CONSTRAINT p_orders_pkey PRIMARY KEY (id);
    ALTER TABLE ONLY p_items
        ADD CONSTRAINT p_items_pkey PRIMARY KEY (id);
    ALTER TABLE ONLY stats
        ADD CONSTRAINT stats_pkey PRIMARY KEY (id);
    ALTER TABLE ONLY p_orders
        ADD CONSTRAINT "$1" FOREIGN KEY (id_customer) REFERENCES customer(id) ON DELETE CASCADE;
    ALTER TABLE ONLY p_items
        ADD CONSTRAINT "$1" FOREIGN KEY (id_order) REFERENCES p_orders(id) ON DELETE CASCADE;
    ALTER TABLE ONLY h_orders
        ADD CONSTRAINT "$1" FOREIGN KEY (id_customer) REFERENCES customer(id) ON DELETE CASCADE;
    ALTER TABLE ONLY h_items
        ADD CONSTRAINT "$1" FOREIGN KEY (id_order) REFERENCES h_orders(id) ON DELETE CASCADE;
    
  • Eric Petroelje
    Eric Petroelje almost 15 years
    I believe that with a union, all fields are considered during duplicate elimination - not just the keys.
  • Strae
    Strae almost 15 years
    Yes - Edited with the test database tructure
  • Strae
    Strae almost 15 years
    No, when i move the data from p_* to h_* i keep the same values of the columns, even the id - and in both tables is defined as unique primary key, the only difference is that in the h_* table there is no the sequence for auto increment the id value on every insert
  • dburges
    dburges almost 15 years
    "Hopefully your id columns are intelligently designed and not just auto-incrmenting or identity columns." Identity ids are an intelligent way to design in most business applications.
  • Strae
    Strae almost 15 years
    What do you exactly mean, HLGEM? You think i do a wrong (or inefficent) use of the ids? please explain.. im here to learn!
  • Tom H
    Tom H almost 15 years
    @DaNieL - Since you're keeping the ID values, you should be in good shape
  • Strae
    Strae almost 15 years
    I must keep the original id from p_*, becose there are other tables using the orders and items data :)