MySQL JOIN 3 tables using multiple columns/keys

13,554

Solution 1

Taken from your comment above

A USER may select many products, add them to their CART; a single USER may have multiple CARTS and at the end of the event, they can EMAIL the cart to themselves; the ACTIONS of the user are stored in the actions table

This is how I see your the structure (having in mind your data)

+---------------------+     +---------------------+     +---------------------+
| users               |     | carts               |     | actions             |
+---------------------+     +---------------------+     +---------------------+
| user_id       [PK]  |--|  | cart_id       [PK]  |     | impression_id [PK]  |
| email               |  |--| user_id       [FK]  |     | action_name         |
|                     |     | product_id    [FK]  |  |--| session_id    [FK]* |
+---------------------+     | session_id    [FK]* |--|  |                     |
                            |                     |     +---------------------+
                            +---------------------+    

As you can see below, I'm joining first with carts and them with actions because only the table carts has both, user and session data.

The [FK]* next to the session_id on carts and actions could seem as a foreign key but in this case it's not - 'cause there's no separate table for sessions where it would be placed as an PK (primary key)

You asked about join - it is the same as inner join. INNER JOIN creates a new result table by combining column values of two tables (A and B) based upon the join-predicate. The query compares each row of A with each row of B to find all pairs of rows which satisfy the join-predicate.

This is a possible content of the tables

+------------------------+
| users                  |
+------------------------+
| id   | email           |
+------+-----------------+
| 1    | [email protected]  |
| 2    | [email protected] |
| 3    | [email protected]  |
+------+-----------------+

+------------------------------------------+
| carts                                    |
+------------------------------------------+
| id   | user_id | product_id | session_id |
+------+---------+------------+------------+
| 1    | 1       | 5          | 1aaaa      |
| 2    | 2       | 5          | 2ffff      |
| 3    | 3       | 8          | 3ddddd     |
| 4    | 1       | 5          | 1aaaaa     |
| 5    | 3       | 9          | 3bbbbb     |
| 6    | 1       | 6          | 1ccccc     |
+------+---------+------------+------------+

+-------------------------------+
| actions                       |
+-------------------------------+
| id   | name      | session_id |
+------+-----------+------------+
|  1   | ADD       | 1aaaa      |
|  2   | ADD       | 2ffff      |
|  3   | SENDMAIL  | 3ddddd     |
|  4   | ADD       | 3ddddd     |
|  5   | SENDMAIL  | 2ffff      |
|  6   | ADD       | 1aaaaa     |
|  7   | REMOVE    | 3ddddd     |
|  8   | ADD       | 1ccccc     |
|  9   | ADD       | 3bbbbb     |
| 10   | SENDMAIL  | 3bbbbb     |
+------+-----------+------------+

As you can see, there's six products in the table carts and exactly six add actions in the table actions. Furthermore, as you can see user with an id=1 bought three products but not at the same time, since there are two sessions; user with an id=3 as well, bought these two products in different times etc...

The SQL statement

SELECT u.user_id, c.session_id, c.cart_id, a.impression_id, a.action_name, u.email
FROM users AS u
INNER JOIN carts AS c ON c.user_id = u.user_id
INNER JOIN actions AS a ON a.session_id = c.session_id
ORDER BY u.user_id, c.session_id, c.cart_id

Results:

+---------+------------+---------+---------------+-------------+-----------------+
| user_id | session_id | cart_id | impression_id | action_name | email           |
+---------+------------+---------+---------------+-------------+-----------------+
| 1       | 1aaaa      | 1       | 1             | ADD         | [email protected]  |
| 1       | 1aaaa      | 1       | 6             | ADD         | [email protected]  |
| 1       | 1aaaa      | 4       | 1             | ADD         | [email protected]  |
| 1       | 1aaaa      | 4       | 6             | ADD         | [email protected]  |
| 1       | 1cccc      | 6       | 8             | ADD         | [email protected]  |
| 2       | 2ffff      | 2       | 5             | SENDMAIL    | [email protected] |
| 2       | 2ffff      | 2       | 2             | ADD         | [email protected] |
| 3       | 3bbbb      | 5       | 9             | ADD         | [email protected]  |
| 3       | 3bbbb      | 5       | 10            | SENDMAIL    | [email protected]  |
| 3       | 3dddd      | 3       | 3             | SENDMAIL    | [email protected]  |
| 3       | 3dddd      | 3       | 4             | ADD         | [email protected]  |
| 3       | 3dddd      | 3       | 7             | REMOVE      | [email protected]  |
+---------+------------+---------+---------------+-------------+-----------------+

Note: There's no guarantee for session uniqueness.

(Updated) Working SQL Fiddle


UPDATE: (Finding and deleting duplicates)

I've updated the SQL Fiddle in order to simulate duplicate records (when user added the same product within the same session). With the following statement you'll be able to retrieve those duplicated rows.

SELECT c.card_id, c.user_id, c.product_id, c.session_id, a.action_name, a.impression_id
FROM cards As c
INNER JOIN actions AS a ON a.session_id = c.session_id
GROUP BY c.user_id, c.product_id, c.session_id, a.action_name
HAVING count(*) > 1

Results:

+---------+------------+------------+------------+-------------+-----------------+
| card_id | user_id    | product_id | session_id | action_name | impression_id   |
+---------+------------+------------+------------+-------------+-----------------+
| 1       | 1          | 5          | 1aaaa      | ADD         | 1               |
| 6       | 1          | 6          | 1cccc      | ADD         | 8               |
+---------+------------+------------+------------+-------------+-----------------+

In the SELECT part of the statement above you may omit everything except card_id and impression_id. Deleting these two duplicates in one statement is a bit tricky since you can't modify the same table selected in a sub-query within the same query. I would avoid the tricky part in this case (which involves another inner sub-query) and would delete duplicates using separate statements as following

-- delete duplicates from cards
--
DELETE FROM WHERE card_id IN (1,6)

-- delete duplicates from actions
--
DELETE FROM WHERE card_id IN (1,8)

Even better, you could check if the user already has been added a selected product and don't add it twice.

Solution 2

Excuse my MySql syntax, as I don't know it :-p But this is the idea

SELECT u.userId, a.session_id, c.cartId, a.impressionAction, a.impressionId, u.email
FROM Carts c 
JOIN Users u on u.userId = c.UserId
JOIN Actions a on a.session_id = c.session_id

This will just merge everything together, and you'll have duplicate cart records if you have many to 1 relationships

Share:
13,554
eager_learner313
Author by

eager_learner313

Updated on June 04, 2022

Comments

  • eager_learner313
    eager_learner313 almost 2 years

    Complete newbie to mySQL. So any help will be appreciated.

    I have 3 tables -- carts, users, actions.

    carts:
    +------------+-------------+-------+
    | cartId     | session_id  | userId| 
    +------------+-------------+-------+
    
    users:
    +----------+-------------+
    | usedId   | email       |
    +----------+-------------+
    
    actions:
    +-------------+------------------+---- ---------+
    | session_id  | impressionAction | impressionId | 
    +-------------+------------------+-----+--------+
    

    In carts, there is one session_id per line.

    In users, there is one userId per line.

    In actions, there are multiple lines per session_id counting for all the actions for that session.

    I would like to JOINthe three tables getting the output to be something like

    +------+-------------+--------+------------------+--------------+-------+
    userId | session_id  | cartId | impressionAction | impressionId | email |
    +------+-------------+--------+------------------+--------------+-------+
    

    Where there will be multiple lines per userId and session_id; essentially a flattened file. I think if we JOIN carts and users on userId resulting in say A and then JOIN A and actions' onsession_id`, we are home.

    A sample expected output is:

    +------------+-------------+--------+------------------+--------------+---------+
    userId       | session_id  | cartId | impressionAction | impressionId | email   |
    +------------+-------------+--------+------------------+--------------+---------+
    | 1234       | abc3f45     | 0001   | LOGIN            | 2032         |[email protected]|
    | 1234       | abc3f45     | 0001   | ADD              | 4372         |[email protected]|
    | 1234       | abc3f45     | 0001   | ADD              | 4372         |[email protected]|
    | 1234       | abc3f45     | 0001   | SENDMAIL         | [email protected]    |[email protected]| 
    | 4567       | def4rg4     | 0002   | LOGIN            | 2032         |[email protected]|
    | 4567       | def4rg4     | 0002   | ADD              | 4372         |[email protected]|
    | 4567       | def4rg4     | 0002   | REMOVE           | 3210         |[email protected]|
    +------------+-------------+--------+------------------+--------------+---------+** 
    

    I don't know how to JOIN 3 tables without one common key. I don't even know what type of join it is called.

    Essentially, we are trying to join 3 tables with non-overlapping keys, gathering one common key through the first JOIN and then joining the intermediate with the third one. Is this called a CROSS JOIN? If no, is there a name?

  • Jenn
    Jenn almost 10 years
    There are many things that are common between the various DBMSs. All of the keywords in your query are allowed in MySql.
  • eager_learner313
    eager_learner313 almost 10 years
    @Relevant: how do I take out the duplicates and have the output similar to one the I posted?
  • Relevant
    Relevant almost 10 years
    You have duplicates in the sample output. What I mean by duplicates is that you have cartid 0001 listed multiple times.
  • Relevant
    Relevant almost 10 years
    So then does this solution get you what you need?
  • eager_learner313
    eager_learner313 almost 10 years
    Real quick, as far as the mechanism of joins go, is the following happening? carts is joined with users to form, say, carts_user table which then gets joined with actions to form the final table carts_users_actions? also, does the order of the tables matter? Could I have done: FROM carts AS c INNER JOIN users AS u ON c.user_id = u.user_id INNER JOIN actions AS a ON a.session_id = c.session_id ORDER BY u.user_id, c.session_id, c.cart_id ? that is, put carts as the first table?
  • hex494D49
    hex494D49 almost 10 years
    @eager_learner313 In general, join order does matter but not if you use inner join; though, that isn't a proper perspective, at least to me, since the user is the one to start an action, not the cart itself :) When using join think like Honestly, I was thinking about this, and somehow I don't like joining tables on non primary / foreign keys. (c.session <-> a.session in this case);
  • hex494D49
    hex494D49 almost 10 years
    Furthermore, there's no guarantee for session uniqueness (if you use the one provided out-of-the-box) and it could cause performance issue since it's a varchar value. What do you use on server-side (php, java, c#) and could you make a print-screen of your database schema - maybe I could suggest a better design :)
  • eager_learner313
    eager_learner313 almost 10 years
    thanks for the clarifications! I think there were some typos to the message starting with "In general, join order".. it stops making sense after the first sentence or so. Can you fix that? :)
  • hex494D49
    hex494D49 almost 10 years
    @eager_learner313 Oh sorry, I see... the half of sentence is gone. I wanted to say, when using join think like you're making a new table, then this table can be joined based on some key to another, ans so one... Another advice, don't use varchar for session_id in this case - generate an integer or convert the given varchar session to an integer; and mark it as unique column.
  • eager_learner313
    eager_learner313 almost 10 years
    a follow-up question. I am seeing that the same product gets added TWICE in a single session by mistake. I would like to remove duplicates of the impressionId where ImpressionAction = ADDfor the same cartId. Not that the same product may be added in 2 different carts under a single session. For eg., session 1aaa can have the impressionId =6 (which is the product ID) added twice for the cartId = 1. How do I remove this duplicate pair?
  • hex494D49
    hex494D49 almost 10 years
    @eager_learner313 Hi :) Check the updated answer; the last section about finding and deleting duplicated records.
  • eager_learner313
    eager_learner313 almost 10 years
    You are so cool! Thanks for the update. However, I think it quite get it. Why is cartId = 6 showing up? It's the same userId but under different session_id. Also, I am not quite sure about how to go about and manually deleting the duplicates as mentioned in your deleting duplicates from carts and actions. If we can remove duplicates from actions and then JOIN, we are home. Essentially, only have distinct impression_ids under any single session_id when impressionAction = ADD. Is there any way we may chat over here or on any other medium? I think that will be better.
  • hex494D49
    hex494D49 almost 10 years
    @eager_learner313 This is a general statement for retrieving duplicate records SELECT column FROM table GROUP BY column HAVING count(column) > 1 I was adding another duplicate record and that's why there are two records showing up (card_id 1 and 6). Since card_id and impression_id are auto-increment columns you should check user_id, product_id, session_id and action_name in order to find duplicates. Although finding and deleting might be written as a procedure I prefer doing it using a helper routine in PHP (in this case). I'll update the answer in the meantime.
  • eager_learner313
    eager_learner313 almost 10 years
    Got it! I am on EST. What time zone are you on?
  • eager_learner313
    eager_learner313 over 9 years
    sent you an email. Did you get it by any chance?
  • hex494D49
    hex494D49 over 9 years
    @eager_learner313 I just saw it, 'cause I don't check this account every day. Let me read it and I'll write you back.
  • eager_learner313
    eager_learner313 over 9 years
    Is there an email that you check more frequently? If so, would you mind sharing that in a reply to my email? (NOT HERE)
  • hex494D49
    hex494D49 over 9 years
    @eager_learner313 I replied to your last email. You didn't get it?
  • eager_learner313
    eager_learner313 over 9 years
    Nope. Just checked it. Any chance you can resend? Sorry about this; not sure why I am not getting it. I also checked my SPAM folder