MySQL: Understanding mapping tables

Solution 1

The only realistic way to handle a many-to-many relationship is with a mapping table.

Let's say we have a school with teachers and students. A student can have multiple teachers and vice versa.

So we make 3 tables:

student
  id unsigned integer auto_increment primary key
  name varchar

teacher
  id unsigned integer auto_increment primary key
  name varchar

link_st
  student_id unsigned integer not null
  teacher_id unsigned integer not null
  primary key (student_id, teacher_id)

The student table will have 1000 records
The teacher table will have 20 records
The link_st table will have as many records as there are links (NOT 20x1000, but only for the actual links).

Selection
You select e.g. students per teacher using:

SELECT s.name, t.name
FROM student s
INNER JOIN link_st l ON (l.student_id = s.id)   -- first link student to the link table
INNER JOIN teacher t ON (l.teacher_id = t.id)   -- then link teacher to the link table
ORDER BY t.id, s.id

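Normally you should use an inner join here, so that only students who actually have a teacher (and teachers who actually have a student) are returned.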
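To try the selection end to end, here is a minimal sketch. It assumes Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL, so the column types are simplified, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE link_st (
        student_id INTEGER NOT NULL REFERENCES student(id),
        teacher_id INTEGER NOT NULL REFERENCES teacher(id),
        PRIMARY KEY (student_id, teacher_id)
    );
    -- Hypothetical sample rows.
    INSERT INTO student (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO teacher (id, name) VALUES (1, 'Jones'), (2, 'Smith');
    -- Only the links that actually exist are stored: 3 rows, not 3 x 2 = 6.
    INSERT INTO link_st (student_id, teacher_id) VALUES (1, 1), (2, 1), (1, 2);
""")

# Students per teacher, going through the link table.
students_per_teacher = conn.execute("""
    SELECT t.name, s.name
    FROM student s
    INNER JOIN link_st l ON (l.student_id = s.id)
    INNER JOIN teacher t ON (l.teacher_id = t.id)
    ORDER BY t.id, s.id
""").fetchall()

for teacher_name, student_name in students_per_teacher:
    print(teacher_name, student_name)
```

Cy has no row in link_st, so the inner join leaves him out of the result.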

Making a link
When you assign a teacher to a student (or vice versa, it's the same thing), you only need to do:

INSERT INTO link_st (student_id, teacher_id) 
   SELECT s.id, t.id 
   FROM student s 
   INNER JOIN teacher t ON (t.name = 'Jones')
   WHERE s.name = 'kiddo'

This is a bit of a misuse of an inner join, but it works as long as the names are unique.
If you know the IDs, you can of course insert those directly.
If the names are not unique, this inserts a link for every matching student/teacher pair, so it should not be used.
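A rough sketch of why unique names matter, again assuming Python's sqlite3 with in-memory SQLite in place of MySQL. The helper link_by_name and the sample rows are hypothetical, not part of the original answer.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE link_st (
        student_id INTEGER NOT NULL,
        teacher_id INTEGER NOT NULL,
        PRIMARY KEY (student_id, teacher_id)
    );
    INSERT INTO student (id, name) VALUES (785, 'kiddo'), (786, 'other');
    INSERT INTO teacher (id, name) VALUES (548, 'Jones'), (549, 'Smith');
""")

def link_by_name(student_name, teacher_name):
    """Insert a link for every (student, teacher) pair matching the names."""
    conn.execute("""
        INSERT INTO link_st (student_id, teacher_id)
        SELECT s.id, t.id
        FROM student s
        INNER JOIN teacher t ON (t.name = ?)
        WHERE s.name = ?
    """, (teacher_name, student_name))

# Names are unique at this point, so exactly one link is created.
link_by_name("kiddo", "Jones")
links_unique = conn.execute(
    "SELECT student_id, teacher_id FROM link_st ORDER BY student_id, teacher_id"
).fetchall()

# A second teacher called Jones makes the name ambiguous: two links are created.
conn.execute("INSERT INTO teacher (id, name) VALUES (550, 'Jones')")
link_by_name("other", "Jones")
links_ambiguous = conn.execute(
    "SELECT student_id, teacher_id FROM link_st "
    "WHERE student_id = 786 ORDER BY teacher_id"
).fetchall()

print(links_unique)
print(links_ambiguous)
```

The second call links the student to both teachers named Jones, which is almost certainly not what was intended. Inserting by ID avoids the problem.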

How to avoid duplicate links
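It's very important to avoid duplicate links: every join through the link table would return the same pair more than once and throw off counts.
The composite primary key on (student_id, teacher_id) shown above already rejects duplicates. If your link table does not have such a key, you can declare a unique index on the link (recommended):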

ALTER TABLE link_st
  ADD UNIQUE INDEX s_t (student_id, teacher_id); 

Or you can do the check in the insert statement (not really recommended, but it works):

INSERT INTO link_st (student_id, teacher_id)
  SELECT s.id, t.id
  FROM student s
  INNER JOIN teacher t ON (t.id = 548)
  LEFT JOIN link_st l ON (l.student_id = s.id AND l.teacher_id = t.id)
  WHERE (s.id = 785) AND (l.student_id IS NULL)

This will only select (785, 548) if that pair is not already in the link_st table, and will return nothing if it is. So it will refuse to insert duplicate values. Note that link_st has no id column, so the NULL check is done on l.student_id.
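Both approaches can be tried side by side with a small sketch. It assumes Python's sqlite3 with in-memory SQLite in place of MySQL. Because link_st has only the two ID columns here, the anti-join tests l.student_id for NULL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE link_st (
        student_id INTEGER NOT NULL,
        teacher_id INTEGER NOT NULL
    );
    CREATE UNIQUE INDEX s_t ON link_st (student_id, teacher_id);
    INSERT INTO student (id, name) VALUES (785, 'kiddo');
    INSERT INTO teacher (id, name) VALUES (548, 'Jones');
    INSERT INTO link_st (student_id, teacher_id) VALUES (785, 548);
""")

# Option 1: the unique index rejects a plain duplicate insert.
try:
    conn.execute("INSERT INTO link_st (student_id, teacher_id) VALUES (785, 548)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Option 2: the LEFT JOIN finds the existing link, so the SELECT yields no rows
# and nothing is inserted.
conn.execute("""
    INSERT INTO link_st (student_id, teacher_id)
    SELECT s.id, t.id
    FROM student s
    INNER JOIN teacher t ON (t.id = 548)
    LEFT JOIN link_st l ON (l.student_id = s.id AND l.teacher_id = t.id)
    WHERE (s.id = 785) AND (l.student_id IS NULL)
""")

link_count = conn.execute("SELECT COUNT(*) FROM link_st").fetchone()[0]
print(rejected, link_count)
```

The index is the safer of the two because it protects the table no matter which statement does the insert.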

If you add a table of schools, the design depends on whether a student can be enrolled in multiple schools (unlikely, but let's assume so) and whether teachers can work at multiple schools (very possible). If both can, you again need a mapping table:

table school
  id unsigned integer auto_increment primary key
  name varchar

table school_members
  id unsigned integer auto_increment primary key
  school_id unsigned integer not null
  member_id unsigned integer not null
  is_student boolean not null

You can list all students in a school like so:

SELECT s.name
FROM school i
INNER JOIN school_members m ON (i.id = m.school_id)
INNER JOIN student s ON (s.id = m.member_id AND m.is_student = true)
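A minimal sketch of the school lookup, assuming Python's sqlite3 with in-memory SQLite in place of MySQL. The school names, the sample members, and the added WHERE filter on a single school are illustrative assumptions.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE school_members (
        id INTEGER PRIMARY KEY,
        school_id INTEGER NOT NULL,
        member_id INTEGER NOT NULL,
        is_student INTEGER NOT NULL  -- 1 = student, 0 = teacher
    );
    -- Hypothetical sample rows.
    INSERT INTO school (id, name) VALUES (1, 'North High'), (2, 'South High');
    INSERT INTO student (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO teacher (id, name) VALUES (1, 'Jones');
    INSERT INTO school_members (school_id, member_id, is_student) VALUES
        (1, 1, 1),  -- student Ann at North High
        (1, 2, 1),  -- student Bob at North High
        (1, 1, 0),  -- teacher Jones at North High (member_id refers to teacher)
        (2, 3, 1);  -- student Cy at South High
""")

def students_in(school_name):
    """Return the names of all students enrolled in the given school."""
    rows = conn.execute("""
        SELECT s.name
        FROM school i
        INNER JOIN school_members m ON (i.id = m.school_id)
        INNER JOIN student s ON (s.id = m.member_id AND m.is_student = 1)
        WHERE i.name = ?
        ORDER BY s.id
    """, (school_name,)).fetchall()
    return [name for (name,) in rows]

north_students = students_in("North High")
print(north_students)
```

The is_student flag decides which table member_id points into, so the teacher row with member_id 1 is not mistaken for student Ann.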

Solution 2

When I join the Category table and Business table to create the mapping table would this then give me a table which contains every possible business and category relationship?

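Yes, if you join the two tables without a join condition you get every possible combination (a cross join). A mapping table is not built that way, though: it is a separate table that stores only the relationships that actually exist.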

Would I have to go through all listings (800,000) marking them as true or false?

No. Use the ON clause to set the join conditions, so that only the rows present in the mapping table are matched:

SELECT <columns> FROM categories AS c
INNER JOIN mapping AS m
    ON m.CategoryId = c.CategoryId
INNER JOIN businesses AS b
    ON m.BusinessId = b.BusinessId
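To see the difference between "every possible relationship" and "the relationships that exist", here is a small sketch. It assumes Python's sqlite3 with in-memory SQLite in place of MySQL, and a tiny made-up directory of 3 categories and 4 businesses instead of 800 and 1000.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (CategoryId INTEGER PRIMARY KEY, CategoryName TEXT NOT NULL);
    CREATE TABLE businesses (BusinessId INTEGER PRIMARY KEY, BusinessName TEXT NOT NULL);
    CREATE TABLE mapping (
        BusinessId INTEGER NOT NULL,
        CategoryId INTEGER NOT NULL,
        PRIMARY KEY (BusinessId, CategoryId)
    );
    -- Hypothetical sample rows.
    INSERT INTO categories VALUES (1, 'Plumbers'), (2, 'Bakers'), (3, 'Florists');
    INSERT INTO businesses VALUES (1, 'Acme'), (2, 'Rolls'), (3, 'Petal'), (4, 'Pipe & Bun');
    -- Only real relationships are stored; business 4 is in two categories.
    INSERT INTO mapping (BusinessId, CategoryId) VALUES
        (1, 1), (2, 2), (3, 3), (4, 1), (4, 2);
""")

# Every possible pairing: what a join without a condition would produce.
possible = conn.execute(
    "SELECT COUNT(*) FROM categories CROSS JOIN businesses"
).fetchone()[0]

# What the mapping table actually holds.
actual = conn.execute("SELECT COUNT(*) FROM mapping").fetchone()[0]

# Businesses per category, matched through the mapping table.
listing = conn.execute("""
    SELECT c.CategoryName, b.BusinessName
    FROM categories AS c
    INNER JOIN mapping AS m
        ON m.CategoryId = c.CategoryId
    INNER JOIN businesses AS b
        ON m.BusinessId = b.BusinessId
    ORDER BY c.CategoryId, b.BusinessId
""").fetchall()

print(possible, actual)
print(listing)
```

Nothing has to be marked true or false: a pairing that is not in the mapping table simply does not show up in the join.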

Solution 3

You should use a mapping table when you are trying to model a many-to-many relationship. A one-to-many relationship only needs a foreign key column.

For example, in an address book application, a particular contact could belong to zero, one or many categories. If your business logic says that a contact can only belong to one category, you would define your tables like this:

Contact
--------------
contactid (PK)
name
categoryid (FK)

Category
--------------
categoryid (PK)
categoryname

But if you wanted to allow a contact to belong to more than one category, use a mapping table:

Contact
--------------
contactid (PK)
name

Category
--------------
categoryid (PK)
categoryname

Contact_Category
--------------
contactid (FK)
categoryid (FK)

Then you can use SQL to retrieve a list of categories that a contact is assigned to:

select a.categoryname
from Category a, Contact b, Contact_Category c
where a.categoryid=c.categoryid and b.contactid=c.contactid and b.contactid=12345;

Or, equivalently, with explicit joins, which are easier to read:

select a.categoryname 
from Category a
inner join Contact_Category c on a.categoryid=c.categoryid
inner join Contact b on b.contactid=c.contactid
where b.contactid=12345;
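As a quick check of the explicit-join version, here is a minimal sketch. It assumes Python's sqlite3 with in-memory SQLite in place of MySQL. The contact and category rows are made up, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contact (contactid INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Category (categoryid INTEGER PRIMARY KEY, categoryname TEXT NOT NULL);
    CREATE TABLE Contact_Category (
        contactid INTEGER NOT NULL REFERENCES Contact(contactid),
        categoryid INTEGER NOT NULL REFERENCES Category(categoryid),
        PRIMARY KEY (contactid, categoryid)
    );
    -- Hypothetical sample rows.
    INSERT INTO Contact VALUES (12345, 'Alice'), (222, 'Bob');
    INSERT INTO Category VALUES (1, 'Friends'), (2, 'Work'), (3, 'Family');
    -- Alice is in two categories, Bob in one.
    INSERT INTO Contact_Category VALUES (12345, 1), (12345, 2), (222, 3);
""")

def categories_for(contact_id):
    """Return the category names a contact is assigned to."""
    rows = conn.execute("""
        select a.categoryname
        from Category a
        inner join Contact_Category c on a.categoryid = c.categoryid
        inner join Contact b on b.contactid = c.contactid
        where b.contactid = ?
        order by a.categoryname
    """, (contact_id,)).fetchall()
    return [name for (name,) in rows]

alice_categories = categories_for(12345)
print(alice_categories)
```

Passing the contact ID as a bound parameter rather than pasting it into the SQL string is a design choice of this sketch, not something the original query requires.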

Solution 4

You only put the real relationships in the mapping table. So if a business is in 2 categories on average, then in your example there would only be 2,000 records in the mapping table, not 800,000.

"When I join the Category table and Business table to create the mapping table": you don't join those two tables to create the mapping table. You create an actual physical table.

Author by

Richard Bell


Updated on February 05, 2020

Comments

  • Richard Bell
    Richard Bell over 4 years

    When building a category navigation system for a business directory with a many to many relationship, I understand that it is good practise to create a mapping table.

    Category Table ( CategoryId, CategoryName )
    Business Table ( BusinessId, BusinessName )
    Category Mapping Table ( BusinessId, CategoryId )

    When I join the Category table and Business table to create the mapping table would this then give me a table which contains every possible business and category relationship?

    I have 800 categories and 1000 business listings. Would that then give me a table containing 800,000 possible relationships? If so, how would I focus on only the relationships that exist? Would I have to go through all listings (800,000) marking them as true or false?

    I have been getting really confused about this so any help would be much appreciated.

  • Johan
    Johan almost 13 years
    Please avoid using implicit where joins. They are confusing, error-prone and bad for your mental health. Bury them in 1989 where they belong and use explicit joins instead.
  • Richard Bell
    Richard Bell almost 13 years
    Thank you very much Johan for a highly informative and well written answer. I think I am getting it. Just a quick question: After INNER JOIN link_st what does the l do? is that just an abbreviation for use in the ON clause?
  • phant0m
    phant0m almost 13 years
    It's a capital i, not an minor L. It aliases the table so you can refer to it without having to use the table name: l.student_id instead of link_st.student_id
  • Johan
    Johan almost 13 years
    @phant0m, it's a minor L, not a capital i, why would I alias a table called link with an i?
  • Johan
    Johan almost 13 years
    @Richard, I alias the table, because I'm lazy and don't want to type the full tablename.
  • phant0m
    phant0m almost 13 years
    Oh well.. that's a fail... I just got super-confused with all the contrasts between is, is not, capital not capital... In the example, I did it correct though, haha ---- Just switch the not to the other place :)
  • Johan
    Johan almost 13 years
    @phant0m, Ah well, using a minor l is a fail in and of itself I guess.
  • Richard Bell
    Richard Bell almost 13 years
    What would happen if you were then to scale this up and add a table called schools? Would you join this with the existing link_st table? so that when looking through records you could go SCHOOL > TEACHER > STUDENTS.
  • Richard Bell
    Richard Bell almost 13 years
    Great! Thanks a lot Johan. In order to create the links, will I have to write the insert statement for every relationship? I think I would create a form in PHP to insert rows into the database if this is the case.
  • Johan
    Johan almost 13 years
    @Richard: will I have to write the insert statement for every relationship? Yes.