Should many to many tables have a primary key?

17,505

Solution 1

I agree with everything Oded said except

"It can't reasonably be used as a foreign key either."

In this case it's a pick your poison, the mapping table absolutely can be a parent, it's just a matter of the child using a multicolumn FK or not.

Take a simple case of Car and color. Each Year Auto Makers have a certain pallet of colors and each model only comes in limited number of those colors. Many - Many :: Colors to Cars models

So now design the Order table where new cars orders are stored. Clearly Color and Model will be on the Order table. If you make a FK to each of those tables, the database will permit an incorrect model/color combination to be selected. (Of course you can enforce this with code, you can't do so declaratively.) If you make the parent be the many:many table, you'll only get combinations that have been specified.

SO would you rather have a multicolumn FK and point to a PK built on both ModelID and ColorID or do you want a single column FK?

Pick your poison.

EDIT

But if it's not a parent of something, no table needs a surrogate key.

Solution 2

Such a surrogate key adds nothing except overhead.

Use the natural keys, make them a composite primary key if you care about duplication in this table.

To expand:

In the application, this key will be meaningless and will remain unused.

In the database, it will have no function, as you can't reasonably use it in a query for any type of meaningful result.

It can't reasonably be used as a foreign key either.

Solution 3

If the table tracking the many to many relationship has it's own primary key and that key is used as a foreign key anywhere else in the database then you create a dependency on that relationship. The relationship can never be removed.

For instance in the car color example, if the color for a car is ever discontinued (removed from the many to many relationship table) then any table (i.e. purchase history) referencing the primary key would be broken.

Solution 4

I have done it both ways. Sometimes it is beneficial for adding a feature down the road. For instance, if there was ever a time that a row in the table would ever contain anything more than just the 2 id's. If you don't lack space I would put one in there just because it can't hurt. Sometimes it can interfere with ORM tools like hibernate or ADO.NET but that is minor.

So to sum it up... PROS 1. Allows potential future growth.

CONS 1. Space 2. Confuses some ORM tools.

Solution 5

The term "join table" is often used but I don't think I've ever seen it properly defined or explained. Personally I avoid using that term. As I understand it, a "join table" means any table with two foreign keys (or possibly more than two?).

I think the criteria for selecting keys in a table with more than one foreign key should be much the same as in any other table. Ask yourself what dependencies you need to enforce, what is unique and irreducible. Select keys on the criteria of Familiarity, Stability and Simplicity. Add surrogate keys only when you have a good reason to.

Share:
17,505
ashes999
Author by

ashes999

A long-time game developer and professional software developer.

Updated on June 03, 2022

Comments

  • ashes999
    ashes999 about 2 years

    If I have two objects that have a many-to-many relationship, I would typically model them in my database schema with a many-to-many table to relate the two. But should that many-to-many table (or "join table") have a primary key of its own (integer auto-incremented)?

    For example, I might have tables A and B, each with an ID, and a table called A_B that has a foreign key tuple of (A_ID, B_ID). But should A_B have a primary key auto-incremented ID column of its own, or not?

    What are the advantages and disadvantages of adding it? I personally like natural keys for many-to-many joins. But what added benefit would a primary key add?

  • David
    David over 13 years
    Regarding potential future growth, note that as soon as you add meaningful data to the association table then it becomes more than just persistence logic. It becomes its own entity. This could cause headaches in the domain. I think it's best to separate business entities from persistence logic and keep them separated. If you need to add data to an association table, re-think from a wider perspective what it is you're doing and if there's a better way, because the logical impact is greater than just adding a column to a table.
  • ashes999
    ashes999 over 13 years
    I concur. Although it's hard to think of an example where an M:M relation would become an entity with a unique primary key.
  • ashes999
    ashes999 over 13 years
    Yes, the only use I can see is that maybe if you want to limit deletion to one record -- but that would cause an additional lookup for the ID, assuming you had both foreign key IDs. I concur.
  • ashes999
    ashes999 over 13 years
    Precisely my argument. Real objects should have IDs to identify them as such.
  • David
    David over 13 years
    @ashes999: Ya, going to the original question I can't really think of a benefit to adding a separate primary key to such a table. @dko's suggestion is, in my opinion, more of a risk than a benefit. It may provide something of immediate value in certain situations, but I wouldn't be comfortable with the doors it opens up.
  • Oded
    Oded over 13 years
    @ashes999 - You shouldn't have duplicate rows in this table (or any table, for that matter). And if you don't, then deleting a single row will simply require both foreign keys.
  • dko
    dko over 13 years
    Fair enough, I've just never had a headache from deciding to have done it that way. I have had Entities that acted as a M:M relation that contained other useful data. This data sometimes was edited in it's own webpage where having a single ID to pass was handy.
  • ashes999
    ashes999 over 13 years
    I see what you're saying. For example, a blog that allows multiple authors could have a authors_post table, with a (later added) author_added_on_date field so we can go back into revisions and see who had permission when. Hmm. I would still say "add the surrogate key when you need it, not before."
  • dko
    dko over 13 years
    ahes999 - Yes that is the exact kind of situation i refer to. I would agree, you can always add one if you need it.