Why is it not good to have a primary key on a join table?

12,157

Solution 1

Some notes:

  1. The combination of category_id and post_id is unique in of itself, so an additional ID column is redundant and wasteful
  2. The phrase "not good to have a primary key" is incorrect in the screencast. You still have a Primary Key -- it is just made up of the two columns (e.g. CREATE TABLE foo( cid, pid, PRIMARY KEY( cid, pid ) ). For people who are used to tacking on ID values everywhere this may seem odd but in relational theory it is quite correct and natural; the screencast author would better have said it is "not good to have an implicit integer attribute called 'ID' as the primary key".
  3. It is redundant to have the extra column because you will place a unique index on the combination of category_id and post_id anyway to ensure no duplicate rows are inserted
  4. Finally, although common nomenclature is to call it a "composite key" this is also redundant. The term "key" in relational theory is actually the set of zero or more attributes that uniquely identify the row, so it is fine to say that the primary key is category_id, post_id
  5. Place the MOST SELECTIVE column FIRST in the primary key declaration. A discussion of the construction of b(+/*) trees is out of the scope of this answer ( for some lower-level discussion see: http://www.akadia.com/services/ora_index_selectivity.html ) but in your case, you'd probably want it on post_id, category_id since post_id will show up less often in the table and thus make the index more useful. Of course, since the table is so small and the index will be, essentially, the data rows, this is not very important. It would be in broader cases where the table is wider.

Solution 2

A DBA would tell you that the primary key in this case is actually the combination of the two FK columns. Since Rails/ActiveRecord doesn't play nice with composite PKs (by default, at least), that may be the reason.

Solution 3

The combination of foreign keys can be a primary key (called a composite primary key). Personally I favour using a technical primary key instead of that (auto number field, sequence, etc). Why? Well, it makes it much easier to identify the record, which you may need to do if you're going to delete it.

Think about it: if you're going to present a Webpage of all the linkages, having a primary key to identify the record makes it much easier.

Solution 4

Basically because there's no need for it. The combination of the two foreign key field adequately uniquely identifies any row.

But that merely says why it's not a Good Idea.... but why would it be a Bad Idea?

Consider the overhead adding a identity column would add. The table would take up 50% more disk space. Worse is the index situation. With a identity field, you have to maintain the identity count, plus a second index. You'll be tripling the disk space and tripling the work the needs to be performed on every insert. With the only advantage being a slightly shorter WHERE clause in a DELETE command.

On the other hand, If the composite key fields are the entire table, then the index can be the table.

Solution 5

It is a bad idea not to have a primary key on any table, period (if the DBMS is a relational DBMS - or an SQL DBMS). Primary keys are a crucial part of the integrity of your database.

I suppose if you don't mind your database being inaccurate and providing incorrect answers every so often, then you could do without...but most people want accurate answers from their DBMS and for such people, primary keys are crucial.

Share:
12,157
pez_dispenser
Author by

pez_dispenser

Updated on June 18, 2022

Comments

  • pez_dispenser
    pez_dispenser almost 2 years

    I was watching a screencast where the author said it is not good to have a primary key on a join table but didn't explain why.

    The join table in the example had two columns defined in a Rails migration and the author added an index to each of the columns but no primary key.

    Why is it not good to have a primary key in this example?

    create_table :categories_posts, :id => false do |t|
      t.column :category_id, :integer, :null => false
      t.column :post_id, :integer, :null => false
    end
    add_index :categories_posts, :category_id
    add_index :categories_posts, :post_id
    

    EDIT: As I mentioned to Cletus, I can understand the potential usefulness of an auto number field as a primary key even for a join table. However in the example I listed above, the author explicitly avoids creating an auto number field with the syntax ":id => false" in the "create table" statement. Normally Rails would automatically add an auto-number id field to a table created in a migration like this and this would become the primary key. But for this join table, the author specifically prevented it. I wasn't sure why he decided to follow this approach.