How to implement a many-to-many relationship in PostgreSQL?
The SQL DDL (data definition language) statements could look like this:
CREATE TABLE product (
product_id serial PRIMARY KEY -- implicit primary key constraint
, product text NOT NULL
, price numeric NOT NULL DEFAULT 0
);
CREATE TABLE bill (
bill_id serial PRIMARY KEY
, bill text NOT NULL
, billdate date NOT NULL DEFAULT CURRENT_DATE
);
CREATE TABLE bill_product (
bill_id int REFERENCES bill (bill_id) ON UPDATE CASCADE ON DELETE CASCADE
, product_id int REFERENCES product (product_id) ON UPDATE CASCADE
, amount numeric NOT NULL DEFAULT 1
, CONSTRAINT bill_product_pkey PRIMARY KEY (bill_id, product_id) -- explicit pk
);
I made a few adjustments:
-
The n:m relationship is normally implemented by a separate table -
bill_product
in this case. -
I added
serial
columns as surrogate primary keys. In Postgres 10 or later consider anIDENTITY
column instead. See:- Safely rename tables using serial primary key columns
- Auto increment table column
- https://www.2ndquadrant.com/en/blog/postgresql-10-identity-columns/
I highly recommend that, because the name of a product is hardly unique (not a good "natural key"). Also, enforcing uniqueness and referencing the column in foreign keys is typically cheaper with a 4-byte
integer
(or even an 8-bytebigint
) than with a string stored astext
orvarchar
. -
Don't use names of basic data types like
date
as identifiers. While this is possible, it is bad style and leads to confusing errors and error messages. Use legal, lower case, unquoted identifiers. Never use reserved words and avoid double-quoted mixed case identifiers if you can. -
"name" is not a good name. I renamed the column of the table
product
to beproduct
(orproduct_name
or similar). That is a better naming convention. Otherwise, when you join a couple of tables in a query - which you do a lot in a relational database - you end up with multiple columns named "name" and have to use column aliases to sort out the mess. That's not helpful. Another widespread anti-pattern would be just "id" as column name.
I am not sure what the name of abill
would be.bill_id
will probably suffice in this case. -
price
is of data typenumeric
to store fractional numbers precisely as entered (arbitrary precision type instead of floating point type). If you deal with whole numbers exclusively, make thatinteger
. For example, you could save prices as Cents. -
The
amount
("Products"
in your question) goes into the linking tablebill_product
and is of typenumeric
as well. Again,integer
if you deal with whole numbers exclusively. -
You see the foreign keys in
bill_product
? I created both to cascade changes:ON UPDATE CASCADE
. If aproduct_id
orbill_id
should change, the change is cascaded to all depending entries inbill_product
and nothing breaks. Those are just references without significance of their own.
I also usedON DELETE CASCADE
forbill_id
: If a bill gets deleted, its details die with it.
Not so for products: You don't want to delete a product that's used in a bill. Postgres will throw an error if you attempt this. You would add another column toproduct
to mark obsolete rows ("soft-delete") instead. -
All columns in this basic example end up to be
NOT NULL
, soNULL
values are not allowed. (Yes, all columns - primary key columns are definedUNIQUE NOT NULL
automatically.) That's becauseNULL
values wouldn't make sense in any of the columns. It makes a beginner's life easier. But you won't get away so easily, you need to understandNULL
handling anyway. Additional columns might allowNULL
values, functions and joins can introduceNULL
values in queries etc. -
Read the chapter on
CREATE TABLE
in the manual. -
Primary keys are implemented with a unique index on the key columns, that makes queries with conditions on the PK column(s) fast. However, the sequence of key columns is relevant in multicolumn keys. Since the PK on
bill_product
is on(bill_id, product_id)
in my example, you may want to add another index on justproduct_id
or(product_id, bill_id)
if you have queries looking for a givenproduct_id
and nobill_id
. See: -
Read the chapter on indexes in the manual.
![Radu Gheorghiu](https://i.stack.imgur.com/I4r3M.jpg?s=256&g=1)
Radu Gheorghiu
First ever day at work == first ever day on StackOverflow. (13 Mar 2012) For my full professional experience please have a look at my LinkedIn profile. If you write questions on StackOverflow and you want to post sample data structured into tables, use this link. My interests revolve around everything data oriented. Some of the general subjects I'm interested in, and all their aspects are: Relational Databases - SQL Server & Oracle Artificial Intelligence, Machine Learning Big Data tools (Spark, Hadoop) Some of the things I've played with when starting seriously with StackOverflow: Who from my location am I better than at SQL? Who from my location is better than me at SQL? (scores to beat) Am I the best SQL Developer from Cluj Napoca yet? Milestones: over 1000 rep (1009) on 18th July 2013 over 1500 rep (1518) on 9th October 2013 over 2000 rep (2003) on 30th October 2013 over 2500 rep (2506) on 23rd January 2014 over 3000 rep (3003) on 26th May 2014 over 3500 rep (3507) on 14th October 2014 over 4000 rep (4016) on 13th December 2014 over 4500 rep (4507) on 4th May 2015 over 5000 rep (5004) on 10th June 2015
Updated on August 01, 2020Comments
-
Radu Gheorghiu almost 4 years
I believe the title is self-explanatory. How do you create the table structure in PostgreSQL to make a many-to-many relationship.
My example:
Product(name, price); Bill(name, date, Products);
-
Marc B over 12 yearsremove products from the bill table, create a new table called "bill_products" with two fields: one pointing at products, one pointing at bill. make those two fields this new table's primary key.
-
Radu Gheorghiu over 12 yearsSo bill_products(bill, products); ? And both of them PK?
-
Marc B over 12 yearsyeah. they'd be individually an FK pointing at their respective tables, and together they'd be the PK for the new table.
-
Radu Gheorghiu over 12 yearsSo, bill_product(product references product.name, bill references bill.name , (product, bill) primary key) ?
-
Marc B over 12 yearsThey'd point to whateve the PK fields of the Product and Bill tables would be.
-
mercury over 2 yearspivot table between the tables
-
-
codyLine over 9 yearsHow can I create an index for the mapping table
bill_product
? Normally it should looks like:CREATE INDEX idx_bill_product_id ON booked_rates(bill_id, product_id)
. Is this right? -
Erwin Brandstetter over 9 years@codyLine: This index is created automatically by the PK.
-
Christian over 8 years@ErwinBrandstetter: Shouldn't be created an index on bill_product for product_id column?
-
Erwin Brandstetter over 8 years@ChristianB.Almeida: That's useful in many cases, yes. I added a bit about indexing.
-
Jakov over 6 yearsWhy is the amount column inside the composite table? Shouldn't it be in the bill table?
-
Erwin Brandstetter about 6 years@Jakov: There is only 1 row for each bill in table
bill
. We need the amount per added item inbill_product
. -
Jakov about 6 years@ErwinBrandstetter Oh - the amount of same items on a bill. I thought it was the total price of the bill. My bad.