How to use triggers to prevent duplicate records in PostgreSQL?

postgresql database-design triggers constraints plpgsql

11,795

I previously had used a unique index on the record's fields, but would like to learn how a trigger is written to accomplish this.

This is a misunderstanding. If a set of columns is supposed to be unique, use a UNIQUE constraint (or make it the PK) in any case. And be aware of a special role for NULL values:

The rest of the answer is largely outdated. Since Postgres 9.5 added UPSERT there is a simpler solution:

INSERT INTO tbl (col1, col2, col3, col4)
VALUES (1, 2, 3, 4)
ON     CONFLICT ON CONSTRAINT my_4_col_uni DO NOTHING;

The rest is for Postgres 9.4 or older

Triggers can help to enforce the constraint. But they fail to enforce uniqueness on their own due to inherent race conditions.

You can just let the unique constraint handle duplicate keys. You'll get an EXCEPTION for violations. To avoid exceptions most of the time¹ you can use a simple trigger:

CREATE OR REPLACE FUNCTION tbl_ins_up_before()
  RETURNS trigger AS
$func$
BEGIN

IF EXISTS (SELECT 1 FROM tbl
           WHERE (col1,     col2,     col3,     col4)
           = (NEW.col1, NEW.col2, NEW.col3, NEW.col4)) THEN
   RETURN NULL;
END IF;

RETURN NEW;

END
$func$  LANGUAGE plpgsql;

CREATE TRIGGER ins_up_before
BEFORE INSERT OR UPDATE OF col1, col2, col3, col4  -- fire only when relevant
ON tbl
FOR EACH ROW EXECUTE PROCEDURE tbl_ins_up_before();

¹There is an inherent race condition in the time slice between checking if a row already exists and actually inserting the row, which cannot be avoided unless you lock the table exclusively (very expensive). Details depend on the exact definition of your constraint (may be deferrable). So you might still get an exception if a concurrent transaction also finds (at virtually the same moment) that (1,2,3,4) is not there yet and inserts before you. Or the operation might get aborted, but the existing row is deleted before you can commit.

This cannot be fixed with row-level locking either, because you cannot lock rows that aren't there yet (predicate locking) in Postgres up to version 9.6.

You need a unique constraint, which enforces uniqueness at all times.

I would have the constraint and then use this query:

INSERT INTO tbl (col1, col2, col3, col4)
SELECT 1, 2, 3, 4
WHERE  NOT EXISTS (
   SELECT 1 FROM tbl
   WHERE (col1, col2, col3, col4) = (1, 2, 3, 4);

Similar for UPDATE.

You could encapsulate INSERT / UPDATE in a plpgsql function and trap duplicate key violations. Example:

Is SELECT or INSERT in a function prone to race conditions?

11,795

Alan Wayne

Updated on June 04, 2022

Comments

Alan Wayne almost 2 years
I wish to create a stored procedure (in plpgsql, PostgreSQL 9.1) that first checks to be sure that the record which is going to be inserted is unique on four of its columns, or if a record is updated, that it is updated to unique values.
```
  Example:
    Record (1,2,3,4) is to be inserted.
    If Record (1,2,3,4) already exists, then do not insert a duplicate record.
    if Record (1,2,3,4) does not exist, then insert it.

    Record (1,2,3,4) is to be updated to (5,6,7,8).
    If Record (5,6,7,8) already exists, then do not update the record. (duplicate record not allowed).
    If Record (5,6,7,8) does not exist, then update the record to the new values.
```
I previously had used a unique index on the record's fields, but would like to learn how a trigger is written to accomplish this.
- Houari almost 9 years
  
  And, did you try something ?
- Str. almost 9 years
  
  Search the web for keywords postgresql and upsert
- a_horse_with_no_name almost 9 years
  
  Why don't you continue to use the unique index? That will be much more efficient
- Alan Wayne almost 9 years
  
  @Houari Tried and tried again ... failed miserably. (Not work reprinting here :)).
- Alan Wayne almost 9 years
  
  @Houari I don't know how (or if ) I could step through the lines to verify the procedure-- i.e., how to debug this?
- Alan Wayne almost 9 years
  
  @a_horse_with_no_name I could, but not sure how to deal with failure on insert/update within a stored procedure, and the fields are very large, so I am guessing the index would become incredibly large on the 100,000 or so records.
- Alan Wayne almost 9 years
  
  @Str "upsert" is interesting but not quit what I was looking for here. Thanks.
- a_horse_with_no_name almost 9 years
  
  100.000 rows is nothing. Don't worry about that number. Not sure what you mean with "the columns are very large" - if you have (four) large character values it seems rather uncommon to impose a unique constraint on them. Maybe if you described your underlying problem (including your real table definition) we can come up with a better solution.
- Alan Wayne almost 9 years
  
  @a_horse_with_no_name Being a newbie here, my columns are about 100 characters wide. Assuming a unique index on 4 such columns, at what point in column widths does an index in PostgreSQL become big enough to slow down insert/update/delete operations? Just curious. Thanks.