How to duplicate schemas in PostgreSQL

Solution 1

You can probably do it from the command line without using files:

pg_dump -U user --schema='fromschema' database | sed 's/fromschema/toschema/g' | psql -U user -d database

Note that sed replaces every occurrence of the schema name in the dump, including any that appear inside your data, so it may alter row contents.
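
One way to narrow the replacement is to rewrite only schema-qualified references instead of every occurrence of the string. The following is a minimal sketch, not a definitive fix: rename_schema_refs is a hypothetical helper name, and it assumes the schema name is a plain unquoted identifier and that pg_dump schema-qualifies object names in its output. It still cannot tell SQL from row data, so a data value containing "fromschema." would also be rewritten.

```shell
# rename_schema_refs FROM TO
# Hypothetical helper (not part of PostgreSQL): reads a dump on stdin and
# rewrites only "FROM." prefixes and "SCHEMA FROM" clauses.
# Assumes FROM is a plain unquoted identifier with no regex metacharacters.
rename_schema_refs() {
  local from="$1" to="$2"
  sed -E \
    -e "s/(^|[^[:alnum:]_])${from}\./\1${to}./g" \
    -e "s/(SCHEMA )${from}([ ;]|\$)/\1${to}\2/g"
}

# Possible usage, mirroring the one-liner above:
#   pg_dump -U user --schema='fromschema' database \
#     | rename_schema_refs fromschema toschema \
#     | psql -U user -d database
```

A string literal such as DEFAULT 'fromschema' is left alone because it is followed by neither a dot nor preceded by the keyword SCHEMA.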

Solution 2

I would use pg_dump to dump the schema without data:

-s
--schema-only

Dump only the object definitions (schema), not data.

This option is the inverse of --data-only. It is similar to, but for historical reasons not identical to, specifying --section=pre-data --section=post-data.

(Do not confuse this with the --schema option, which uses the word "schema" in a different meaning.)

To exclude table data for only a subset of tables in the database, see --exclude-table-data.

pg_dump $DB -p $PORT -n $SCHEMA -s -f filename.pgsql

Then rename the schema in the dump (search & replace) and restore it with psql.

psql $DB -f filename.pgsql

Foreign key constraints that reference tables in other schemas still point to those schemas after the copy.
References to tables within the same schema point to the corresponding tables in the copied schema.

Solution 3

My problem was the same, with one addition: I needed to clone a schema, create a new database user, and assign ownership of all objects in the new schema to that user.

For the following example let's assume that the reference schema is called ref_schema and the target schema new_schema. The reference schema and all the objects within are owned by a user called ref_user.

1. dump the reference schema with pg_dump:

pg_dump -n ref_schema -f dump.sql database_name

2. create a new database user with the name new_user:

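CREATE USER new_user;

3. rename the schema ref_schema to new_schema:

ALTER SCHEMA ref_schema RENAME TO new_schema;

4. change ownership of all objects in the renamed schema to the new user (note that REASSIGN OWNED transfers every object ref_user owns in the current database, not only those in this schema):

REASSIGN OWNED BY ref_user TO new_user;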

5. restore the original reference schema from the dump

psql -f dump.sql database_name
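
Steps 2 to 4 above can be sketched as a single transaction, so that a failure partway through leaves the reference schema untouched. This is only an illustration: clone_owner_sql is a hypothetical helper name, and it assumes all four names are plain lowercase identifiers, which it checks before printing any SQL.

```shell
# clone_owner_sql REF_SCHEMA NEW_SCHEMA REF_USER NEW_USER
# Hypothetical helper: prints the SQL for steps 2-4 wrapped in a transaction.
# It only prints SQL; nothing is executed until the output is piped to psql.
clone_owner_sql() {
  local LC_ALL=C
  local ref_schema="$1" new_schema="$2" ref_user="$3" new_user="$4"
  local name
  # Reject anything that is not a plain lowercase identifier.
  for name in "$ref_schema" "$new_schema" "$ref_user" "$new_user"; do
    case $name in
      ''|[!a-z_]*|*[!a-z0-9_]*)
        echo "unsafe identifier: $name" >&2
        return 1 ;;
    esac
  done
  cat <<SQL
BEGIN;
CREATE USER $new_user;
ALTER SCHEMA $ref_schema RENAME TO $new_schema;
REASSIGN OWNED BY $ref_user TO $new_user;
COMMIT;
SQL
}

# Possible usage between the dump (step 1) and the restore (step 5):
#   clone_owner_sql ref_schema new_schema ref_user new_user \
#     | psql -v ON_ERROR_STOP=1 database_name
```

With ON_ERROR_STOP set, psql stops at the first failing statement and the open transaction is rolled back when the session ends.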

I hope someone finds this helpful.

Solution 4

A bit late to the party, but some SQL here could help you along your way:

get schema oid:

-- namespace_id:
SELECT oid
  FROM pg_namespace
 WHERE nspname = '<schema name>';

get table's oid:

-- table_id (use oid, not relfilenode: relfilenode is the on-disk file
-- number and can differ from the table's oid, e.g. after a table rewrite):
SELECT oid
  FROM pg_class
 WHERE relnamespace = '<namespace_id>' AND relname = '<table_name>';

get foreign key constraints:

SELECT con.conname, pg_catalog.pg_get_constraintdef(con.oid) AS condef
  FROM pg_catalog.pg_constraint AS con
  JOIN pg_class AS cl ON cl.relnamespace = con.connamespace AND cl.oid = con.conrelid
 WHERE con.conrelid = '<table_id>'::pg_catalog.oid AND con.contype = 'f';

A good resource for PostgreSQL system tables can be found here. Additionally, you can learn more about the internal queries pg_dump makes to gather dump information by viewing its source code.

Probably the easiest way to see how pg_dump gathers all your data would be to use strace on it, like so:

$ strace -f -e sendto -s8192 -o pg_dump.trace pg_dump -s -n <schema>
$ grep -oP '(SET|SELECT)\s.+(?=\\0)' pg_dump.trace

You'll still have to sort through the morass of statements, but it should help you piece together a cloning tool programmatically and avoid having to drop to a shell to invoke pg_dump.

Author by

Cristhian Boujon

Updated on March 18, 2021

Comments

  • Cristhian Boujon
    Cristhian Boujon about 3 years

    I have a database with the schemas public and schema_A. I need to create a new schema, schema_b, with the same structure as schema_A. I found the function below; the problem is that it does not copy the foreign key constraints.

    CREATE OR REPLACE FUNCTION clone_schema(source_schema text, dest_schema text)
      RETURNS void AS
    $BODY$
    DECLARE
      object text;
      buffer text;
      default_ text;
      column_ text;
    BEGIN
      EXECUTE 'CREATE SCHEMA ' || dest_schema ;
    
      -- TODO: Find a way to make the correct table the owner of this sequence.
      FOR object IN
        SELECT sequence_name::text FROM information_schema.SEQUENCES WHERE sequence_schema = source_schema
      LOOP
        EXECUTE 'CREATE SEQUENCE ' || dest_schema || '.' || object;
      END LOOP;
    
      FOR object IN
        SELECT table_name::text FROM information_schema.TABLES WHERE table_schema = source_schema
      LOOP
        buffer := dest_schema || '.' || object;
        EXECUTE 'CREATE TABLE ' || buffer || ' (LIKE ' || source_schema || '.' || object || ' INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING DEFAULTS)';
    
        FOR column_, default_ IN
          SELECT column_name::text, REPLACE(column_default::text, source_schema, dest_schema) FROM information_schema.COLUMNS WHERE table_schema = dest_schema AND table_name = object AND column_default LIKE 'nextval(%' || source_schema || '%::regclass)'
        LOOP
          EXECUTE 'ALTER TABLE ' || buffer || ' ALTER COLUMN ' || column_ || ' SET DEFAULT ' || default_;
        END LOOP;
      END LOOP;
    
    END;
    $BODY$ LANGUAGE plpgsql;
    

    How can I clone/copy schema_A with the foreign key constraints?

    • IdanDavidi
      IdanDavidi about 6 years
      If you want to clone the schema with SQL query, check out this answer.
  • Cristhian Boujon
    Cristhian Boujon over 10 years
    Yes, thank you! But I wanted to avoid working with files. I'm looking for a quick way to do it.
  • rabbitt
    rabbitt over 9 years
    Here's a python class I just wrote up to perform a schema clone (including data copy) using all SQL and no shelling out to pg_dump: https://gist.github.com/rabbitt/97f2c048d9e38c16ce62
  • dani24
    dani24 almost 6 years
    How can I avoid affecting the data?
  • Ratan Uday Kumar
    Ratan Uday Kumar over 5 years
    I need to copy without data. What is the solution? I need only the structure.
  • Eric
    Eric over 3 years
    @dani24 You can dump to a file and then use grep -E "^[[:digit:]]+.*fromschema.*" dumpfile | wc -l to count the occurrences of the schema name in the data rows. If the result is greater than 0, you have data that contains the 'fromschema' string.
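
The check in the last comment only counts rows that begin with a digit. A variation that counts matches on every row inside the COPY data blocks of a plain-text dump could look like the sketch below. count_schema_in_data is a hypothetical helper name, and the sketch assumes the default pg_dump plain format, where each data block starts with a "COPY ... FROM stdin;" line and ends with a line containing only "\.".

```shell
# count_schema_in_data SCHEMA_NAME DUMP_FILE
# Hypothetical helper: prints how many data rows (lines inside COPY blocks)
# contain SCHEMA_NAME as a substring. DDL lines outside the blocks are ignored.
count_schema_in_data() {
  awk -v name="$1" '
    /^COPY .* FROM stdin;$/    { in_data = 1; next }   # start of a data block
    in_data && /^\\\.$/        { in_data = 0; next }   # "\." ends the block
    in_data && index($0, name) { hits++ }              # row mentions the name
    END { print hits + 0 }
  ' "$2"
}

# Possible usage:
#   pg_dump -U user --schema='fromschema' -f dumpfile database
#   count_schema_in_data fromschema dumpfile
```

A result of 0 suggests that a plain search and replace on the dump would not touch any row data.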