PostgreSQL Replication

6,175

Solution 1

Short answer - there's no such solution yet for PostgreSQL if you need online readonly slaves.

There're two major development projects currently going on in this area which are included in PostgreSQL 9.0 (Spring/Summer 2010), namely:

  • Synchronous Replication:

http://wiki.postgresql.org/wiki/NTT's_Development_Projects

  • Read only hot standby slaves:

http://wiki.postgresql.org/wiki/Hot_Standby

which in combination aim to achieve the ease of use of MySQL-style replication minus the bugs/issues MySQL has plus the reliability users know from PostgreSQL.

All of this was kicked off by a manifest from the PostgreSQL Core Team in 2008:

http://archives.postgresql.org/pgsql-hackers/2008-05/msg00913.php

The PostgreSQL replication solutions to this day with the largest user base are Slony-I (more expensive for writes, makes schema changes fiddly), WAL shipping/walmgr (Slaves can't be used online) and pgQ/londiste from Skype/Skytools (more tools/building blocks than a finished solution).

I've written a few things on Log Shipping, walmgr and Slony-I, see

http://blogs.amd.co.at/mt/mt-search.cgi?blog_id=1&tag=pgrep&limit=20 for more information.

Solution 2

And to throw another solution into the ring: rubyrep.

To compare with your requirements:

  • Easy setup
    Yes, that's actually the primary focus of rubyrep.
  • Simplistic failover
    Yes. In fact rubyrep does master-master replication - to fail over, no action is necessary at all. Just start using the other database.
  • Schema changes don't break replication
    Yes.
    For non-primary key changes replication doesn't even have to stop (but make sure the schema is changes on both sides at the same time)
    To add / remove tables, simply restart the replication daemon. Only changing the primary key column of a table takes a bit of effort.
  • Adding a new database to the server is seamless (i.e. like mysql, you can replicate a whole DB server, so a new database is created on the master, it automatically propagates to the slave)
    This is only supported in a limited way: each rubyrep setup replicates only one database at a time. (But it is very easy to set up replication for more than one database.)

Solution 3

You didn't mention having a hot read-slave as a requirement, so I'm going to propose using Heartbeat with either shared storage or DRBD. It just does the right thing and administration is a breeze. It's the Linux equivalent of older Microsoft SQL Server clustering. One node is active and the other node is passive while the data is shared between the two. You don't have to worry about SQL-based replication because it's all handled lower down at the block level.

Seriously, it's by far the best solution if you don't need read slaves. The WAL archive stuff was hokey at best and you must set everything up again if you ever disrupt the shipping process for a server reboot. slony and londiste don't cut the mustard. If you want to stay on the main source tree and not go commercial, Heartbeat is your best bet.

Solution 4

If you want asynchronous master/slave replication consider Londiste (part of the skytools package from Skype) wiki.postgresql.org/wiki/Londiste_Tutorial

It's easy to install, adding a new DB is easy, replication just "catches up."

Failover is not built-in though. You will need to change your application connection strings or obfuscate the DB connection behind another layer of software.

Some schema changes are easy. Others are more difficult. It depends on your application. The next version of skytools (ver 3.0) is supposed to handle DDL and include facilities to make failover easier.

We moved to Londiste after finding Slony too painful to use.

Solution 5

From your requirements it seems that PITR is easiest way to solve your problem:

On-line backup and point-in-time recovery (PITR)

You didn't say that you need to query slave server, so PITR might be just right.

It's standard part of PostgreSQL from version 8.0 so you probably already have everything needed to get it up and running.

If you find instructions too verbose, take a look at SkyTools WalMgr which will make process of creating/failover to hot-standby data single command task.

For more complex replication scenarios, I had good experience Slony-1, but PostgreSQL has many good replication/HA options available.

Share:
6,175

Related videos on Youtube

f4nt
Author by

f4nt

Updated on September 17, 2022

Comments

  • f4nt
    f4nt over 1 year

    We constantly bat this around the office, and the question continues to come up. How do you deal with PostgreSQL replication? I'm not even necessarily talking about advanced clusters, just keeping it simple with Master-Slave, Master-MultiSlave, and Master-Master. I find that setting it up for MySQL is typically pretty simple. Failover is straightforward if not perfect, especially for how easy it is to configure. We've played with Slony, but it's a bit too hands on (schema changes require intervention, new databases require intervention, etc). PGPool2 was pretty nice, until a node went down and we couldn't find a graceful way (other than bringing everything down and reseeding the fallen node) to get replication back in sync. Basically here's what I'm typically looking for:

    • Easy setup (I'll settle for difficult setup, but easy to expand)
    • Simplistic failover
    • Bringing a fallen node back in just requires time (i.e. like mysql. Server goes down, you bring it up, and wait for replication to catch up)
    • Schema changes don't break replication
    • Adding a new database to the server is seamless (i.e. like mysql, you can replicate a whole DB server, so a new database is created on the master, it automatically propagates to the slave)

    MySQL handles most of these fairly well, but I hold a certain fondness for PostgreSQL. Besides, we have some situations where it's our only option, and we'd like to add replication to the mix. What are you using currently, and how do you feel about your solution? This isn't a MySQL versus PostgreSQL post, I promise, because that's not what I'm trying to start. :)

    • Dave Cheney
      Dave Cheney almost 15 years
      I'm interested in the answer to this. Coming from a MySQL background, the replication options for PSQL are agricultural to say the least.
    • f4nt
      f4nt almost 15 years
      Yeah, so far every option I've played with has had significant drawbacks. Hoping I'm missing something obvious.. but I don't think I am
    • Vinko Vrsalovic
      Vinko Vrsalovic almost 15 years
      I suspect there is nothing else, but I'm eager for somebody to prove me wrong
    • Vinko Vrsalovic
      Vinko Vrsalovic almost 15 years
      BTW, have you tried [email protected]?
  • Dave Cheney
    Dave Cheney almost 15 years
    and those options are ... ?
  • dpavlin
    dpavlin almost 15 years
    ...listed in blog post blog.endpoint.com/2009/05/competitors-to-bucardo-version-1.h‌​tml referenced in one of answers...
  • dpavlin
    dpavlin almost 15 years
    this answer is duplicate of PITR suggestion, since PITR uses WAL :-)
  • user1871402
    user1871402 over 12 years
    Synchronous Replication + Hot Standby are now available - see wiki.postgresql.org/wiki/… for a good summary of the available techniques