Log shipping?!

PostgreSQL offers support for "shipping" it's WAL, the Write Ahead Log, where the changes of every transaction are recorded, to other database systems. The other database system then reads the changes from the WAL file and applies the changes to it's local data store.

Log shipping has the drawback that the slave servers can't be used for queries as long as they are replicating data and cannot be put back in replication after they've been taken online. Additionally the replication isn't very granular, PostgreSQL natively itself will accept only completed WAL files.

On the other hand this mechanism is very efficient and very reliable since the WAL is at the core of normal PostgreSQL operation.

The WAL

The WAL files of a PostgreSQL database can be found under $PGDATA/pg_xlog, in Debian $PGDATA is usually /var/lib/postgresql/<VERSION>/<CLUSTERNAME>. Every WAL segment is 16MiB in size (compile-time default) and it's name consists of three separate counters:

Naming

If we take the name "00000001000000030000008E" it tells us that the timeline of the file is "1", that it belongs to the logical log file (logid) "3" and that it's the 142th (0x8E) segment of the given logfile.

The segment counter increments with every segment switch, the logical log file is incremented (and the segment counter reset to 0) whenever a new segment would overflow the 32bit address space (or "4GiB") of a logical logfile. With a standard segment size of 16MiB this happens every 255 segments.

Switching segments

A WAL segment gets switched when one of the following things happen:

it's full (16MiB worth of changes have been written)
archive_timeout is exceeded
pg_switch_xlog is called

Replicating

The mechanism used for reading in WAL files on a slave server is very close to the mechanism that is used when PostgreSQL recovers from an unclean shutdown:

The daemon doesn't know in what state the heap files (tables, indexes, etc.) are and therefore consults the WAL, where changes of every transaction are written to, replaying every transaction since the last CHECKPOINT.

Because the same code-infrastructure is used, the replaying of WAL files is called "recovery mode".

Shipping the files

PostgreSQL has an archive_command parameter which can be used to configure a command which gets called after every segment switch. This makes it easy to copy completed WAL segments from the master server to a remote system with various mechanisms, e.g. nfs, scp, rsync, etc.

Recovering

To configure a server for recovery you need to place a file named "recovery.conf" into it's $PGDATA directory. A sample recovery.conf might look something like this:

restore_command = '/usr/lib/postgresql/8.3/bin/pg_standby -l -t /var/lib/postgresql/logship.trigger /srv/logship-archive %f %p'
log_restartpoints = 'true'

# for PITR
#recovery_target_time = '2009-04-21 19:00:00'

Additionally the server needs a consistent backup in it's $PGDATA directory and access to all WAL files that have been written since the backup.

When started in recovery mode, PostgreSQL will replay WAL files until the program referenced in restore_command returns. After that it will take the database online, increment the timeline counter of the WAL file and effectively prevent that the current database can be used as target for recovery again. This is necessary, because modifications can happen to the tables as soon as the database is taken online.

pg_standby

pg_standby is a contrib tool that watches a given directory for new WAL files and makes these available to PostgreSQL via copying/linking the given files into it's pg_xlog directory.

When using pg_standby there are two main mechanisms for ending replication:

"Pulling the trigger", meaning: creating the specified trigger file
Feeding an incomplete WAL-file: Imagine a crashed server that doesn't boot anymore: if you could salvage the active WAL segment and copy it to the recovery server, PostgreSQL will notice that the WAL segment is incomplete and perform it's normal startup procedure as well as incrementing the timeline.

Resuming recovery

After a slave server has been taken online (and it's timeline was switched) you must copy a backup from the master server and create a new recovery.conf to resume log shipping operation.

Doing it all

Now that we know what to do and how these things work, let's break a few things!

Preparing

Preparing the environments should be rather easy, first make sure, that your machines are setup correctly.

When both machines are running, run the following command:

master1:~/pgworkshop# ./envorcer logship

This creates a cluster named "logship" on both servers, creates a database for pgexerciser on master1 and installs it's schema to the database.

Additionally, it creates a directory on slave1 where the WAL files will be copied to, enables archive_mode among a few other settings on master1 and copies a base backup of the database & an appropriate recovery.conf to slave1.

Doing some transactions

Start the databases on both servers with pg_ctlcluster and run pgexerciser (no arguments needed) on master1.

archive_timeout is set to 60 seconds, so a logswitch should occur every minute. This can be monitored in a few places:

The "archiver" process on master1 and the "startup" process on slave1 will show in their processtitle what WAL file they have handled or are expecting next
PostgreSQL also keeps track of which files have already been copied on master1 in $PGDATA/pg_xlog/archive_status
The PostgreSQL logfile on slave1 (found in /var/log/postgresql/) will show when the WAL files have been processed

Breaking stuff

Now it's up to you. You could either create the trigger file pg_standby watches, "killall -9 postgres" on the master and copy over the active WAL segment or try a PITR (Point in time recovery)

Resuming recovery, this time for real

After you took the slave online, use the following steps to get back into recovery mode:

slave1:~# killall -9 postgres
master1:~# psql postgres -c "select pg_start_backup('foo')"
master1:~# rsync -avH --delete --delete-excluded --exclude pg_xlog/*  /var/lib/postgresql/8.3/logship/ root@slave1:/var/lib/postgresql/8.3/logship
master1:~# psql postgres -c "select pg_stop_backup()"
master1:~# scp pgworkshop/configs/logship/recovery.conf root@slave1:/var/lib/postgresql/8.3/logship/

When you start the PostgreSQL cluster on slave1 again, it should start in recovery mode again. More on backing up PostgreSQL databases can be found in the documentation.

Testing PostgreSQL replication solutions: Log shipping with pg_standby

Log shipping?!

The WAL

Naming

Switching segments

Replicating

Shipping the files

Recovering

pg_standby

Resuming recovery

Doing it all

Preparing

Doing some transactions

Breaking stuff

Resuming recovery, this time for real

Tags:

No TrackBacks

Search

About this Entry

Categories

Monthly Archives

Pages