why not take a look at postgresql ? (5): high availability ?

July 7, 2012 — Leave a comment

the last post introduced how postgres implements wal archiving and how the archived logs can be used to recover the database. pretty equal to what oracle does. what about replicating these logs to another server and implement some sort of dataguard ? in postgres terms this is called: log shipping. let’s see how it works.

creating a hot standby server ( active dataguard in oracle words )

as oracle does, postgres provides two different approaches for standby servers:

  • transfer the whole wal files one by one to the standby server, which is asynchronous and may lead to data loss. this is called file based log shipping.
  • record based log shipping, which sends the wal records incrementally to the standby server ( streaming )

before deciding to implement one of the above solutions carefully think about the limitations:

  • all database servers involved in the configuration must be on the same architecture: log shipping from 32bit system to 64bit ( and vice versa ) is not possible
  • all databases involved in the configuration must use the same mountpoints. there is not feature like the various *_name_convert parameters as in oracle
  • all databases involved in the configuration must use the same major version of postgres. minor version differences might work but are not officially supported
  • if you want to upgrade to a new minor release, be sure to upgrade the standby database first
  • postgres requires the destination of the archived logs available on the master and standby server ( at least if you do not want downtime of the primary database )

to give you an idea on what’s happening, this is from the official documentation: “In standby mode, the server continuously applies WAL received from the master server. The standby server can read WAL from a WAL archive (see restore_command) or directly from the master over a TCP connection (streaming replication). The standby server will also attempt to restore any WAL found in the standby cluster’s pg_xlog directory. That typically happens after a server restart, when the standby replays again WAL that was streamed from the master before the restart, but you can also manually copy files to pg_xlog at any time to have them replayed.

At startup, the standby begins by restoring all WAL available in the archive location, calling restore_command. Once it reaches the end of WAL available there and restore_command fails, it tries to restore any WAL available in the pg_xlog directory. If that fails, and streaming replication has been configured, the standby tries to connect to the primary server and start streaming WAL from the last valid record found in archive or pg_xlog. If that fails or streaming replication is not configured, or if the connection is later disconnected, the standby goes back to step 1 and tries to restore the file from the archive again. This loop of retries from the archive, pg_xlog, and via streaming replication goes on until the server is stopped or failover is triggered by a trigger file.”

to avoid any connection issues between the master and the standby it may be a goog idea to disable the firewall on both nodes:

service iptables stop
chkconfig iptables off

for getting the extra tools to manage standby databases I additionally installed the contrib package:

yum install postgresql91-contrib

for me, the fastest way to get a standby server was to clone my virtual box vm I used during the previous post. this gave me exactly the same software and os. all steps that follow assume that both nodes are up and running and can connect to each other.

my primary node:
nodename: postgres
ip: 192.168.56.105

my slave node:
nodename: postgres2
ip: 192.168.56.106

as my vm was cloned, I stopped and deleted the database which was running on the clone:

pg_ctl stop -D /opt/postgres/mydb -m fast
[postgres@postgres2 ~]$ rm -rf /opt/postgres/mydb/*
[postgres@postgres2 ~]$ rm -rf /opt/postgres/backups/*

additionally I set up password less ssh connectivity for the postgres user between the nodes:

node postgres:

[postgres@postgres ~]$ ssh-keygen -t rsa
[postgres@postgres ~]$ ssh-copy-id -i .ssh/id_rsa.pub postgres@postgres2

node postgres2:

[postgres@postgres2 ~]$ ssh-keygen -t rsa
[postgres@postgres2 ~]$ ssh-copy-id -i .ssh/id_rsa.pub postgres@postgres2

for simulating a real life scenario I will assume the the primary database is already configured for archiving ( in real world one would archive to a network location accessible to all nodes ). to prepare this on the master:

echo "port=5432
max_connections = 100
shared_buffers = 32MB
log_destination = 'stderr'
logging_collector = on
log_directory = '/opt/postgres/log/'
log_filename = 'postgresql-%a.log'
log_file_mode = 0600
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
temp_tablespaces='temp1,temp2'
wal_level=hot_standby
archive_mode=on
archive_command = 'rsync -a %p postgres@postgres2:/opt/postgres/mydb/arch/%f'
max_wal_senders=5
listen_addresses='*'" > /opt/postgres/mydb/postgresql.conf
pg_ctl stop -D /opt/postgres/mydb -m fast
pg_ctl start -D /opt/postgres/mydb -l /opt/postgres/log/mydb_pgstartup.log

the parameter in question is the archive_command. if you need more information on this parameter check the documentation or the previous post. the reason for posting the whole configuration is, that the steps should be repeatable.

to allow the standby database to connect to the master, add the following line to your masters pg_hba.conf file:

echo "host    replication     all            192.168.56.106/32      md5" >> /opt/postgres/mydb/pg_hba.conf

for enabling this we need to restart the database ( there is no need to restart if archiving is already enabled and you just need to change the parameter. then a reload is sufficient ). if everything works as expected this will be the only restart. for real life scenarios this means: if the master database has wal archiving enabled you do not need to restart the database, thus no downtime will be necessary. if archiving is not enabled, what hopefully is not the case for any production database, you will need to request a short downtime for enabling it.

ok, this is the starting point: primary database running in archive mode, nothing available on the standby server except the archived wal files from the master:

[postgres@postgres2 ~]$ ls -la /opt/postgres/mydb/arch/
total 32776
drwxrwxr-x. 2 postgres postgres     4096 Jun 14 14:25 .
drwxr-xr-x. 6 postgres postgres     4096 Jun 13 13:29 ..
-rw-------. 1 postgres postgres 16777216 Jun 14 14:14 000000030000000000000041
-rw-------. 1 postgres postgres 16777216 Jun 14 14:23 000000030000000000000042
[postgres@postgres2 ~]$ 

now we are ready to clone the master database:

# on the master start the backup and copy the files to the standby database
psql -U sysdba -d postgres -c "SELECT pg_start_backup('backup');"
scp -pr /opt/postgres/mydb/* postgres@postgres2:/opt/postgres/mydb/
# enable recovery and start the standby database
echo "restore_command = 'cp -f /opt/postgres/mydb/arch/%f %p </dev/null'" > /opt/postgres/mydb/recovery.conf
echo "standby_mode = 'on'" >> /opt/postgres/mydb/recovery.conf
echo "primary_conninfo = 'host=192.168.56.105 user=sysdba password=admin123'" >> /opt/postgres/mydb/recovery.conf
echo "hot_standby=on" >> /opt/postgres/mydb/postgresql.conf
rm -f /opt/postgres/mydb/postmaster.pid
# start the standby database
pg_ctl start -D /opt/postgres/mydb -l /opt/postgres/log/mydb_pgstartup.log
# end backup on the primary database
psql -U sysdba -d postgres -c "SELECT pg_stop_backup();"

if everything went fine you should see log-messages in the standby’s logfile like this:

LOG:  streaming replication successfully connected to primary
LOG:  consistent recovery state reached at 0/55000000
LOG:  database system is ready to accept read only connections

you may double check if the standby is really in recovery mode by:

sysdba@[local]:5432/postgres# select * from pg_is_in_recovery();
 pg_is_in_recovery 
-------------------
 t
(1 row)
sysdba@[local]:5432/postgres*# 

of course you want to failover if something happens to your primary database. in postgres you do this by specifying the trigger_file parameter in the recovery.conf file:

echo "trigger_file='/opt/postgres/mydb/failover'" >> /opt/postgres/mydb/recovery.conf

if you need to failover just touch this file. the standby database will notice this and initiate the failover ( of course after applying all the necessary log files ):

touch /opt/postgres/mydb/failover

… which will result in:

LOG:  trigger file found: /opt/postgres/mydb/failover
FATAL:  terminating walreceiver process due to administrator command
...
LOG:  archive recovery complete
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

to further verify if the replication is really working create a table on the master, insert a row and check if this data is available on the standby afterwards:

# primary
psql -U sysdba -d db1
sysdba@[local]:5432/db1# create table t1 ( a int ) ;
sysdba@[local]:5432/db1*# insert into t1 (a) values (1);
INSERT 0 1
sysdba@[local]:5432/db1*# commit;
COMMIT
# check on the standby
psql -d db1 -U sysdba
\d+ t1
                  Table "public.t1"
 Column |  Type   | Modifiers | Storage | Description 
--------+---------+-----------+---------+-------------
 a      | integer |           | plain   | 
Has OIDs: no
sysdba@[local]:5432/db1# select * from t1;
 a 
---
 1
(1 row)

this are the basic steps to create a hot standby with postgres. although postgresql does not have a tool like rman, cloning a database is not a big deal, even if the source database is up and running.

of course you’ll need to carefully review all the parameters and test the setup before going to production.

Advertisements

No Comments

Be the first to start the conversation!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s