23. jan 2016 20:08 UTC
I was asked to provide a replica of a postgresql server running on FreeBSD. The replica was for reporting purposes, so it must be possible to run read-only queries on it, and it needs to be up-to-date at all times. In postgres land this is called a Hot Standby. Almost the same as a warm standby, except that it accepts connections and read-only queries. The replication had to be done to an Ubuntu 14.04.2 LTS machine running on Amazon. Both postgres servers are version 9.3.10. This page on the postgresql wiki helped a lot.
I made the following changes to the master servers postgresql.conf
wal_level = hot_standby max_wal_senders = 3
wal_level determnines the amount of information that goes into the WAL (write-ahead log). The default is "minimal" which has enough information for the master server to recover after an unclean shutdown, but not enough to do replication to another server. Setting this to "hot_standby" enables us to make a hot standby server using this one as a master.
max_wal_senders sets the number of WAL sender processes which controls the number of simultaneous replication clients that can be connected. I only have one replica but the pg_basebackup command used to initialize the standby below requires two connection slots because it uses --xlog-method=stream. Also, max_wal_senders can only be changed when restarting the server, so it is better to set it a bit higher than you need to than setting it too low.
I then add a replication user so the standby can connect to the master.
CREATE ROLE replication WITH REPLICATION PASSWORD 'correcthorsebatterystaple' LOGIN;
I also add an entry for the replication user to pg_hba.conf on the master. Note that the master server already had TLS enabled with "ssl = on" in postgresql.conf, which is a must since I am replicating to a server over the open internet:
# replication to amazon hostssl replication replication 52.18.x.y/32 password
Finally I add a db user called "postgres" on the master. This is because the unix postgres user on Ubuntu is called "postgres" but it is called "pgsql" on FreeBSD. It is convenient to have a db user with the same username as the unix user you are working as so you can do stuff like "psql -l" without having to specify a user (when working on the standby that is).
$ createuser postgres
That is all. The master needs a restart for the wal_* settings to take effect though.
First I try connecting with the replication user I created from the command line just to make sure all firewalls and stuff is all set up properly.
$ psql "host=postgres.example.com user=replication replication=on" Password: psql (9.3.10) SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256) Type "help" for help. replication=> \q $
I am now ready to bootstrap my standby by running pg_basebackup with the connection info used above. The -R switch tells it to create the recovery.conf file automatically for me. The -D switch tells it where to put the files. Ubuntu ran "initdb" for me when I installed postgres so I "rm -r /var/lib/postgresql/9.3/main" first. The --xlog-method=stream switch makes it get all the data needed so I don't have to transport transaction logs manually.
pg_basebackup -D /var/lib/postgresql/9.3/main -R --xlog-method=stream --dbname="host=postgres.blue.tyktech.dk user=replication replication=on"
The resulting files in /var/lib/postgresql/9.3/main is a copy of the master servers data dir. On FreeBSD the postgresql.conf and pg_hba.conf are inside the datadir but on Ubuntu the config files for postgres are in /etc/postgresql/9.3/main/ which means I can delete the following files:
rm /var/lib/postgresql/9.3/main/pg_hba.conf rm /var/lib/postgresql/9.3/main/postgresql.conf
The server is now a "warm standby" - I just need to add "hot_standby = on" to /etc/postgresql/9.3/main/postgresql.conf and it will be a hot standby (the only difference being that a hot standby accepts connections and can run readonly queries where a warm standby cannot). Then I start postgres on the standby and watch the log /var/log/postgresql/postgresql-9.3-main.log:
2016-01-23 19:29:11 UTC LOG: entering standby mode 2016-01-23 19:29:11 UTC LOG: redo starts at 2/EE271510 2016-01-23 19:29:11 UTC LOG: consistent recovery state reached at 2/EE276FB0 2016-01-23 19:29:11 UTC LOG: record with zero length at 2/EE276FB0 2016-01-23 19:29:11 UTC LOG: database system is ready to accept read only connections 2016-01-23 19:29:11 UTC LOG: started streaming WAL from primary at 2/EE000000 on timeline 1 2016-01-23 19:29:11 UTC LOG: incomplete startup packet
The "incomplete startup packet" message can be ignored, the important bit is the "consistent recovery state reached..." bit. I can now connect to the standby and run queries, mission accomplished!
The postgresql documentation is good but postgres has a LOT of different ways to do replication; log shipping, binary replication, a million 3rd party tools and so on. I often get confused as to what parts of the documentation relates to what kinds of replication. Other than that I am not that used to Ubuntu so I was looking at the wrong config file for a while. Other than that it mostly just worked.