Awhile back I setup native postgresql replication and load balancing using pgpool-II. I used a variety of sources, but never found anything with every step in one place. I wanted to document what I did in hopes it helps someone else.
Our goal is to have a master database (10.0.0.1
) which
replicates everything in real time to a slave database (10.0.0.2
)
and a client machine (10.0.0.3
) that will automatically spread
reads between the master and the slave while sending all writes to just
the master.
I used Debian 7, PostgreSQL 9.4.4, and pgpool-II 3.3.4. I also use Ansible to do all the work so I apoligize if my instructions aren't directly runnable via the command line.
Before You Start
Spend the time to double check the commands and read up on the fields and settings. Try it out somewhere other than production. Test. Test. Test.
I may well have missed something while documenting. Hope not, but you never know. Leave a comment and I'll see what I can do about fixing it.
Getting Started
On all three servers add following APT repository. This will ensure you pick up the latest and greatest postgresql packages.
deb http://apt.postgresql.org/pub/repos/apt/ wheezy-pgdg main
The Master Database
-
Install the
postgresql-9.4
package. -
Create a
pgpool
database user. -
Update
/etc/postgresql/9.4/main/postgresql.conf
:Note that I am only addressing settings necessary for replication. You may well want to tune other parameters to suit your application.
-
Make sure the server is available over the internet:
listen_addresses = 'localhost,10.0.0.1'
-
Ensure the master sends enough data for the slave to do replication:
wal_level = hot_standby
-
Set the number of replication slots. This is the maximum number of slaves that can be connected. Even though I only have one slave I set it to three in case I want to up it later I don't have to restart the master.
max_replication_slots = 3
-
Set the number of concurrent connections from standby servers. I chose to make it the same as number of replication slots (ie. one per slave).
max_wal_senders = 3
-
-
Update
/etc/postgresql/9.4/main/pg_hba.conf
to look like this:local all postgres peer local all all peer host all all 127.0.0.1/32 md5 host all all ::1/128 md5 host replication all 10.0.0.2/32 trust # the slave host postgres pgpool 10.0.0.3/32 trust # the client host all all 10.0.0.3/32 md5 # the client
-
The first four lines are the default configuration.
-
The 'replication' line allows the slave to connect to the master for replication purposes.
-
The 'pgpool' line is required because pgpool must be able to connect to the database to check the status (such as if this is a master or slave database). It needs very little access so can be restricted to just the
postgres
database. More on this later. -
The last line requires that any other connection from the client is md5 authenticated.
-
-
Run the following SQL to create the replication slot for the slave. The choice of
theslave
is somewhat arbitrary, but should probably match the hostname of the slave itself to help your sanity later.select * from pg_create_physical_replication_slot('theslave')
-
Restart postgresql.
The Slave Database
-
Install the
postgresql-9.4
package. -
Update
/etc/postgresql/9.4/main/postgresql.conf
:Note that I am only addressing settings necessary for replication. You may well want to tune other parameters to suit your application.
-
Make sure the server is available over the internet:
listen_addresses = 'localhost,10.0.0.2'
-
Ensure the slave can respond to read only queries:
wal_level = hot_standby hot_standby = on
-
Prevent query conflicts:
hot_standby_feedback = on
-
-
Update
/etc/postgresql/9.4/main/pg_hba.conf
to look like this:local all postgres peer local all all peer host all all 127.0.0.1/32 md5 host all all ::1/128 md5 host postgres pgpool 10.0.0.3/32 trust # the client host all all 10.0.0.3/32 md5 # the client
-
The first four lines are the default configuration.
-
The 'pgpool' line is required because pgpool must be able to connect to the database to check the status (such as if this is a master or slave database). It needs very little access so can be restricted to just the
postgres
database. More on this later. -
The last line requires that any other connection from the client is md5 authenticated.
-
-
Create a
pgpool
database user. -
Stop postgresql.
-
As the
postgres
user run the following command:cd ~/9.4 rm -rf main pg_basebackup -v -D main -R -P -h 10.0.0.1
Once that completes make sure that
~/9.4/main/recovery.conf
looks very similar to this. Specifically that the slot name matches the name used above when creating the replication slot.standby_mode = 'on' primary_slot_name = 'theslave' primary_conninfo = 'user=postgres host=10.0.0.1 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
-
Start postgresql.
The Client Machine
-
Install the
postgresql-client-9.4
andpgpool2
packages. -
Update
/etc/pgpool2/pgpool.conf
:-
As this is the client machine only listen on the local interface:
listen_addresses = ''
-
Set the port so PG clients will connect by default:
port = 5432
-
Set the backends. The first must be the master. At this point we're explicitly disabling failover. You can experiment with different weights to distribute the load.
backend_hostname0 = '10.0.0.1' backend_port0 = 5432 backend_weight0 = 1 backend_flag0 = 'DISALLOW_TO_FAILOVER' backend_hostname1 = '10.0.0.2 backend_port1 = 5432 backend_weight1 = 1 backend_flag1 = 'DISALLOW_TO_FAILOVER'
-
Use client authentication. Note this is pgpool authentication.
enable_pool_hba = on pool_passwd = 'pool_password'
-
I personally disabled connection caching, but feel free to play with it:
connection_cache = off
-
Disable replication:
replication_mode = off
-
Disable replicated select queries:
replicate_select = off
-
Enable load balancing:
load_balance_mode = on
-
Enable master/slave mode and set mode to 'stream' (as opposed to say 'slony'):
master_slave_mode = on master_slave_sub_mode = 'stream'
-
Set the replication check and heart beat users to connect as. Note that this is mandatory even if you disable replication checks. Also note that this is the same
pgpool
we specified in thepg_hba.conf
files above and why we had to create thepgpool
user at all. If you get weird connection failures, this is the likely candidate.sr_check_user = 'pgpool' wd_lifecheck_user = 'pgpool'
-
-
Update
/etc/pgpool2/pg_hba.conf
to include only this line so that all connections are authenticated.local all all md5
-
Add any user that needs to connect from this client through pgpool to the database with the same password used in the master database to
/etc/pgpool/pool_password
. Here we assume our user isfoo
and password isbar
.% pg_md5 -u foo -m bar
-
Ensure that
/etc/pgpool/pool_password
is owned bypostgres
, has grouppostgres
and permissions0600
. -
Restart pgpool.
Finishing Up
-
Test.
-
Test.
-
Test.
What Next?
-
Configure PostgreSQL replication failover with pgpool notification.
-
Questions, corrections? Comment below.