PostgreSQL Replication and Load Balancing with pgpool-II

August 28, 2015

Awhile back I setup native postgresql replication and load balancing using pgpool-II. I used a variety of sources, but never found anything with every step in one place. I wanted to document what I did in hopes it helps someone else.

Our goal is to have a master database (10.0.0.1) which replicates everything in real time to a slave database (10.0.0.2) and a client machine (10.0.0.3) that will automatically spread reads between the master and the slave while sending all writes to just the master.

I used Debian 7, PostgreSQL 9.4.4, and pgpool-II 3.3.4. I also use Ansible to do all the work so I apoligize if my instructions aren’t directly runnable via the command line.

Before You Start

Spend the time to double check the commands and read up on the fields and settings. Try it out somewhere other than production. Test. Test. Test.

I may well have missed something while documenting. Hope not, but you never know. Leave a comment and I’ll see what I can do about fixing it.

Getting Started

On all three servers add following APT repository. This will ensure you pick up the latest and greatest postgresql packages.

deb http://apt.postgresql.org/pub/repos/apt/ wheezy-pgdg main

The Master Database

  1. Install the postgresql-9.4 package.

  2. Create a pgpool database user.

  3. Update /etc/postgresql/9.4/main/postgresql.conf:

    Note that I am only addressing settings necessary for replication. You may well want to tune other parameters to suit your application.

    • Make sure the server is available over the internet:

      listen_addresses = 'localhost,10.0.0.1'
    • Ensure the master sends enough data for the slave to do replication:

      wal_level = hot_standby
    • Set the number of replication slots. This is the maximum number of slaves that can be connected. Even though I only have one slave I set it to three in case I want to up it later I don’t have to restart the master.

      max_replication_slots = 3
    • Set the number of concurrent connections from standby servers. I chose to make it the same as number of replication slots (ie. one per slave).

      max_wal_senders = 3
  4. Update /etc/postgresql/9.4/main/pg_hba.conf to look like this:

     local   all           postgres                 peer
     local   all           all                      peer
     host    all           all        127.0.0.1/32  md5
     host    all           all        ::1/128       md5
     host    replication   all        10.0.0.2/32   trust   # the slave
     host    postgres      pgpool     10.0.0.3/32   trust   # the client
     host    all           all        10.0.0.3/32   md5     # the client
    • The first four lines are the default configuration.

    • The ‘replication’ line allows the slave to connect to the master for replication purposes.

    • The ‘pgpool’ line is required because pgpool must be able to connect to the database to check the status (such as if this is a master or slave database). It needs very little access so can be restricted to just the postgres database. More on this later.

    • The last line requires that any other connection from the client is md5 authenticated.

  5. Run the following SQL to create the replication slot for the slave. The choice of theslave is somewhat arbitrary, but should probably match the hostname of the slave itself to help your sanity later.

     select * from pg_create_physical_replication_slot('theslave')
  6. Restart postgresql.

The Slave Database

  1. Install the postgresql-9.4 package.

  2. Update /etc/postgresql/9.4/main/postgresql.conf:

    Note that I am only addressing settings necessary for replication. You may well want to tune other parameters to suit your application.

    • Make sure the server is available over the internet:

      listen_addresses = 'localhost,10.0.0.2'
    • Ensure the slave can respond to read only queries:

      wal_level = hot_standby
      hot_standby = on
    • Prevent query conflicts:

      hot_standby_feedback = on
  3. Update /etc/postgresql/9.4/main/pg_hba.conf to look like this:

     local   all           postgres                 peer
     local   all           all                      peer
     host    all           all        127.0.0.1/32  md5
     host    all           all        ::1/128       md5
     host    postgres      pgpool     10.0.0.3/32   trust   # the client
     host    all           all        10.0.0.3/32   md5     # the client
    • The first four lines are the default configuration.

    • The ‘pgpool’ line is required because pgpool must be able to connect to the database to check the status (such as if this is a master or slave database). It needs very little access so can be restricted to just the postgres database. More on this later.

    • The last line requires that any other connection from the client is md5 authenticated.

  4. Create a pgpool database user.

  5. Stop postgresql.

  6. As the postgres user run the following command:

     cd ~/9.4
     rm -rf main
     pg_basebackup -v -D main -R -P -h 10.0.0.1

    Once that completes make sure that ~/9.4/main/recovery.conf looks very similar to this. Specifically that the slot name matches the name used above when creating the replication slot.

     standby_mode = 'on'
     primary_slot_name = 'theslave'
     primary_conninfo = 'user=postgres host=10.0.0.1 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
  7. Start postgresql.

The Client Machine

  1. Install the postgresql-client-9.4 and pgpool2 packages.

  2. Update /etc/pgpool2/pgpool.conf:

    • As this is the client machine only listen on the local interface:

      listen_addresses = ''
    • Set the port so PG clients will connect by default:

      port = 5432
    • Set the backends. The first must be the master. At this point we’re explicitly disabling failover. You can experiment with different weights to distribute the load.

      backend_hostname0 = '10.0.0.1'
      backend_port0 = 5432
      backend_weight0 = 1
      backend_flag0 = 'DISALLOW_TO_FAILOVER'
      
      backend_hostname1 = '10.0.0.2
      backend_port1 = 5432
      backend_weight1 = 1
      backend_flag1 = 'DISALLOW_TO_FAILOVER'
    • Use client authentication. Note this is pgpool authentication.

      enable_pool_hba = on
      pool_passwd = 'pool_password'
    • I personally disabled connection caching, but feel free to play with it:

      connection_cache = off
    • Disable replication:

      replication_mode = off
    • Disable replicated select queries:

      replicate_select = off
    • Enable load balancing:

      load_balance_mode = on
    • Enable master/slave mode and set mode to ‘stream’ (as opposed to say ‘slony’):

      master_slave_mode = on
      master_slave_sub_mode = 'stream'
    • Set the replication check and heart beat users to connect as. Note that this is mandatory even if you disable replication checks. Also note that this is the same pgpool we specified in the pg_hba.conf files above and why we had to create the pgpool user at all. If you get weird connection failures, this is the likely candidate.

      sr_check_user = 'pgpool'
      wd_lifecheck_user = 'pgpool'
  3. Update /etc/pgpool2/pg_hba.conf to include only this line so that all connections are authenticated.

     local all all md5
  4. Add any user that needs to connect from this client through pgpool to the database with the same password used in the master database to /etc/pgpool/pool_password. Here we assume our user is foo and password is bar.

     % pg_md5 -u foo -m bar
  5. Ensure that /etc/pgpool/pool_password is owned by postgres, has group postgres and permissions 0600.

  6. Restart pgpool.

Finishing Up

  1. Test.

  2. Test.

  3. Test.

What Next?

  1. Configure PostgreSQL replication failover with pgpool notification.

  2. Questions, corrections? Comment below.