Motivation
The new stable branch 2.0 of Slony-I is out for a while now. Time to blog about one of my favorite new features there, cloning an existing node without doing an initial SUBSCRIBE command.
Normally administrators have to perform a subscription for each new node they want to add to their Slony-I cluster. This initial subscription procedure uses PostgreSQL's COPY command to copy over the whole dataset and creating all indexes afterwards. This whole approach can take a fairly long time on bigger slaves, especially if you have many indexes on them.
In PostgreSQL 8.4 there exists a new pg_restore method, parallel restore (see the pg_restore -j command line option). This allows to restore tables and indexes in parallel. If you have machines with more than one cpu core (and number of discs, depending on your storage setup), this can speedup creation of the database, thus making the CLONE PREPARE interesting.
Cloning Procedure
First at all, we have to figure out, which node is the source and which node is the new target node. In Slony-I, a node defines a specific database which takes part in the replication. Consider the following three nodes:
node 1: host=master.db dbname=movies
node 2: host=slave1.db dbname=movies
node 3: host=slave2.db dbname=movies
node 1 is our origin, the master database node in a Slony-I cluster. node 2 and node 3 hosts the slave database node (or subscriber in the Slony-I universe), where node 3 is going to be added to the cluster as a clone of node 2. We assume master.db and slave1.db are already set up with Slony-I.
To start over, we need a small slonik script, which clones the node info of node 2 to our new node. The CLONE PREPARE command copies the node information and adjusts all required node information in the Slony-I catalog (e.g. adding the node to sl_path). Every slonik script needs a preamble, which defines the cluster name and the node connection information described above. To ease this task, i use a small file which can be easily included from other slonik scripts, e.g.
$ cat slonik.inc
define cluster movies;
define conn_master 'host=master.db dbname=movies';
define conn_slave1 'host=slave1.db dbname=movies';
define conn_slave2 'host=slave2.db dbname=movies';
define master 1;
define slave1 2;
define slave2 3;
cluster name = @cluster;
node @master admin conninfo = @conn_master;
node @slave1 admin conninfo = @conn_slave1;
node @slave2 admin conninfo = @conn_slave2;
Any slonik script can use the INCLUDE directive to include the preamble. To clone slave1.db (node id 2) to slave2.db (node id 3) we use the following script then:
$ cat slony_clone_prepare.slonik
include <slonik.inc>;
## prepare cloning slave1.db (identified by node id 2)
echo ' => prepare cloning';
clone prepare (id = @slave2, provider = @slave1, comment = 'clone of slave1.db');
# Wait until all nodes have received the event
wait for event(origin = @slave1, confirmed = ALL, wait on = @slave1);
# Create a SYNC event and wait on every other node until they have confirmed.
# We will do this to prevent the new node to miss any events created
# in the meantime
sync(id=@master);
wait for event(origin=@master, confirmed=@slave1, wait on=@slave1);
echo ' => clone prepared';
echo ' now copy movies@slave1.db to movies@slave2.db';
To execute the script, pass it to slonik:
$ slonik < slony_clone_prepare.slonik
Now we can start and dump the database from node 2 to the new node 3. Create the database on the new node (don't forget to create pl/pgsql on it, it's required for Slony-I). Of course, you need to have Slony-I installed on this node, too. We are going to use three CPU cores to speedup the restore with pg_restore's parallel restore capability. On the new node, we are executing the following command:
$ su - postgres
$ createdb movies && createlang plpgsql movies
$ pg_dump -Fc -h slave1.db movies > movies.pg_dump
$ pg_restore -j3 -h slave2.db -d movies movies.pg_dump
After the database ist restored, we have to finish the clone process with CLONE FINISH command, implemented by the following script:
$ cat slony_clone_finish.slonik
include <slonik.inc>;
## finish cloning of slave1.db
echo ' => try to finish cloning slave1.db';
clone finish(id = @slave2, provider = @slave1);
# store connection path for the new node
store path(server = 1, client = 3, conninfo = @conn_master);
store path(server = 3, client = 1, conninfo = @conn_slave2);
echo ' => finished cloning slave1.db';
$ slonik < slony_clone_finish.slonik
This script calls CLONE FINISH first and stores the connection path for the new node in the cluster configuration. This is important, otherwise the slon daemon won't be able to talk to the new node. Now start the slon daemon for the new node:
$ slon movies 'host=slave2.db dbname=movies'
Depending on the platform you are, you may need to adjust your runlevel scripts to start the slon daemon automatically on each start (at least, don't forget to adjust the runlevel configuration). You also have to take care which user runs the slon daemon, you may have to specify a specific user in the connection string. The new node should immediately starting to catch up with all pending events and replicate all data changes which occured during the cloning procedure.
Conclusion
Cloning a node with CLONE PREPARE has the advantage that it enables the usage of parallel restore to speed up building a new Slony-I slave. It also avoids the additional load on the origin, since all I/O is located on the target and source node only. The new node becomes a complete clone of the source node, thus it replicates exactly the same objects as the source node.