Tuesday, September 22, 2009

Cloning Slony Nodes

Motivation


The new stable 2.0 branch of Slony-I has been out for a while now. Time to blog about one of my favorite new features there: cloning an existing node without running an initial SUBSCRIBE SET command.

Normally, administrators have to perform a subscription for each new node they want to add to their Slony-I cluster. This initial subscription procedure uses PostgreSQL's COPY command to copy over the whole data set and creates all indexes afterwards. This approach can take a fairly long time on bigger slaves, especially if you have many indexes on them.

PostgreSQL 8.4 introduces a new pg_restore feature, parallel restore (see the pg_restore -j command line option), which restores tables and creates indexes in parallel. If your machine has more than one CPU core (and enough disks, depending on your storage setup), this can speed up creation of the database considerably, which makes CLONE PREPARE interesting.
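
For comparison, the classic approach for a new node would boil down to a SUBSCRIBE SET along the lines of the following sketch (the set id 1 and the node ids are placeholders, preamble omitted):

subscribe set (id = 1, provider = 1, receiver = 3, forward = no);
# this is where the whole data set is copied via COPY and all indexes are rebuilt
wait for event (origin = 1, confirmed = all, wait on = 1);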

Cloning Procedure


First of all, we have to figure out which node is the source and which is the new target node. In Slony-I, a node identifies a specific database that takes part in the replication. Consider the following three nodes:

node 1: host=master.db dbname=movies
node 2: host=slave1.db dbname=movies
node 3: host=slave2.db dbname=movies

Node 1 is our origin, the master database node in the Slony-I cluster. Node 2 and node 3 host the slave database nodes (or subscribers, in Slony-I terms), where node 3 is going to be added to the cluster as a clone of node 2. We assume master.db and slave1.db are already set up with Slony-I.
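
For context, a minimal sketch of how such an existing two-node cluster might have been set up (the replication set id and the table public.films are made up for illustration; your schema will differ):

$ cat slony_setup.slonik
cluster name = movies;
node 1 admin conninfo = 'host=master.db dbname=movies';
node 2 admin conninfo = 'host=slave1.db dbname=movies';

init cluster (id = 1, comment = 'movies origin');
store node (id = 2, comment = 'movies subscriber', event node = 1);
store path (server = 1, client = 2, conninfo = 'host=master.db dbname=movies');
store path (server = 2, client = 1, conninfo = 'host=slave1.db dbname=movies');

create set (id = 1, origin = 1, comment = 'movies tables');
set add table (set id = 1, origin = 1, id = 1, fully qualified name = 'public.films');
subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);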

To start, we need a small slonik script which clones the node configuration of node 2 to our new node. The CLONE PREPARE command copies the node information and adjusts all required entries in the Slony-I catalog (e.g. adding the node to sl_path). Every slonik script needs a preamble, which defines the cluster name and the node connection information described above. To ease this task, I use a small file which can easily be included from other slonik scripts, e.g.:

$ cat slonik.inc
define cluster movies;
define conn_master 'host=master.db dbname=movies';
define conn_slave1 'host=slave1.db dbname=movies';
define conn_slave2 'host=slave2.db dbname=movies';
define master 1;
define slave1 2;
define slave2 3;

cluster name = @cluster;

node @master admin conninfo = @conn_master;
node @slave1 admin conninfo = @conn_slave1;
node @slave2 admin conninfo = @conn_slave2;

Any slonik script can use the INCLUDE directive to pull in the preamble. To clone slave1.db (node id 2) to slave2.db (node id 3), we then use the following script:

$ cat slony_clone_prepare.slonik

include <slonik.inc>;

## prepare cloning slave1.db (identified by node id 2)

echo ' => prepare cloning';

clone prepare (id = @slave2, provider = @slave1, comment = 'clone of slave1.db');

# Wait until all nodes have received the event
wait for event(origin = @slave1, confirmed = ALL, wait on = @slave1);

# Create a SYNC event on the origin and wait until the provider node has
# confirmed it. We do this to prevent the new node from missing any events
# created in the meantime.

sync(id=@master);
wait for event(origin=@master, confirmed=@slave1, wait on=@slave1);

echo ' => clone prepared';
echo ' now copy movies@slave1.db to movies@slave2.db';

To execute the script, pass it to slonik:

$ slonik < slony_clone_prepare.slonik
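
As a quick sanity check (not part of the official procedure), the new node id should now show up in the Slony-I configuration on the provider; the cluster schema is the cluster name prefixed with an underscore, so _movies here:

$ psql -h slave1.db movies -c "SELECT no_id, no_active, no_comment FROM _movies.sl_node ORDER BY no_id;"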

Now we can dump the database from node 2 and restore it to the new node 3. Create the database on the new node (don't forget to create the plpgsql language on it; it's required for Slony-I). Of course, you need to have Slony-I installed on this node, too. We are going to use three CPU cores to speed up the restore with pg_restore's parallel restore capability. On the new node, we execute the following commands:

$ su - postgres
$ createdb movies && createlang plpgsql movies
$ pg_dump -Fc -h slave1.db movies > movies.pg_dump
$ pg_restore -j3 -h slave2.db -d movies movies.pg_dump
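
Before finishing the clone, it doesn't hurt to verify (again just a sanity check of mine, not required by Slony-I) that the cluster schema and the plpgsql language really made it onto the new node:

$ psql -h slave2.db movies -c "SELECT nspname FROM pg_namespace WHERE nspname = '_movies';"
$ psql -h slave2.db movies -c "SELECT lanname FROM pg_language WHERE lanname = 'plpgsql';"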

After the database is restored, we have to finish the clone process with the CLONE FINISH command, implemented by the following script:

$ cat slony_clone_finish.slonik

include <slonik.inc>;

## finish cloning of slave1.db
echo ' => try to finish cloning slave1.db';
clone finish(id = @slave2, provider = @slave1);

# store connection path for the new node
store path(server = @master, client = @slave2, conninfo = @conn_master);
store path(server = @slave2, client = @master, conninfo = @conn_slave2);

echo ' => finished cloning slave1.db';

$ slonik < slony_clone_finish.slonik

This script calls CLONE FINISH first and then stores the connection paths for the new node in the cluster configuration. This is important; otherwise the slon daemons won't be able to talk to the new node. Now start the slon daemon for the new node:

$ slon movies 'host=slave2.db dbname=movies'

Depending on the platform you are on, you may need to adjust your runlevel scripts to start the slon daemon automatically on each boot (at least, don't forget to adjust the runlevel configuration). You also have to take care which user runs the slon daemon; you may have to specify a specific user in the connection string. The new node should immediately start to catch up with all pending events and replicate all data changes that occurred during the cloning procedure.
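
As an illustration, a daemon invocation with an explicit user and a quick look at the replication lag could look like this (the user name slony is an assumption, use whatever role owns the replication schema in your setup):

$ slon movies 'host=slave2.db dbname=movies user=slony'
$ psql -h master.db movies -c "SELECT st_origin, st_received, st_lag_num_events, st_lag_time FROM _movies.sl_status;"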

Conclusion


Cloning a node with CLONE PREPARE has the advantage that it enables the use of parallel restore to speed up building a new Slony-I slave. It also avoids additional load on the origin, since all I/O happens on the source and target nodes only. The new node becomes a complete clone of the source node, so it replicates exactly the same objects as the source node.

Friday, September 4, 2009

Building PostgreSQL Documentation on MacOS X

During the latest CommitFest I started to use Mac OS X while reviewing patches as a Round Robin Reviewer. I'm starting to love OS X and its shiny GUI, but I had serious problems building the PostgreSQL documentation on that platform. The core problem is that there's no really good DocBook distribution available for OS X (at least, I haven't found one that is usable out of the box). So I decided to go with MacPorts, since I had some very good experience with it in the past. Please note that I'm not a DocBook hacker...I'm just using it to build the PostgreSQL docs. So I started by installing openjade from MacPorts first:

$ sudo port install openjade

This installs openjade into /opt/local/share/sgml/openjade. Unfortunately, there's no DocBook SGML package in MacPorts, so I followed the PostgreSQL instructions and downloaded the following packages:

DocBook SGML DTD
DocBook DSSSL
ISO 8879 Entities

The ISO 8879 character entities can also be obtained from MacPorts:

$ sudo port install iso8879

This is what I preferred. I went on and installed the DocBook DTD and DocBook DSSSL packages as described in the PostgreSQL documentation. Instead of /usr/local/share/sgml, I used the directory /opt/local/share/sgml already created by MacPorts. Unzipping the downloaded archives gives the following layout:

drwxr-xr-x 36 root admin 1,2K 3 Sep 23:45 docbook/
drwxr-xr-x 21 root admin 714B 3 Sep 23:34 docbook-dsssl-1.79/
drwxr-xr-x 22 root admin 748B 3 Sep 14:10 iso8879/
drwxr-xr-x 12 root admin 408B 4 Sep 10:24 openjade/
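
For reference, the unpacking steps looked roughly like this (the archive file names and download location are assumptions, adjust them to the versions you actually downloaded):

$ cd /opt/local/share/sgml
$ sudo mkdir docbook
$ cd docbook && sudo unzip ~/Downloads/docbook-4.2.zip && cd ..
$ sudo tar -xzf ~/Downloads/docbook-dsssl-1.79.tar.gz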

Because the ISO character entities are located outside the docbook directories, I needed to adjust the catalog file /opt/local/share/sgml/docbook/docbook.cat. I removed the whole entity section from the file and added a CATALOG reference instead:

$ cat docbook/docbook.cat
[...]
-- ...................................................................... --
-- ISO entity sets ...................................................... --
CATALOG "../iso8879/catalog"

This references /opt/local/share/sgml/iso8879/catalog, which in turn includes all ISO character entities. Afterwards I ran PostgreSQL's configure; unfortunately, that didn't work. The first problem I encountered was that /opt/local/bin/collateindex.pl wasn't installed with executable rights:

-rw-r--r-- 2 root admin 17503 3 Sep 15:21 collateindex.pl

configure was not able to locate the script (so, what's the point of installing a script in a bin directory without the executable flag set?). Running chmod a+x /opt/local/bin/collateindex.pl fixed that issue. Another problem was that openjade wasn't able to find the DocBook DTDs. After trying around a little, I realized that openjade looks into /opt/local/share/sgml, not into /opt/local/share/sgml/openjade as one might expect. I created a symlink to openjade/catalog. At first this didn't work either: foolish as I am, I forgot that the catalog file references its objects with relative paths. Creating a separate catalog file /opt/local/share/sgml/catalog finally did the job:

$ cat /opt/local/share/sgml/catalog
PUBLIC "-//James Clark//DTD DSSSL Flow Object Tree//EN" "openjade/fot.dtd"
PUBLIC "ISO/IEC 10179:1996//DTD DSSSL Architecture//EN" "openjade/dsssl.dtd"
PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN" "openjade/style-sheet.dtd"
PUBLIC "-//OpenJade//DTD DSSSL Style Sheet//EN" "openjade/style-sheet.dtd"
SYSTEM "openjade/builtins.dsl" "openjade/builtins.dsl"
CATALOG "docbook/docbook.cat"
CATALOG "docbook-dsssl-1.79/catalog"

The directory layout I'm going to work with looks as follows:

-rw-r--r-- 1 root admin 541 4 Sep 10:22 catalog
drwxr-xr-x 18 root admin 612 4 Sep 15:03 docbook/
drwxr-xr-x 21 root admin 714 3 Sep 23:34 docbook-dsssl-1.79/
drwxr-xr-x 7 root admin 238 3 Sep 14:10 html/
drwxr-xr-x 22 root admin 748 3 Sep 14:10 iso8879/
drwxr-xr-x 12 root admin 408 4 Sep 10:24 openjade/

The HTML docs are finally built with the following commands:

$ eval ./configure $(pg_config --configure) DOCBOOKSTYLE=/opt/local/share/sgml/
$ cd doc/src/sgml; make

Leave a comment if you have additional suggestions or better ideas.