15. Database Schema Changes (DDL)

When changes are made to the database schema, e.g. - adding fields to a table, it is necessary for this to be handled rather carefully, otherwise different nodes may get rather deranged because they disagree on how particular tables are built.

If you pass the changes through Slony-I via EXECUTE SCRIPT (slonik) / schemadocddlscript_complete( integer, text, integer ) (stored function), this allows you to be certain that the changes take effect at the same point in the transaction streams on all of the nodes. That may not be so important if you can take something of an outage to do schema changes, but if you want to do upgrades that take place while transactions are still winding their way through your systems, this is necessary.

It is essential to use EXECUTE SCRIPT if you alter tables so as to change their schemas. If you do not, then you may run into the problems described here where triggers on modified tables do not take account of the schema change. This has the potential to corrupt data on subscriber nodes.

It's worth making a couple of comments on “special things” about EXECUTE SCRIPT:

Unfortunately, this nonetheless implies that the use of the DDL facility is somewhat fragile and fairly dangerous. Making DDL changes must not be done in a sloppy or cavalier manner. If your applications do not have fairly stable SQL schemas, then using Slony-I for replication is likely to be fraught with trouble and frustration. See the section on locking issues for more discussion of related issues.

There is an article on how to manage Slony-I schema changes here: Varlena General Bits

15.1.  Changes that you might not want to process using EXECUTE SCRIPT

While it is vitally necessary to use EXECUTE SCRIPT to propagate DDL modifications to tables that are being replicated, there are several sorts of changes that you might wish to handle some other way:

  • There are various sorts of objects that don't have triggers that Slony-I doesn't replicate, such as stored functions, and it is quite likely to cause you grief if you propagate updates to them associated with a replication set where EXECUTE SCRIPT will lock a whole lot of tables that didn't really need to be locked.

    If you are propagating a stored procedure that isn't used all the time (such that you'd care if it was briefly out of sync between nodes), then you could simply submit it to each node using psql, making no special use of Slony-I.

    If it does matter that the object be propagated at the same location in the transaction stream on all the nodes, then you but no tables need to be locked, then you need to use EXECUTE SCRIPT, locking challenges or no.

  • You may want an extra index on some replicated node(s) in order to improve performance there.

    For instance, a table consisting of transactions may only need indices related to referential integrity on the “origin” node, and maximizing performance there dictates adding no more indices than are absolutely needed. But nothing prevents you from adding additional indices to improve the performance of reports that run against replicated nodes.

    It would be unwise to add additional indices that constrain things on replicated nodes, as if they find problems, this leads to replication breaking down as the subscriber(s) will be unable to apply changes coming from the origin that violate the constraints.

    But it's no big deal to add some performance-enhancing indices. You should almost certainly not use EXECUTE SCRIPT to add them; that leads to some replication set locking and unlocking tables, and possibly failing to apply the event due to some locks outstanding on objects and having to retry a few times before it gets the change in. If you instead apply the index “directly” such as with psql, you can determine the time at which the table lock is introduced. Adding an index to a table will require an exclusive lock for the time it takes to build the index; that will implicitly stop replication, while the index builds, but shouldn't cause any particular problems. If you add an index on a table that takes 20 minutes to build, replication will block for 20 minutes, but should catch up quickly enough once the index is created.

  • Slony-I stores the “primary index” name in sl_table, and uses that name to control what columns are considered the “key columns” when the log trigger is attached. It would be plausible to drop that index and replace it with another primary key candidate, but changing the name of the primary key candidate would break things.

15.2.  Testing DDL Changes

A method for testing DDL changes has been pointed out as a likely “best practice.

You need to test DDL scripts in a non-destructive manner.

The problem is that if nodes are, for whatever reason, at all out of sync, replication is likely to fall over, and this takes place at what is quite likely one of the most inconvenient times, namely the moment when you wanted it to work.

You may indeed check to see if schema scripts work well or badly, by running them by hand, against each node, adding BEGIN; at the beginning, and ROLLBACK; at the end, so that the would-be changes roll back.

If this script works OK on all of the nodes, that suggests that it should work fine everywhere if executed via slonik. If problems are encountered on some nodes, that will hopefully allow you to fix the state of affairs on those nodes so that the script will run without error.

Warning

If the SQL script contains a COMMIT; somewhere before the ROLLBACK; , that may allow changes to go in unexpectedly.