One of the new features for 1.1 is the ability to serialize the updates to go out into log files that can be kept in a spool directory.
The spool files could then be transferred via whatever means was desired to a “slave system,” whether that be via FTP, rsync, or perhaps even by pushing them onto a 1GB “USB key” to be sent to the destination by clipping it to the ankle of some sort of “avian transport” system.
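For example (a sketch only; the archive path, destination host, and scheduling are hypothetical), a periodic rsync is often all the transfer step needs to be:

# Push newly generated spool files to the destination system, removing them
# locally once they have been delivered.  All names here are made up.
rsync -av --remove-source-files /var/spool/slony1/archive/ standby:/var/spool/slony1/incoming/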
There are plenty of neat things you can do with a data stream in this form, including:
Replicating to nodes that aren't securable
Replicating to destinations where it is not possible to set up bidirectional communications
Supporting a different form of PITR (Point In Time Recovery) that filters out read-only transactions and updates to tables that are not of interest.
If some disaster strikes, you can look at the logs of queries in detail
This makes log shipping potentially useful even though you might not intend to actually create a log-shipped node.
This is a really slick scheme for generating load for tests
We have a data “escrow” system that would become much cheaper given log shipping
You may apply triggers on the “disconnected node” to do additional processing on the data
For instance, you might take a fairly “stateful” database and turn it into a “temporal” one by use of triggers that implement the techniques described in Developing Time-Oriented Database Applications in SQL by Richard T. Snodgrass.
14.1. Where are the “spool files” for a subscription set generated?
Any slon subscriber node can generate them by running its slon with the archive directory (-a) option. Note that this implies that, in order to use log shipping, you must have at least one subscriber node.
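For instance (a sketch only; the cluster name, archive path, and connection string below are made up), the subscriber's slon might be started like this:

# -a names the directory where this slon writes its SYNC archive (spool) files.
slon -a /var/spool/slony1/archive T1 'dbname=subscriberdb host=subscriberhost'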
14.2. What takes place when a MOVE SET or FAILOVER takes place?
Nothing special. So long as the archiving node remains a subscriber, it will continue to generate logs.
14.3. What if we run out of “spool space”?
The node will stop accepting SYNCs until this problem is alleviated.
14.4. How do we set up a subscription?
The script slony1_dump.sh, found in the tools directory of the Slony-I distribution, dumps the present state of a subscriber node. You need to start the slon for the subscriber node with log archiving turned on. At any point after that, you can run slony1_dump.sh, which will pull the state of that subscriber as of some SYNC event; the SYNC archives generated from that point onwards can then be applied, in order, on top of the dump.
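As a rough sketch of the sequence (the slony1_dump.sh arguments, archive file naming, and all database and path names below are assumptions, not a documented recipe):

# The subscriber's slon is already running with -a, so SYNC archives are
# accumulating in the spool directory while the dump runs.

# Dump the present state of the subscriber; check the script's usage comment
# for its exact arguments -- the form shown here is a guess.
./slony1_dump.sh subscriberdb T1 > offline_dump.sql

# Load the dump into the offline database, then apply, in order, the SYNC
# archives generated since the dump started (file name pattern assumed; see
# the notes later in this section about verifying each archive first).
psql -d offline_replica -f offline_dump.sql
for f in /var/spool/slony1/archive/slony1_log_*.sql; do
    psql -d offline_replica -f "$f"
done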
14.5. What are the limitations of log shipping?
In the initial release, there are rather a lot of limitations. As releases progress, hopefully some of these limitations may be alleviated/eliminated.
The log shipping functionality amounts to “sniffing” the data applied at a particular subscriber node. As a result, you must have at least one “regular” node; you cannot have a cluster that consists solely of an origin and a set of “log shipping nodes”.
The “log shipping node” tracks the entirety of the traffic going to a subscriber. You cannot separate things out if there are multiple replication sets.
The “log shipping node” presently only fully tracks SYNC events. A number of other event types are handled in such a way that log shipping copes with them; the rest are not fully supported.
It would be nice to be able to turn a “log shipped” node into a fully communicating Slony-I node that you could failover to. This would be quite useful if you were trying to construct a cluster of (say) 6 nodes; you could start by creating one subscriber, and then use log shipping to populate the other 4 in parallel. This usage is not supported, but presumably one could add the Slony-I configuration to the node, and promote it into being a new node. Again, a Simple Matter Of Programming (that might not necessarily be all that simple)...
Here are some more-or-less disorganized notes about how you might want to use log shipping...
You don't want to blindly apply SYNC files, because any given SYNC file may not be the right one. If it's wrong, then the result will be that the call to setsyncTracking_offline() will fail, your psql session will ABORT, and it will then run through the remainder of that SYNC file looking for a COMMIT or ROLLBACK so that it can try to move on to the next transaction.
But we know that the entire remainder of the file will fail! It is futile to go through the parsing effort of reading the remainder of the file.
Better idea:
Read the first few lines of the file, up to and including the setsyncTracking_offline() call.
Try to apply it that far.
If that failed, then it is futile to continue; ROLLBACK the transaction, and perhaps consider trying the next file.
If the setsyncTracking_offline() call succeeds, then you have the right next SYNC file, and should apply it. You should probably ROLLBACK the transaction, and then use psql to apply the entire file full of updates.
In order to support the approach of grabbing just the first few lines of the sync file, the format has been set up so that the “header” material ends with a marker line:
-- Slony-I log shipping archive
-- Node 11, Event 656
start transaction;
select "_T1".setsyncTracking_offline(1, '655', '656', '2005-09-23 18:37:40.206342');
-- end of log archiving header
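Putting those steps together, here is a minimal loader sketch (not part of Slony-I; the database name, error handling, and the marker text, copied from the example above, are all assumptions):

#!/bin/sh
# apply_next_sync.sh -- apply one SYNC archive "header first" (sketch only).
DB=offline_replica          # hypothetical target database
FILE=$1                     # candidate SYNC archive file

# Grab everything up to and including the header marker line.
HEADER=$(sed -n '1,/^-- end of log archiving header/p' "$FILE")

# Try just the header; if setsyncTracking_offline() rejects it, this is not
# the right next SYNC file.  ON_ERROR_STOP makes psql exit non-zero on error.
if printf '%s\nrollback;\n' "$HEADER" | psql -d "$DB" -v ON_ERROR_STOP=1 >/dev/null 2>&1
then
    # Right file: the trial transaction was rolled back above, so now apply
    # the whole archive in one go.
    psql -d "$DB" -v ON_ERROR_STOP=1 -f "$FILE"
else
    echo "skipping $FILE: not the next SYNC in sequence" >&2
    exit 1
fi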
Note that the header includes a timestamp indicating SYNC time.
-- Slony-I log shipping archive
-- Node 11, Event 109
start transaction;
select "_T1".setsyncTracking_offline(1, '96', '109', '2005-09-23 19:01:31.267403');
-- end of log archiving header
This timestamp represents the time at which the SYNC was issued on the origin node.
The value is stored in the log shipping configuration table sl_setsync_offline.
If you are constructing a temporal database, this is likely to represent the time you will wish to have apply to all of the data in the given log shipping transaction log.
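If you want that timestamp on the log shipped side, for instance to stamp rows in a temporal schema, one quick (and admittedly fragile) way to pull it out of an archive header might be the following; the file name is hypothetical and the pattern assumes the header format shown above:

# Print the SYNC timestamp (the last quoted argument of setsyncTracking_offline).
sed -n "s/.*setsyncTracking_offline(.*, '\([^']*\)');.*/\1/p" slony1_log_archive.sql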
You may find information on how relevant activity is logged in Section 23.4.1, “ Log Messages Associated with Log Shipping ”.