Database Forum / General DB Topics / DB Theory / October 2008
Replication in databases
|
|
Thread rating:  |
M2Y - 01 Sep 2008 19:56 GMT Hello,
I have a fundamental doubt regarding replication in databases.
If the isolation level is set to Serializable, why cant we just ship the transaction statements from the master to the backup and replay those transactions. Why do we adopt complicated solutions like log shipping and all such kind of things for replicating databases.
Thanks, Srinivas
Bob Badour - 01 Sep 2008 21:06 GMT > Hello, > [quoted text clipped - 7 lines] > Thanks, > Srinivas The statements might arrive at the slave in a different order and the slave might serialize the transactions differently--for a start.
paul c - 02 Sep 2008 00:41 GMT >> Hello, >> [quoted text clipped - 10 lines] > The statements might arrive at the slave in a different order and the > slave might serialize the transactions differently--for a start. from my own experience, another reason is that people who are charged with implementing physical services don't think big (and often are ignorant of product semantics). can't remember how many times I've seen an unstated requirement for secondary mostly read-only db's that can tolerate being minutes or even hours out-of-date.
M2Y - 02 Sep 2008 05:05 GMT > > Hello, > [quoted text clipped - 10 lines] > The statements might arrive at the slave in a different order and the > slave might serialize the transactions differently--for a start. We can maintain an order of transactions, lets say, commit order and let the backup replay the transactions in that order.
David BL - 02 Sep 2008 05:12 GMT > > > Hello, > [quoted text clipped - 13 lines] > We can maintain an order of transactions, lets say, commit order and > let the backup replay the transactions in that order. Isn't that basically what log shipping does anyway?
Christopher Browne - 09 Sep 2008 05:51 GMT In the last exciting episode, M2Y <mailtoyahoo@gmail.com> wrote:
>> > Hello, >> [quoted text clipped - 13 lines] > We can maintain an order of transactions, lets say, commit order and > let the backup replay the transactions in that order. Well, yes, you can maintain an order, by some means.
That tends to impose requirements like log shipping, as you need some means to impose the order.
 Signature let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;; http://linuxdatabases.info/info/finances.html When sign makers go on strike, is anything written on their signs?
DBMS_Plumber - 05 Sep 2008 19:52 GMT > If the isolation level is set to Serializable, why cant we just ship > the transaction statements from the master to the backup and replay > those transactions. Why do we adopt complicated solutions like log > shipping and all such kind of things for replicating databases. Some things to think about:
1. Queries aren't the only way data is loaded into a database; bulk data loading, for example.
2. Query shipping (as it's called) requires that identical versions of the database schema are maintained on every node. Dropping or creating a trigger in one place but not all places will affect query semantics. Talking to DBAs, they don't want to do this, as one frequent use for replication is to maintain an analytic-queries-only copy of some operational store.
3. What happens if the data on the two nodes is different? Consider a many-to-one replication from a number of retail stores to a central warehouse. Queries on child-node A (which has a sub-set of the overall database) might affect a different (much larger) data set on the parent-node.
Folk who design replication schemes need to consider a broader range of deployment scenarios than the one you've described. Frankly, in your situation. query shipping would be a better solution (assuming you have no view, triggers, stored procedures etc). But log sniffing hits a broader set of the cases customers seem to care about.
Christoph Rupp - 16 Oct 2008 22:18 GMT I'm implementing my own database, and therefore i was thinking about the same question. (i have not yet implemented recovery, because there's so much other stuff to do).
Currently my idea is to send the log over the net, because in my design i do not log physical pages, instead i just log the modified key/record items.
Of course i could also send the queries, but that's way more complex to implement. Assume you want to insert a few items into your local database. The query is sent to the remote node, and in case of success it's also inserted locally. If you send the query to the remote node, but then don't receive an answer because your network switch exploded, you do not know which part of the query (if any) was already executed, and then recovery becomes difficult (and network problems happen very often).
If you send the log, recovery is easier because each log entry has a unique ID (lsn - log serial number) and can be applied atomically.
David BL - 17 Oct 2008 07:21 GMT > I'm implementing my own database, and therefore i was thinking about > the same question. (i have not yet implemented recovery, because [quoted text clipped - 3 lines] > design i do not log physical pages, instead i just log the modified > key/record items. I recall your post 6 months ago :)
Are your log records idempotent?
Christoph Rupp - 17 Oct 2008 13:44 GMT > > I'm implementing my own database, and therefore i was thinking about > > the same question. (i have not yet implemented recovery, because [quoted text clipped - 7 lines] > > Are your log records idempotent? Yes, that was me :)
however, 6 months ago i implemented logging for my first database - hamsterdb (hamsterdb.com). It's physical logging, writing modified database pages. hamsterdb is good for embedded devices, but has no concurrency and really bad support for transactions.
3 months ago i started writing hamsterdb2, which has a high level of concurrency, full support for transactions and record-level logging instead of page-level logging. and this is still work in progress.
but you can be sure that sooner or later i will continue asking on comp.databases.theory strange questions about different transaction isolation levels and other esoteric stuff :)
Christoph Rupp - 17 Oct 2008 14:15 GMT > > I'm implementing my own database, and therefore i was thinking about > > the same question. (i have not yet implemented recovery, because [quoted text clipped - 7 lines] > > Are your log records idempotent? btw - regarding idempotence - this is a challenge.
in case of physical logging, a simple insert usually modifies multiple pages (especially i.e. if the btree is split). so the replication host has to send several database pages, which is a huge overhead, and in case of a problem (i.e. network communication problem or a system crash on the replication node) the remote database will be corrupt if only one of multiple modified pages was applied.
in my design i avoid that problem because my index trees only have atomic operations. so if i.e. an insert operation is sent over the network, i know that it's either applied or not. there's no recovery needed, and my log records do not need to be idempotent.
this atomic design of my index tree may come at the cost of performance (compared to a btree, as i implemented it for hamsterdb), but i think i will be able to avoid performance problems by moving all writing index operations to a background thread.
now i just need about 12 months of vacation to implement all this :)
|
|
|