Database Forum / Informix Topics / July 2008
Fragmentation - your ideas?
|
|
Thread rating:  |
iiug@perrior.net - 29 Jul 2008 17:01 GMT (HP-UX 11i; IDS 9.30HC5; OLTP mainly)
We have a table, "cashbook", on the system which contains 50 million rows at present; the entries are saved with a current status flag denoting whether it's in one of several states (active, pending, not active and so on). Most of the rows (48.8 million) are in the not_active state but we have to keep them on the system for historical lookup purposes. We've currently got the table fragmented so that, broadly, the actives (about 680k of them) are in one fragment and the not_actives and similar are in the other; there is no "remainder" fragment, the values are always clearly defined. Most daytime queries will be for "active" records; there'll be a few for the other states but only around 5-10% of the total queries. At cashup time at end-of-day, end-of-fiscal-period and so on, there's a bit of movement from 'active' to 'not_active' and some new 'actives' are created but the proportions remain broadly similar all the time. We find that this strategy works well as fragment elimination by status means that queries for 'active' records only search a fragment with 680k rows instead of the other 49 million. Whilst idly considering a tidyup of the rest of the database, some table moves to new dbspaces etc, I've been considering fragmenting the table across 10 dbspaces in round-robin; is this worth doing or is it best to leave it as is?
Jack Parker - 30 Jul 2008 13:30 GMT I am puzzled as to how you would do round robin on 10 fragments and retain your 'active flag' fragmentation on an 11th, or has the two level fragmentation scheme finally made it to 9?
I was going to write a long diatribe disparaging your setup - on reflection it sounds quite reasonable. Your updates are at EOD, so the cost of migrating data from one fragment to another is borne at that time instead of during each transaction. The concept of seperating historical data from active data into separate tables and then constructing a view (or something) is handled nicely by the fragmentation layout - so you've asked the engine to handle the dirty work. Nice.
I assume your index fragmentation scheme matches the data fragmentation scheme so that queries for active rows are traversing a smaller index. That index is something that should be monitored and occasionally rebuilt.
Round robin will improve load performance and full table scan performance, you will forfeit any fragment elimination for filtered queries.
j.
Sane ego te vocavi. Forsitan capedictum tuum desit.
-----Original Message----- From: informix-list-bounces@iiug.org [mailto:informix-list-bounces@iiug.org]On Behalf Of iiug@perrior.net Sent: Tuesday, July 29, 2008 12:02 PM To: informix-list@iiug.org Subject: Fragmentation - your ideas?
(HP-UX 11i; IDS 9.30HC5; OLTP mainly)
We have a table, "cashbook", on the system which contains 50 million rows at present; the entries are saved with a current status flag denoting whether it's in one of several states (active, pending, not active and so on). Most of the rows (48.8 million) are in the not_active state but we have to keep them on the system for historical lookup purposes. We've currently got the table fragmented so that, broadly, the actives (about 680k of them) are in one fragment and the not_actives and similar are in the other; there is no "remainder" fragment, the values are always clearly defined. Most daytime queries will be for "active" records; there'll be a few for the other states but only around 5-10% of the total queries. At cashup time at end-of-day, end-of-fiscal-period and so on, there's a bit of movement from 'active' to 'not_active' and some new 'actives' are created but the proportions remain broadly similar all the time. We find that this strategy works well as fragment elimination by status means that queries for 'active' records only search a fragment with 680k rows instead of the other 49 million. Whilst idly considering a tidyup of the rest of the database, some table moves to new dbspaces etc, I've been considering fragmenting the table across 10 dbspaces in round-robin; is this worth doing or is it best to leave it as is? _______________________________________________ Informix-list mailing list Informix-list@iiug.org http://www.iiug.org/mailman/listinfo/informix-list
Zachi - 30 Jul 2008 14:42 GMT > I am puzzled as to how you would do round robin on 10 fragments and retain > your 'active flag' fragmentation on an 11th, or has the two level [quoted text clipped - 54 lines] > Informix-list mailing list > Informix-l...@iiug.orghttp://www.iiug.org/mailman/listinfo/informix-list don't change to round-robin: create new fragments with some expression as to split the load of the inactive ones. Going by date (year?) or some other expression (tailored to your needs) should do fine. You should have something like this:
fragment 1: status='active' fragment 2: status<>'active' and year=2000 fragment 3: status<>'active' and year=2001
and so on. it will take a short while to rearrange the data, but it will pay off eventually (I assume you have a performance issue to begin with - otherwise, you can stay as you are).
Zachi
iiug@perrior.net - 30 Jul 2008 15:13 GMT Hi to all Thanks for the inputs, it's more or less what I was going to be doing anyway; leave it as it is because it isn't broke! There aren't any performance issues, there's not enough continuous activity for that, but it needs a rebuild sooner or later because it's taking up over 30 extents. Actually, as the data itself is now 33Gb and as we're still on V9, that means we can't expect to get below 17 extents anyway (2Gb limit); I've posted before on why the company won't/haven't moved on from V9, and I'm bored of banging my head on the wall over that one. A 7-year (legal requirement) retention period means we can't dump the old data and it would need a pretty comprehensive code redesign (and subsequent test harnesses which we don't have the budget to even write, let alone get testers to try to break all of the 1,000 or so 4ge's) to break out the inactive data to a subsidiary table, so we're stuck with keeping it all in the one table.
Yes of course the indexes are separate!
Jarrod Teale - 30 Jul 2008 21:16 GMT If you do end up fragmenting on date, try to avoid the date cast in the fragment expression - this is resource hungry. If possible, add another column (year as in interger for example) and during the insert, cast then. Then set the fragment expression on the integer column. PDQ and scanning will all go much faster as there isn't a per-row cast happening every time.
Jarrod
-----Original Message----- From: informix-list-bounces@iiug.org [mailto:informix-list-bounces@iiug.org] On Behalf Of iiug@perrior.net Sent: Thursday, 31 July 2008 2:14 a.m. To: informix-list@iiug.org Subject: Re: Fragmentation - your ideas?
Hi to all Thanks for the inputs, it's more or less what I was going to be doing anyway; leave it as it is because it isn't broke! There aren't any performance issues, there's not enough continuous activity for that, but it needs a rebuild sooner or later because it's taking up over 30 extents. Actually, as the data itself is now 33Gb and as we're still on V9, that means we can't expect to get below 17 extents anyway (2Gb limit); I've posted before on why the company won't/haven't moved on from V9, and I'm bored of banging my head on the wall over that one. A 7-year (legal requirement) retention period means we can't dump the old data and it would need a pretty comprehensive code redesign (and subsequent test harnesses which we don't have the budget to even write, let alone get testers to try to break all of the 1,000 or so 4ge's) to break out the inactive data to a subsidiary table, so we're stuck with keeping it all in the one table.
Yes of course the indexes are separate!
_______________________________________________ Informix-list mailing list Informix-list@iiug.org http://www.iiug.org/mailman/listinfo/informix-list
DISCLAIMER: This email contains confidential information and may be legally privileged. If you are not the intended recipient or have received this email in error, please notify the sender immediately and destroy this email. You may not use, disclose or copy this email or its attachments in any way. Any opinions expressed in this email are those of the author and are not necessarily those of the Fonterra Co-operative Group. http://www.fonterra.com/
|
|
|