Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / Informix Topics / July 2008

Tip: Looking for answers? Try searching our database.

Fragmentation - your ideas?

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
iiug@perrior.net - 29 Jul 2008 17:01 GMT
(HP-UX 11i; IDS 9.30HC5; OLTP mainly)

We have a table, "cashbook", on the system which contains 50 million
rows at present; the entries are saved with a current status flag
denoting whether it's in one of several states (active, pending, not
active and so on).
Most of the rows (48.8 million) are in the not_active state but we
have to keep them on the system for historical lookup purposes.
We've currently got the table fragmented so that, broadly, the actives
(about 680k of them) are in one fragment and the not_actives and
similar are in the other; there is no "remainder" fragment, the values
are always clearly defined.
Most daytime queries will be for "active" records; there'll be a few
for the other states but only around 5-10% of the total queries. At
cashup time at end-of-day, end-of-fiscal-period and so on, there's a
bit of movement from 'active' to 'not_active' and some new 'actives'
are created but the proportions remain broadly similar all the time.
We find that this strategy works well as fragment elimination by
status means that queries for 'active' records only search a fragment
with 680k rows instead of the other 49 million.
Whilst idly considering a tidyup of the rest of the database, some
table moves to new dbspaces etc, I've been considering fragmenting the
table across 10 dbspaces in round-robin; is this worth doing or is it
best to leave it as is?
Jack Parker - 30 Jul 2008 13:30 GMT
I am puzzled as to how you would do round robin on 10 fragments and retain
your 'active flag' fragmentation on an 11th, or has the two level
fragmentation scheme finally made it to 9?

I was going to write a long diatribe disparaging your setup - on reflection
it sounds quite reasonable.  Your updates are at EOD, so the cost of
migrating data from one fragment to another is borne at that time instead of
during each transaction.  The concept of seperating historical data from
active data into separate tables and then constructing a view (or something)
is handled nicely by the fragmentation layout - so you've asked the engine
to handle the dirty work.  Nice.

I assume your index fragmentation scheme matches the data fragmentation
scheme so that queries for active rows are traversing a smaller index.  That
index is something that should be monitored and occasionally rebuilt.

Round robin will improve load performance and full table scan performance,
you will forfeit any fragment elimination for filtered queries.

j.

Sane ego te vocavi. Forsitan capedictum tuum desit.

-----Original Message-----
From: informix-list-bounces@iiug.org
[mailto:informix-list-bounces@iiug.org]On Behalf Of iiug@perrior.net
Sent: Tuesday, July 29, 2008 12:02 PM
To: informix-list@iiug.org
Subject: Fragmentation - your ideas?

(HP-UX 11i; IDS 9.30HC5; OLTP mainly)

We have a table, "cashbook", on the system which contains 50 million
rows at present; the entries are saved with a current status flag
denoting whether it's in one of several states (active, pending, not
active and so on).
Most of the rows (48.8 million) are in the not_active state but we
have to keep them on the system for historical lookup purposes.
We've currently got the table fragmented so that, broadly, the actives
(about 680k of them) are in one fragment and the not_actives and
similar are in the other; there is no "remainder" fragment, the values
are always clearly defined.
Most daytime queries will be for "active" records; there'll be a few
for the other states but only around 5-10% of the total queries. At
cashup time at end-of-day, end-of-fiscal-period and so on, there's a
bit of movement from 'active' to 'not_active' and some new 'actives'
are created but the proportions remain broadly similar all the time.
We find that this strategy works well as fragment elimination by
status means that queries for 'active' records only search a fragment
with 680k rows instead of the other 49 million.
Whilst idly considering a tidyup of the rest of the database, some
table moves to new dbspaces etc, I've been considering fragmenting the
table across 10 dbspaces in round-robin; is this worth doing or is it
best to leave it as is?
_______________________________________________
Informix-list mailing list
Informix-list@iiug.org
http://www.iiug.org/mailman/listinfo/informix-list
Zachi - 30 Jul 2008 14:42 GMT
> I am puzzled as to how you would do round robin on 10 fragments and retain
> your 'active flag' fragmentation on an 11th, or has the two level
[quoted text clipped - 54 lines]
> Informix-list mailing list
> Informix-l...@iiug.orghttp://www.iiug.org/mailman/listinfo/informix-list

don't change to round-robin: create new fragments with some expression
as to split the load of the inactive ones. Going by date (year?)  or
some other expression (tailored to your needs) should do fine. You
should have something like this:

fragment 1: status='active'
fragment 2: status<>'active' and year=2000
fragment 3: status<>'active' and year=2001

and so on. it will take a short while to rearrange the data, but it
will pay off eventually (I assume you have a performance issue to
begin with - otherwise, you can stay as you are).

Zachi
iiug@perrior.net - 30 Jul 2008 15:13 GMT
Hi to all
Thanks for the inputs, it's more or less what I was going to be doing
anyway; leave it as it is because it isn't broke!
There aren't any performance issues, there's not enough continuous
activity for that, but it needs a rebuild sooner or later because it's
taking up over 30 extents.
Actually, as the data itself is now 33Gb and as we're still on V9,
that means we can't expect to get below 17 extents anyway (2Gb limit);
I've posted before on why the company won't/haven't moved on from V9,
and I'm bored of banging my head on the wall over that one.
A 7-year (legal requirement) retention period means we can't dump the
old data and it would need a pretty comprehensive code redesign (and
subsequent test harnesses which we don't have the budget to even
write, let alone get testers to try to break all of the 1,000 or so
4ge's)  to break out the inactive data to a subsidiary table, so we're
stuck with keeping it all in the one table.

Yes of course the indexes are separate!
Jarrod Teale - 30 Jul 2008 21:16 GMT
If you do end up fragmenting on date, try to avoid the date cast in the
fragment expression - this is resource hungry.
If possible, add another column (year as in interger for example) and
during the insert, cast then. Then set the fragment expression on the
integer column. PDQ and scanning will all go much faster as there isn't
a per-row cast happening every time.

Jarrod

-----Original Message-----
From: informix-list-bounces@iiug.org
[mailto:informix-list-bounces@iiug.org] On Behalf Of iiug@perrior.net
Sent: Thursday, 31 July 2008 2:14 a.m.
To: informix-list@iiug.org
Subject: Re: Fragmentation - your ideas?

Hi to all
Thanks for the inputs, it's more or less what I was going to be doing
anyway; leave it as it is because it isn't broke!
There aren't any performance issues, there's not enough continuous
activity for that, but it needs a rebuild sooner or later because it's
taking up over 30 extents.
Actually, as the data itself is now 33Gb and as we're still on V9, that
means we can't expect to get below 17 extents anyway (2Gb limit); I've
posted before on why the company won't/haven't moved on from V9, and I'm
bored of banging my head on the wall over that one.
A 7-year (legal requirement) retention period means we can't dump the
old data and it would need a pretty comprehensive code redesign (and
subsequent test harnesses which we don't have the budget to even write,
let alone get testers to try to break all of the 1,000 or so
4ge's)  to break out the inactive data to a subsidiary table, so we're
stuck with keeping it all in the one table.

Yes of course the indexes are separate!

_______________________________________________
Informix-list mailing list
Informix-list@iiug.org
http://www.iiug.org/mailman/listinfo/informix-list

DISCLAIMER:
This email contains confidential information and may be legally privileged. If you are not the intended recipient or have received this email in error, please notify the sender immediately and destroy this email.
You may not use, disclose or copy this email or its attachments in any way.
Any opinions expressed in this email are those of the author and are not necessarily those of the Fonterra Co-operative Group.
http://www.fonterra.com/
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.