Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / DB2 Topics / April 2006

Tip: Looking for answers? Try searching our database.

Help performace question DB2 UDB v 8.1.9 Linux

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Bob Stearns - 11 Apr 2006 20:36 GMT
The following query runs very slowly compared to other queries on my
system. If I read the explain plan and its data correctly, it is due to
a table scan on animals caused by the join separated from the others by
space. There is an index on both dam_bhid and recip_bhid. I have tried
converting the on condition to a case statement to no good effect.
Likewise reordering the conditions. Do you have any suggestions.

The semantics of the join is that an1 has a calf on the ground if there
is a calf who she is the dam of when there is no recipient for that calf
or there is a calf of which she was the recipient. A calf is less than
283 days old and unweaned.

        select
                an1.bhid,
                an1.breed_1,
                an1.pct_1,
                ap_donor.herd_id                        as donor_id,
                ap_sire.herd_id                         as sire_id,
                ap_dam.herd_id                          as dam_id,
                loc.namex                               as loc_name,
                lh.start_date                           as as_of_date,
                case
                    when wn.WEIGH_DATE is null
                        then ap_calf.herd_id
                    else ''
                end                                     AS calf_id,
                case
                    when wn.WEIGH_DATE is null
                        then char(an_calf.BIRTH_DATE, usa)
                    else ''
                end                                  AS calf_birth_date,
                case
                    when wn.weigh_date is null
                        then an_calf.sex
                    else ''
                end                                     AS calf_sex,
                ap_calf_sire.herd_id                    as calf_sire_id,
                col_def.COLOR_DESC                      as calf_color,
                char((select max(act_recov_date)
                        from $schema.flushes
                       WHERE donor_bhid=an1.bhid), usa)
                    AS last_flush_date,
                char(proj_mate.BREED_TO_DATE, usa)as proj_breed_date,
                proj_mate_detail.herd_id          as proj_breed_mate

        from $schema.animals                            an1
        left outer
        join $schema.ANIMALS_PRIV                       ap_donor
                 on ap_donor.bhid = an1.BHID
                and ap_donor.herd_owner_id = $empid
        left outer
        join $schema.animals_priv                       ap_sire
                 on ap_sire.bhid = an1.SIRE_BHID
                and ap_sire.herd_owner_id = $empid
        left outer
        join $schema.ANIMALS_PRIV                       ap_dam
                 on ap_dam.bhid = an1.DAM_BHID
                and ap_dam.herd_owner_id = $empid
        left outer
        join $schema.locations_hist                     lh
                 on an1.bhid = lh.bhid
                and lh.end_date is null
        left outer
        join $schema.locations                          loc
                 on loc.loc_id = lh.loc_id

-- The problem is here

        left outer
        join $schema.animals                            an_calf
                 on days(current_date) -
                    days(an_calf.birth_date) < 280
                and an_calf.ACTIVEX = 'Y'
                and ((an_calf.dam_bhid = an1.bhid AND
                      an_calf.recip_bhid = 0)
                     OR an_calf.RECIP_BHID = an1.bhid)

-- end problem

        left outer
        join $schema.animals_priv                       ap_calf
                 on ap_calf.HERD_OWNER_ID = $empid
                and ap_calf.bhid = an_calf.bhid
        left outer
        join $schema.weaning                            wn
                 on wn.bhid = an_calf.bhid
        left outer
        join $schema.ANIMALS_PRIV                       ap_calf_sire
                 on an_calf.SIRE_BHID=ap_calf_sire.bhid
                and ap_calf_sire.herd_owner_id = $empid
        left outer
        join $schema.COLOR_DEFN                         col_def
                 on col_def.BH_COLOR_CODE=an_calf.COLOR
        left outer
        join $schema.ANIMALS_PRIV_$empid                proj_mate
                 on proj_mate.bhid = an1.bhid
        left outer
        join $schema.ANIMALS_PRIV_$empid                proj_mate_detail
                 on proj_mate_detail.BHID=
                    proj_mate.BREED_TO_BHID
        where
            an1.BHID in
                (select bhid
                   from $schema.animal_sets
                  where set_name = '$setname'
                    and userid='$setuser')
Serge Rielau - 12 Apr 2006 05:21 GMT
>         left outer
>         join $schema.animals                            an_calf
[quoted text clipped - 4 lines]
>                       an_calf.recip_bhid = 0)
>                      OR an_calf.RECIP_BHID = an1.bhid)
There are two potential problems. The first is the OR which pretty much
makes teh last AND predicate useless for index exploitation.
The second is the expression on days which disallows index exploitation
on birth_date and, worse, forces an NLJOIN.
What about a small rewrite:
an_calf.birthdate > current date - 280 days
Now DB2 has a clear startkey for an index scan over birth date.

Cheers
Serge

Signature

Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Bob Stearns - 12 Apr 2006 18:12 GMT
>>         left outer
>>         join $schema.animals                            an_calf
[quoted text clipped - 15 lines]
> Cheers
> Serge

Thank you. Creating the index and changing the query as you indicated
changed the time from over a minute to less than 2 seconds.

How could I convince the optimizer to do what I would do in "procedural
code", which uses the dam_bhid and recip_bhid indices to minimize the
time required (in no actual language that I know of):
    found = -1
    while (SELECT * FROM animals WHERE dam_bhid=an1.bhid) {
    if(recip_bhid=0} continue
    if(activex<>'Y') continue
    if(birth_date < current_date-280 days) continue
    found = bhid
    break
    }
    if(found=-1) {
    while (SELECT * FROM animals WHERE recip_bhid=an1.bhid) {
       if(activex<>'Y') continue
       if(birth_date < current_date-280 days) continue
       found = bhid
       break
        }
    }
    if(found=-1) {nomatch)
    else (matched the current record, with bhid=found}
Serge Rielau - 12 Apr 2006 20:00 GMT
>>>         left outer
>>>         join $schema.animals                            an_calf
[quoted text clipped - 22 lines]
> code", which uses the dam_bhid and recip_bhid indices to minimize the
> time required (in no actual language that I know of):

CREATE FUNCTION purplecow(...) RETURNS VARCHAR(10)
READS SQL DATA
BEGIN ATOMIC
>     found = -1
DECLARE found INTEGER;
>     while (SELECT * FROM animals WHERE dam_bhid=an1.bhid) {
>     if(recip_bhid=0} continue
[quoted text clipped - 3 lines]
>     break
>     }
SET found = (SELECT bhid FROM animals
  WHERE dam_bhid=an1.bhid
    AND recip_bhid <> 0
    AND activex='Y'
    AND birth_date >= current_date-280 days
FETCH FIRST ROW ONLY)
>     if(found=-1) {
>     while (SELECT * FROM animals WHERE recip_bhid=an1.bhid) {
[quoted text clipped - 4 lines]
>         }
>     }
SET found = (SELECT bhid FROM animals
  WHERE recip_bhid=an1.bhid
    AND activex = 'Y'
    AND birth_date >= current_date - 280 days
    AND found IS NULL
 FETCH FIRST ROW ONLY);
>     if(found=-1) {nomatch)
>     else (matched the current record, with bhid=found}
RETURN CASE WHEN found IS NULL THEN 'No match'
            ELSE CHAR(found) END;
END
@

VALUES purplecow();

If you want to take the optimizer out of the equation SQL Functions
rock. Need to be careful with DPF, but in serial they contribute their
fair share to our TPC-C results.

Cheers
Serge
Signature

Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Bob Stearns - 13 Apr 2006 00:44 GMT
> CREATE FUNCTION purplecow(...) RETURNS VARCHAR(10)
> READS SQL DATA
[quoted text clipped - 29 lines]
> Cheers
> Serge
Could I easily return the entire row found rather than its key? It would
dave 1 join. How do I use it in the middle of a long JOIN sequence?

from an1
.
.
.
left outer join values purplecow(an1.bhid)
      as tx(bhid)
      on tx.bhid is not null
left outer join animals an_calf
      on an_calf.bhid=tx.bhid
.
.
.

is my best guess, but I'm sure that's not quite right
Serge Rielau - 13 Apr 2006 07:49 GMT
>> CREATE FUNCTION purplecow(...) RETURNS VARCHAR(10)
>> READS SQL DATA
[quoted text clipped - 46 lines]
>
> is my best guess, but I'm sure that's not quite right
If you want to return the entire row you sue a TABLE function.
Simply change the function to:
RETURNS TABLE (pk INT, C1 INT, ...)

and the return to:
RETURN VALUES (......);
..
of course you will also have to extend the SET statements
SET (...) = (SELECT ....)

take a look here for examples:
http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0411rielau/

Cheers
Serge

Signature

Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.