> Thank you, Fred.
> Nothing helps in my situation.Obviosly this is DB2 Optimizer mistake.
> Microsot SQL Server handles my situations much better.

Signature
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Hi Serge,
Let me explain my situation.
I am working on OLAP project and my datamart is located on Microsoft SQL
Server.
I have moved this datamart to DB2 (UNIX) and Iwas impressed about
performance for ETL which loads data to DB2.
But my second step was to LOAD my OLAP database which is for DB2 means:
SELECT a,b,c,d,.....FROM View
- 60 millions rows, about 15 GB of data , takes forever!
( this process takes about 1hour 20 min with MS SQL Server.)
View = SELECT A (55millions row table) JOIN B (3 millions row tabel) ON
A.B_PK_CLUSTER = B.B_PK_CLUSTER
JOIN C (1 million rrow table) ON A.C_PK_CLUSTER =
C.C_PK_CLUSTER
LEFT JOIN D (small) ON PK_CLUSTER.......
Plan shows that DB2 TABLSCAN table A --> SORT it by B_PK_CLUSTER field and
HASH it to TABLESCANED table B!!
even when I add FETCH FIRST 1 ROW ONLY - takes forever!!
I changed my View like this:
SELECT a,b,c,d,.....FROM tableA ( just fact table)
and I loaded my OLAP in 15 min !! ( 60 millions rows, about 15 GB of
data ) - Fantastic.
What does it mean? I think it means I have no problem with OLAP,
network,drivers, etc.
So I tryed this ( only 2 tables) :
SELECT a,b,c, ......
FROM A (55millions row table) JOIN B (3 millions row tabel) ON
A.B_PK_CLUSTER = B.B_PK_CLUSTER
Plan is still ugly - 2 TABLESCANS - very slow. I think you may create this
situation very easy.
IF I add WHERE clause to narrow table A to let say 1 million rows - plan is
good, DB2 accept indexes, works fast.
What does it mean? I think it means I do have indexes and they are good.
After RANSTATS on just B_PK_CLUSTER field on table A .
My next SELECT selects ONLY fields from table A and ONLY PK from table B.:
SELECT A.f1, A.f2, A.B_PK_CLUSTER , B.B_PK_CLUSTER
FROM A (55millions row table) JOIN B (3 millions row tabel) ON
A.B_PK_CLUSTER = B.B_PK_CLUSTER
Plan is perfect, DB2 uses indexes , very fast.
But If I add to SELECT above some another fields from table B - plan again
becomes TABLESCAN type, very slow.
I tryed use Functions instead of JOINs -- it improves situation but still
very slow - about 6 hour to load my cube.
I have to know what to expect when I will need to load 100 million row, 200,
300.
Limits, thresholds, buffer sizes, pools, memory, CPUs, etc...$$$$?
How to hint to DB2 to use this particular index or that type of Join?
Please advice.
Thanks,
Timur.
> > Thank you, Fred.
> > Nothing helps in my situation.Obviosly this is DB2 Optimizer mistake.
[quoted text clipped - 12 lines]
> DB2 SQL Compiler Development
> IBM Toronto Lab
Serge Rielau - 30 May 2005 16:38 GMT
> Hi Serge,
> Let me explain my situation.
[quoted text clipped - 59 lines]
> Thanks,
> Timur.
How is the filtering of these joins?
Keep in mind that DB2 has to go get the row from the datapage to get
those extra columns.
Are you aware of the INCLUDE clause on unique indices?
To use it with a primary key CREATE UNIQUE INDEX .... ON T(pk) INCLUDE
(....)
Then: ALTER TABLE .. ADD PRIMARY KEY...
"An existing index <blah> has been reused"
If you can't use INCLUDE columns you can play a game:
First do the JOINS without pulling any of the secondary columns up. Pull
only the primary keys.
Then, when all the filtering is done (an dyou're down to 1M rows, join
back to the tables to pull up the extra, non indexed columns.
Possibly it's sufficient to do this trick only for the 55M table.
Again, if you would post the plan it would be easy to see the filtering
and WHY DB2 chooses the plan it does.
Cheers
Serge

Signature
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab