Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / DB2 Topics / January 2006

Tip: Looking for answers? Try searching our database.

Query optimizing problem

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Bernd Hohmann - 09 Jan 2006 19:49 GMT
Maybe someone has an idea for this:

table "customer_article" contains prices for every article by customer
and has about 400.000 now.

customer char(5)
article  char(5)
price    double

There is a table "customer_article_bak" with the same structure but
contains the situation before the data update.

Now I like a report like "Show me all new articles but only if the
customer exists in the old table" (means: only existing customers get a
hint which article are new to them).

My first approach is doing something, but after 1 hour I stopped the query.

select new.* from customer_article as new
  full outer join customer_article_bak as old
    on new.customer=old.customer
  where old.article IS null

Any ideas how to speed this up?

Bernd

Signature

In den USA war die letzte Zigarette, die man dem zum Tode Verurteilten
gewährte, eine Tradition, ein angestammtes Recht, bis vor kurzem der
Gouverneur von Alabama diese verbot - und zwar aus gesundheitlichen Gründen.

richard_mccutcheon@hcm.honda.com - 09 Jan 2006 20:22 GMT
How does it behave if you use an inner join instead of an outer join
(your WHERE clause will turn the outer join into an inner join
anyway.......) ?

Also, the classic questions are:

1.  what indexes do you have?
2.  have you run reorg and runstats lately?
3.  are you table scanning?

Regards,
Richard M.
Bernd Hohmann - 09 Jan 2006 22:20 GMT
> How does it behave if you use an inner join instead of an outer join

0 Records found

> Also, the classic questions are:
>
> 1.  what indexes do you have?

PRIMARY KEY ("ARTICLE","CUSTOMER") on both tables

> 2.  have you run reorg and runstats lately?

db2 reorg table cmm.customer_article

db2 runstats on table cmm.customer_article with distribution and
detailed indexes all shrlevel change

Yes...

> 3.  are you table scanning?

If you like to flagellate yourself:

http://www.harddiskcafe.de/db2/sql.txt has the original table
definition, the query and the "optimized" query.

http://www.harddiskcafe.de/db2/sql.jpg is the explain graph.

So if you ask for table scanning: yes, the beast does a lot of it <g>

Bernd
Art S. Kagel - 09 Jan 2006 21:38 GMT
> Maybe someone has an idea for this:
>
[quoted text clipped - 20 lines]
>
> Any ideas how to speed this up?

How about:

select new.*
from customer_article as new
where exists (
    select 1
    from customer_article_bak as old
    where old.customer = new.customer
);

OR even better:

select new.*
from customer_article as new
where customer in (
    select unique old.customer
    from customer_article_bak as old
);

Art S. Kagel
Bernd Hohmann - 09 Jan 2006 22:05 GMT
>> Any ideas how to speed this up?
>
[quoted text clipped - 7 lines]
>     where old.customer = new.customer
> );

Works great - it pumps zillions of rows on the monitor but doesn't
answer the question "which article is in customer_article but not in
customer_article_bak" :)

Bernd
Art S. Kagel - 10 Jan 2006 02:32 GMT
>>> Any ideas how to speed this up?
>>
[quoted text clipped - 11 lines]
> answer the question "which article is in customer_article but not in
> customer_article_bak" :)

Oh, but your original post says:
"Show me all new articles but only if the customer exists in the old table"

Misinterpreted 'new' to mean articles in the 'new' table.  In that case:

select new.*
from customer_article as new
where exists (
    select 1
    from customer_article_bak as old
    where old.customer = new.customer
) and not exists (
    select 1
    from customer_article_bak as old
    where old.customer = new.customer
      and old.article = new.article
);

Art S. Kagel
Phil Sherman - 10 Jan 2006 13:39 GMT
The first "exists" clause is potentially an issue because there is no
easy way to verify that a specific customer exists. This should cause
either a scan and sort of both tables (possibly one table and 1 index)
or multiple scans of customer_article_bak.

Try adding an index on customer_article_bak(customer).

Why do you want to notify the customers only when there are new items?
From an application view, what do you do when an item is eliminated or
the price is changed?

If your application accesses the data by customer, then you may want to
 add an index customer_article(customer,article)

If there are really only three columns in the table, then you can carry
the price as additional data in the index. This increates the size of
the index, will slow updates to price, and will support index only
access to the data, improving read performance.

Phil Sherman

>>>> Any ideas how to speed this up?
>>>
[quoted text clipped - 31 lines]
>
> Art S. Kagel
Bernd Hohmann - 10 Jan 2006 14:16 GMT
> Try adding an index on customer_article_bak(customer).

This index exists [customer,article] so the index is ordered by customer
first.

> Why do you want to notify the customers only when there are new items?
> From an application view, what do you do when an item is eliminated or
> the price is changed?

In reality the table contains prices, rebaites, conditions for volume
sales and so on. Items are eliminated 2 years after the article is
withdrawn, after this time nobody needs to know about this (decision
made by my customer).

Bernd
Bernd Hohmann - 10 Jan 2006 14:19 GMT
[...]

> Misinterpreted 'new' to mean articles in the 'new' table.  In that case:

After 1 hour runtime without any result I stopped the query.

Any other ideas to speedup this stuff?

A very primitive variant will be: exporting both tables ordered by
customer,article in two sequential files and comparing them by reading
line for line - which will take some minutes but not hours like DB2 needs.

Serge, Knut?

Bernd

Signature

In den USA war die letzte Zigarette, die man dem zum Tode Verurteilten
gewährte, eine Tradition, ein angestammtes Recht, bis vor kurzem der
Gouverneur von Alabama diese verbot - und zwar aus gesundheitlichen Gründen.

--CELKO-- - 10 Jan 2006 16:54 GMT
>> There is a table "customer_article_bak" with the same structure but
contains the situation before the data update. <<

Your design flaw is called "attribute splitting"; you have the same
entity in more than one table, where the tables are the values of some
attribute -- this looks like purchase date.  You also cannot use
floating point numbers for money, you have no keys, etc, You are
mimicking a tape file, not an RDBMS.

Please post real DDL in the future; my guess is that youy meant:

CREATE TABLE CustomerArticles
(customer_id CHAR(5) NOT NULL,
article_id CHAR(5) NOT NULL,
purchase_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
article_price DECIMAL (7,2) NOT NULL
  CHECK (article_price > 0.00),
PRIMARY KEY(customer_id, article_id, purchase_date));

>> Now I like a report like "Show me all new articles but only if the customer exists in the old table" (means: only existing customers get a hint which article are new to them). <<

SELECT C1.customer_id, C1.article_id AS new_article
 FROM CustomerArticles AS C1
WHERE NOT EXISTS
     (SELECT *
        FROM CustomerArticles AS C2
       WHERE C1.customer_id = C2.customer_id
         AND C1.article_id = C2.article_id
         AND C1.purchase_date > C2.purchase_date);
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.