Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / Oracle / Oracle Server / August 2005

Tip: Looking for answers? Try searching our database.

group by clause filling up temp space

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
ewong - 30 Aug 2005 20:11 GMT
I have this query which filled up 12GB of temp space.  tableA has 500K
records where tableB has 5M records.  sid are indexed.  The 2 tables
combine only 200 mega bytes and yet this query needs more than 12GB of
temp space to perform.  I am thinking it's the multiple group by that
is causing trouble?  Is there any workaround on this?  Thanks.

SELECT a.id, a.type1, a.type2, count(b.sid)
FROM tableA a, tableB b
WHERE a.sid = b.sid
and b.status = 'Y'
GROUP BY a.id, a.type1, a.type2;

The database is 9.2 EE.  The involved tables are analyzed.
Daniel Fink - 30 Aug 2005 20:25 GMT
Add AND 0 = 1 to the predicate...

Seriously, it is hard to say. At the very least post a full explain
plan (including bytes/card column values).

Is this the only query on the system?
ewong - 30 Aug 2005 20:35 GMT
Here is the explain plan.  I've tried 0=1 it doesn't help.  And yeah
this is the only query that fills up temp.  Thanks.

Execution Plan
----------------------------------------------------------
  0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3162293791 Card=1847
         2014 Bytes=664992504)

  1    0   SORT (GROUP BY) (Cost=3162293791 Card=18472014 Bytes=66499
         2504)

  2    1     HASH JOIN (Cost=2480 Card=193301902913 Bytes=69588685048
         68)

  3    2       TABLE ACCESS (FULL) OF 'TABLEA' (Cost=727 Card=2
         48875 Bytes=7964000)

  4    2       TABLE ACCESS (FULL) OF 'TABLEB' (Cost=644
          Card=77
9648 Bytes=3118592)
Daniel Fink - 30 Aug 2005 20:56 GMT
Look at the BYTES and CARD of the HASH JOIN. A quick look tells me that
these numbers are way too huge! The number of bytes estimated for the
hash join is 6,958,868,504,868 (roughly 6.5 terabytes!).

My first thought is that a NESTED_LOOP would be a better operation. But
the optimizer is not choosing it, why? Are there indexes on a.sid and
b.sid? If not, you might look at them to see if a better execution plan
is the result.
fitzjarrell@cox.net - 30 Aug 2005 21:17 GMT
> Here is the explain plan.  I've tried 0=1 it doesn't help.  And yeah
> this is the only query that fills up temp.  Thanks.
[quoted text clipped - 16 lines]
>            Card=77
> 9648 Bytes=3118592)

You should seriously consider tracing the session executing this query,
as that is likely the best avenue you have to discover what exactly is
happening.  Also, you state the tables are analyzed, yet you make no
mention of how current the statistics are.  I would suggest you use
dbms_stats to compute the statistics rather than 'analyze table'; you
may also need histograms for table b as it may have a skewed
distribution with respect to the status value.  You've already
mentioned the SID column is indexed for both tables, however you're not
using them according to the plan generated; this may be due to
stale/incomplete statistics.  Again I suggest using
dbms_Stats.gather_Table_stats() to calculate your statistics, as
'analyze table', an inferior tool, is only provided for backward
compatibility.

What is the value for db_file_multiblock_read_count?  If this is too
large it will favor full table scans over index scans.  Another
parameter you need to examine is optimizer_index_cost_adj; by default
it's set to 100, equating an index scan to a full table scan.  You
might want to adjust this to a lower value (I set it to 15 in the
instances I administer).

Any number of reasons could be causing this problem.  I've provided
several possible areas for adjustment; take each in turn and test how
your query is affected.  My guess is one of the above mentioned
adjustments will improve your situation.

David Fitzjarrell
DA Morgan - 30 Aug 2005 21:33 GMT
> Here is the explain plan.  I've tried 0=1 it doesn't help.  And yeah
> this is the only query that fills up temp.  Thanks.
[quoted text clipped - 16 lines]
>            Card=77
> 9648 Bytes=3118592)

Cost 3162293791? Ouch.

Look at www.psoug.org
click on Morgan's Library
click on GROUP BY

I have one or two alternatives to using the GROUP BY
clause that might help.

Why no indexes with what look to be your primary key columns: sid?
Or are they there and you are still getting FTE?
Signature

Daniel A. Morgan
http://www.psoug.org
damorgan@x.washington.edu
(replace x with u to respond)

Jaap W. van Dijk - 30 Aug 2005 22:13 GMT
>I have this query which filled up 12GB of temp space.  tableA has 500K
>records where tableB has 5M records.  sid are indexed.  The 2 tables
[quoted text clipped - 9 lines]
>
>The database is 9.2 EE.  The involved tables are analyzed.

Firstly: how did you conclude that this query takes up so much temp
space?

Secondly: suppose two extreme scenarios:
1) sid's are unique in both tables. The hash-join would result in 500k
records at the most.
2) both tables have only one equal value for sid. The hash-join would
be equal to an cartesian product, which would produce 500k * 5M = 2.5
TB records.

So if your value distribution leans to the second scenario,  your hash
join, together with the group by sort,  would easily get 12 GB filled,
before the group by itself would make this smaller.

Jaap.
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.