ok, the book I am reading discusses this type of query:
SELECT p.pname
FROM P, SH, S
WHERE P.pnum = SH.pnum
AND SH.snum = S.snum
AND S.city = 'NY';
and it compares whether to do the full join first then select, or the
other way round.
Anyway the example first joins S and SH over the common column to form
a table TempA, and then sorts the table? The sorting actually takes up
most of the time >80% of total cost.
Is sorting of a temporary table a necessary step in an execution plan?
-CELKO- - 21 Aug 2008 23:02 GMT
>> Is sorting of a temporary table a necessary step in an execution plan? <<
No, many products use hashing instead of sequential access and
merges.
I would also have done this query like so in a conventional SQL engine
built on top of a file system with indexes:
1) Build temp table using (S.city = 'NY') -- should be pretty small
subset, but if it is not indexed, you may have to scan the table.
2) (SH.snum = S.snum) with an index on what looks like PK-FK. In some
products that relationship would be in a linked list structure and be
really fast.
3) (P.pnum = SH.pnum) with an index
In short, the answer is that it all depends...
Darren - 21 Aug 2008 23:46 GMT
> Is sorting of a temporary table a necessary step in an execution plan?
Found this on DB2 site regarding sorting decisions:
"When it evaluates a nested loop join, the optimizer also decides
whether to sort the outer table before performing the join. If it
orders the outer table, based on the join columns, the number of read
operations to access pages from disk for the inner table might be
reduced, because they are more likely to be be in the buffer pool
already. If the join uses a highly clustered index to access the inner
table and if the outer table has been sorted, the number of index
pages accessed might be minimized.
In addition, if the optimizer expects that the join will make a later
sort more expensive, it might also choose to perform the sort before
the join. A later sort might be required to support a GROUP BY,
DISTINCT, ORDER BY or merge join."
http://publib.boulder.ibm.com/infocenter/db2luw/v8/topic/com.ibm.db2.udb.doc/adm
in/c0005311.htm
Tim X - 22 Aug 2008 05:13 GMT
> ok, the book I am reading discusses this type of query:
>
[quoted text clipped - 12 lines]
>
> Is sorting of a temporary table a necessary step in an execution plan?
As is often the case, it depends on the database engine. There is no 'for all
cases do x then y'.
There is an interesting book called "databases for smarties' or
something similar, which shows how much variation you can get in sql
statements on different database engines. In some cases, even the order
of the components of the where clause can have significant differences
in either performance or results (particularly with respect to outer
joins apparently). these differences can exist even when the different
engines supposedly support the same SQL standard.
Tim

Signature
tcross (at) rapttech dot com dot au