Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / Informix Topics / January 2004

Tip: Looking for answers? Try searching our database.

Hash Tables

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Wilkinson, Marcus - 28 Jan 2004 23:36 GMT
This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

------_=_NextPart_001_01C3E5F7.95E57D11
Content-Type: text/plain

Hi,

Could anyone explain to me exactly what happens when you perform a hash join
between multiple tables.   The Informix doc I have read only references a
two table join.  It says that a hash table is built of the driving table and
matched against hash values of the second table.   What happens if there is
a third and fourth table?

Would more than one hash table be built?

How can I determine how big the hash table(s) will be and how much memory
they will consume.

Thanks for your help.

Marcus Wilkinson

------_=_NextPart_001_01C3E5F7.95E57D11
Content-Type: text/html

<html>

<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">

<meta name=Generator content="Microsoft Word 11 (filtered)">
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0cm;
    margin-bottom:.0001pt;
    font-size:12.0pt;
    font-family:"Times New Roman";}
a:link, span.MsoHyperlink
    {color:blue;
    text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
    {color:purple;
    text-decoration:underline;}
p
    {margin-right:0cm;
    margin-left:0cm;
    font-size:12.0pt;
    font-family:"Times New Roman";}
span.EmailStyle17
    {font-family:Arial;
    color:windowtext;}
@page Section1
    {size:595.3pt 841.9pt;
    margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.Section1
    {page:Section1;}
-->
</style>

</head>

<body lang=EN-GB link=blue vlink=purple>

<div class=Section1>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Hi,</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>&nbsp;</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Could anyone explain to me exactly what happens when you
perform a hash join between multiple tables.&nbsp;&nbsp; The Informix doc I
have read only references a two table join.&nbsp; It says that a hash table is
built of the driving table and matched against hash values of the second
table.&nbsp;&nbsp; What happens if there is a third and fourth table?</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>&nbsp;</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Would more than one hash table be built?</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>&nbsp;</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>How can I determine how big the hash table(s) will be and
how much memory they will consume.</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>&nbsp;</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Thanks for your help.</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>&nbsp;</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>&nbsp;</span></font></p>

<p style='margin:0cm;margin-bottom:.0001pt'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'>Marcus Wilkinson <br>
<br>
</span></font></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>&nbsp;</span></font></p>

</div>

</body>

</html>

------_=_NextPart_001_01C3E5F7.95E57D11--
sending to informix-list
sending to informix-list
Curtis Crowson - 29 Jan 2004 19:34 GMT
> Could anyone explain to me exactly what happens when you perform a hash join
> between multiple tables.   The Informix doc I have read only references a
> two table join.  It says that a hash table is built of the driving table and
> matched against hash values of the second table.   What happens if there is
> a third and fourth table?

This is how I under stand it. It may be wrong because I haven't seen
any documentation to support it. ( I guess it would be better to know
the answer when answering a question, but it doesn't seem to bother me
at all. ;-) )

Normal index access happens in C * log of A( N ) time on average.
Where A is the number of keys in each leaf ( index page )and N is the
number of records in the table. C is of course the constant for how
long it takes to look at each page. So for a nested loop join of 2
1000 row tables returning 100 rows in the driver table you have a cost
of 100 * C log of A*(N ) for the join.

Now what is going on for hash joins and why it is a good thing for
large joins.

the cost of calculating a hash on a table on a given table is c*N. The
cost of calculating a hash on two tables assuming two threads is still
c*N. What happens for a hash join is that informix creates a temp
memory area where the join key is hashed into a slot with the address
of the record on the device. Then it calculates the hash for the other
table and adds this to the hash table. Joined keys will of course hash
into the same slot so all informix has to do at this point is read the
hash table because joined columns will all be in the same slot. I
would think that this would take 3*c*N which is smaller than 2* NJ * C
* log of A( N ) for some N and NJ. Where NJ is the number of join
records.

Please correct me if you know for sure.

so I suspect the size of the hash table would be keysize and device
address size * ( rows in table one and rows in table two )
Jack Parker - 30 Jan 2004 12:25 GMT
hash size is purportedly 32 + keysize + rowsize (of the selected data).

Interesting formula you've got there, I'll have to take a closer look.

cheers
j.

----- Original Message -----
From: "Curtis Crowson" <curtis@crowson1.com>
To: <informix-list@iiug.org>
Sent: Thursday, January 29, 2004 3:34 PM
Subject: Re: Hash Tables

> > Could anyone explain to me exactly what happens when you perform a hash
join
> > between multiple tables.   The Informix doc I have read only references
a
> > two table join.  It says that a hash table is built of the driving table
and
> > matched against hash values of the second table.   What happens if there
is
> > a third and fourth table?
>
[quoted text clipped - 29 lines]
> so I suspect the size of the hash table would be keysize and device
> address size * ( rows in table one and rows in table two )

sending to informix-list
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.