>I have a big table (20M rows) whose primary key is composed of 2
> columns. The other 10 columns are all double values. The queries on
[quoted text clipped - 8 lines]
>
> Thanks!
1. What is the ratio of selects, versus inserts/updates/deletes.
2. How may rows are typically accessed in each select statement (assuming we
are talking about the selects where you want to see all the columns).
3. How large is your bufferpool(s)
Henry J. - 29 Sep 2008 03:52 GMT
> >I have a big table (20M rows) whose primary key is composed of 2
> > columns. The other 10 columns are all double values. The queries on
[quoted text clipped - 13 lines]
> are talking about the selects where you want to see all the columns).
> 3. How large is your bufferpool(s)
During each batch run, the table is truncated and then populated, and
finally queried (by aggregation type of queries). There is no update
or delete.
The queries can be queried by several jobs each accessing different
number of rows (varied from 10% to 80%).
Not sure about the bufferpool.
To benefit from putting all columns into "include", what should I ask
our DBAs to do? Increate table space? Bufferpool? Please advise.
Thanks!
Mark A - 29 Sep 2008 03:58 GMT
> To benefit from putting all columns into "include", what should I ask
> our DBAs to do? Increate table space? Bufferpool? Please advise.
>
>Thanks!
It probably will help your select performance, but it will take up more disk
space (which I assume you are OK with).
Henry J. - 29 Sep 2008 04:10 GMT
> > To benefit from putting all columns into "include", what should I ask
> > our DBAs to do? Increate table space? Bufferpool? Please advise.
[quoted text clipped - 3 lines]
> It probably will help your select performance, but it will take up more disk
> space (which I assume you are OK with).
Thanks. I guess the insertion will be slower, right? Let me test it
out.
Henry J. - 29 Sep 2008 04:12 GMT
> > To benefit from putting all columns into "include", what should I ask
> > our DBAs to do? Increate table space? Bufferpool? Please advise.
[quoted text clipped - 3 lines]
> It probably will help your select performance, but it will take up more disk
> space (which I assume you are OK with).
Thanks. I guess insertion will be slower, right?
> I have a big table (20M rows) whose primary key is composed of 2
> columns. The other 10 columns are all double values. The queries on
> the table are mostly aggregating on the value columns.
>
> In order to improve performance, is it a good idea to add all the 10
> value columns to the "include" part of the primary key?
I tried it on a ~100M rows table and it significantly improved
performance (factor 3...10, don't remember)
> I guess the disk usage of the table will double as the value columns
> are duplicated in the index. Is this right? Would the select
> performance be improved significantly?
Disk usage will increase significantly, obviously.
Henry J. - 29 Sep 2008 04:10 GMT
On Sep 28, 2:16 pm, w.l.fisc...@googlemail.com wrote:
> > I have a big table (20M rows) whose primary key is composed of 2
> > columns. The other 10 columns are all double values. The queries on
[quoted text clipped - 11 lines]
>
> Disk usage will increase significantly, obviously.
Thanks.
DA Morgan - 29 Sep 2008 07:24 GMT
>> I have a big table (20M rows) whose primary key is composed of 2
>> columns. The other 10 columns are all double values. The queries on
[quoted text clipped - 11 lines]
>
> Disk usage will increase significantly, obviously.
Doesn't DB2 have a table or index type comparable to Oracle's
Index Organized Table or SQL Server's Cluster Index where that
would not be the case?

Signature
Daniel A. Morgan
University of Washington
damorgan@x.washington.edu (replace x with u to respond)
Mark A - 29 Sep 2008 08:13 GMT
> Doesn't DB2 have a table or index type comparable to Oracle's
> Index Organized Table or SQL Server's Cluster Index where that
> would not be the case?
DB2 has index clustering which can specify the order of the table and
multi-dimensional clustering (MDC), but that affects the physical ordering
of the data, and not whether it exists in the index and the table. In DB2,
the table always has all the columns even if all the columns are also in one
of the indexes, and even if clustering is used. In DB2 clustering (ordering
of the table data) is usually implemented of a subset of the columns, as
defined in one of the indexes, and not on the entire the table (usually).
And in DB2 clustering is not exact, so insert performance is much better
than index ordered tables because there are no page splits and no
rearranging of pages (when a clustered index is defined, DB2 looks for the
best page to place a data row, not necessary the exact sequence in the data
page, and if there is no room on the correct page, then a nearby page, or
eventually any page is chosen for the insert--but MDC is different and bit
more complicated).
By "including" the extra non-unique columns in the unique index (and not
affecting the original uniqueness of the index based on the PK) he will get
index-only access on his selects. The table will not need to be accessed for
selects.
In this particular situation, it will require about 1.6 GB of extra disk
space.
DA Morgan - 29 Sep 2008 13:30 GMT
> In this particular situation, it will require about 1.6 GB of extra disk
> space.
Thank you.
Given the cost of disk these days not something I would expect anyone
to get excited about.

Signature
Daniel A. Morgan
University of Washington
damorgan@x.washington.edu (replace x with u to respond)
Henry J. - 29 Sep 2008 23:35 GMT
> > In this particular situation, it will require about 1.6 GB of extra disk
> > space.
[quoted text clipped - 7 lines]
> University of Washington
> damor...@x.washington.edu (replace x with u to respond)
So the cost of disk is increasing these days? I thought it should
drop over time.
Mark A - 30 Sep 2008 00:02 GMT
> So the cost of disk is increasing these days? I thought it should
> drop over time.
He said it is not enough space to get excited about (not enough to worry
about).
Serge Rielau - 30 Sep 2008 11:51 GMT
> Doesn't DB2 have a table or index type comparable to Oracle's
> Index Organized Table or SQL Server's Cluster Index where that
> would not be the case?
I do not know SQL Server's Cluster Index, can't comment there.
DB2 does not have an "index only table".
There are two table types that have "some" overlap:
* Range Clustered Tables (RCT). These are essentially arrays on disk.
Typically used for for static look up tables
* Multi dimensional clustering. (MDC)
Organizing the data in an n-dimensional cube.
The individual blcoks are indexed instead of the rows and located by
intersecting (or or-ing) the dimensions. As a result block indexes are
tiny with one entry potentially covering thousands of rows.
So in a way both these approaches are "index-less" tables. The end
result is the same: Fast access without duplication of data.
Cheers
Serge

Signature
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab