Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / DB2 Topics / January 2006

Tip: Looking for answers? Try searching our database.

runstats sampling oddities

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
kenfar - 26 Jan 2006 03:33 GMT
I've got a large table on db2 8.2.1 that I rarely perform runstats on.
It has about 600 million rows organized in a single MDC time dimension
on a non-dpf warehouse.

Anyhow, we recently ran runstats on it with a 20% sampling.  After this
was performed the highly selective queries using the time dimension
slowed down drastically (8000% increase in duration).  Earlier today I
marked the table 'volatile' and queries went back to normal.  Tomorow
I'll do a non-sampling runstats - it'll take hours, and remove the
volatile flag.

I'm not sure where the delay was exactly - visual explain wasn't
helpful, nmon showed fine io throughput, but application snapshots
showed rows read & logical/physical reads just crawling.

What's going on here?  What was db2 up to - acquiring incorrect blocks
of data & throwing results away?   Is this a bug?  Or am I missing some
limitations or issues with runstats (or runstats & MDC)?

And yes, I will be applying fp10 as soon as fp11 comes out. :-)

Thanks in advance,

Ken Farmer
Mark A - 26 Jan 2006 03:49 GMT
> I've got a large table on db2 8.2.1 that I rarely perform runstats on.
> It has about 600 million rows organized in a single MDC time dimension
[quoted text clipped - 20 lines]
>
> Ken Farmer

On a table that large, once you execute runstats and everything seems to be
running as expected, then there is usually no good reason to run it again.
If you take that approach, then doing the non-sampling runstats should not
be a burden.
Serge Rielau - 26 Jan 2006 08:27 GMT
> I've got a large table on db2 8.2.1 that I rarely perform runstats on.
> It has about 600 million rows organized in a single MDC time dimension
[quoted text clipped - 16 lines]
>
> And yes, I will be applying fp10 as soon as fp11 comes out. :-)
There is an easy way to find out what happened: db2exfmt.
Note to know change its good to know the before picture as well.
There MUST have been a plan change.
Also since you are obviously not playing with a toy system I recommend
that you use db2exfmt. Visual explain is nice, but db2exfmt leaves thie
toy in the dust.
EXPLAIN PLAN FOR querystillgood
db2exfmt -d <dbname> -o good.exfmt -1
do your runstats (or flip of volatile, or whatever)
EXPLAIN PLAN FOR querynowbad
db2exmft -d <dbname> -o bad.exfmt -1

Then let's have a look. The problem will be staring in our faces.

Cheers
Serge
Signature

Serge Rielau
DB2 Solutions Development
DB2 UDB for Linux, Unix, Windows
IBM Toronto Lab

kenfar - 26 Jan 2006 17:47 GMT
I've got two scenarios then for this simple query:
  good:
     - volatile
     - stats collected with 20% sampling
     - takes about 35 seconds to export 1.9 m rows
  bad:
     - non-volatile
     - stats collected with 20% sampling
     - takes about 60 minutes to export 1.9 m rows

******************************************************************
Here's the good explain:
******************** EXPLAIN INSTANCE ********************

DB2_VERSION:            08.02.1
SOURCE_NAME:            SQLC2E06
SOURCE_SCHEMA:          NULLID
SOURCE_VERSION:
EXPLAIN_TIME:           2006-01-26-16.55.05.287940
EXPLAIN_REQUESTER:      DBA

Database Context:
----------------
       Parallelism:            Intra-Partition Parallelism
       CPU Speed:              5.668131e-07
       Comm Speed:             2
       Buffer Pool size:       17000
       Sort Heap size:         10000
       Database Heap size:     8000
       Lock List size:         100
       Maximum Lock List:      10
       Average Applications:   1
       Locks Available:        640

Package Context:
---------------
       SQL Type:               Dynamic
       Optimization Level:     5
       Blocking:               Block All Cursors
       Isolation Level:        Cursor Stability

---------------- STATEMENT 1  SECTION 203 ----------------
       QUERYNO:                1
       QUERYTAG:
       Statement Type:         Select
       Updatable:              No
       Deletable:              No
       Query Degree:           -1

Original Statement:
------------------
SELECT 1
FROM sware.mf_trm_nid_3
WHERE sensor_time_id = 1485

Optimized Statement:
-------------------
SELECT 1
FROM SWARE.MF_TRM_NID_3 AS Q1
WHERE (Q1.SENSOR_TIME_ID = 1485)

Access Plan:
-----------
       Total Cost:             1.25461e+07
       Query Degree:           4

            Rows
           RETURN
           (   1)
            Cost
             I/O
             |
           853509
           LTQ
           (   2)
         1.25461e+07
           508785
             |
           853509
           FETCH
           (   3)
         1.25457e+07
           508785
          /---+---\
     236.513    6.38425e+08
     IXSCAN   TABLE: SWARE
     (   4)    MF_TRM_NID_3
     25.0285
        1
       |
   6.38425e+08
 INDEX: SYSIBM
SQL0402251807242

       1) RETURN: (Return Result)
               Cumulative Total Cost:          1.25461e+07
               Cumulative CPU Cost:            6.17343e+09
               Cumulative I/O Cost:            508785
               Cumulative Re-Total Cost:       1.25457e+07
               Cumulative Re-CPU Cost:         5.5809e+09
               Cumulative Re-I/O Cost:         508785
               Cumulative First Row Cost:      78.0764
               Estimated Bufferpool Buffers:   508786

               Arguments:
               ---------
               BLDLEVEL: (Build level)
                       DB2 v8.1.1.80 : s041221
               STMTHEAP: (Statement heap size)
                       4096

               Input Streams:
               -------------
                       5) From Operator #2

                               Estimated number of rows:       853509
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q2.$C0

       2) TQ    : (Table Queue)
               Cumulative Total Cost:          1.25461e+07
               Cumulative CPU Cost:            6.17343e+09
               Cumulative I/O Cost:            508785
               Cumulative Re-Total Cost:       1.25457e+07
               Cumulative Re-CPU Cost:         5.5809e+09
               Cumulative Re-I/O Cost:         508785
               Cumulative First Row Cost:      78.0764
               Estimated Bufferpool Buffers:   508786

               Arguments:
               ---------
               LISTENER: (Listener Table Queue type)
                       FALSE
               TQ TYPE : (Table queue type)
                       LOCAL
               TQDEGREE: (Degree of Intra-Partition parallelism)
                       4
               TQMERGE : (Merging Table Queue flag)
                       FALSE
               TQREAD  : (Table Queue Read type)
                       READ AHEAD
               UNIQUE  : (Uniqueness required flag)
                       FALSE

               Input Streams:
               -------------
                       4) From Operator #3

                               Estimated number of rows:       853509
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q2.$C0

               Output Streams:
               --------------
                       5) To Operator #1

                               Estimated number of rows:       853509
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q2.$C0

       3) FETCH : (Fetch)
               Cumulative Total Cost:          1.25457e+07
               Cumulative CPU Cost:            5.58093e+09
               Cumulative I/O Cost:            508785
               Cumulative Re-Total Cost:       1.25457e+07
               Cumulative Re-CPU Cost:         5.5809e+09
               Cumulative Re-I/O Cost:         508785
               Cumulative First Row Cost:      77.9839
               Estimated Bufferpool Buffers:   508786

               Arguments:
               ---------
               BLKLOCK : (Block Lock intent)
                       INTENT SHARE
               MAXPAGES: (Maximum pages for prefetch)
                       1
               MAXPAGES: (Maximum pages for prefetch)
                       1
               PREFETCH: (Type of Prefetch)
                       NONE
               ROWLOCK : (Row Lock intent)
                       NEXT KEY SHARE
               TABLOCK : (Table Lock intent)
                       INTENT SHARE
               TBISOLVL: (Table access Isolation Level)
                       CURSOR STABILITY

               Input Streams:
               -------------
                       2) From Operator #4

                               Estimated number of rows:       236.513
                               Number of columns:              0
                               Subquery predicate ID:          Not
Applicable

                       3) From Object SWARE.MF_TRM_NID_3

                               Estimated number of rows:
6.38425e+08
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q1.$RID$

               Output Streams:
               --------------
                       4) To Operator #2

                               Estimated number of rows:       853509
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q2.$C0

       4) IXSCAN: (Index Scan)
               Cumulative Total Cost:          25.0285
               Cumulative CPU Cost:            50245.5
               Cumulative I/O Cost:            1
               Cumulative Re-Total Cost:       0.00957301
               Cumulative Re-CPU Cost:         16889.2
               Cumulative Re-I/O Cost:         0
               Cumulative First Row Cost:      25.0149
               Estimated Bufferpool Buffers:   2

               Arguments:
               ---------
               BLKLOCK : (Block Lock intent)
                       INTENT SHARE
               MAXPAGES: (Maximum pages for prefetch)
                       1
               PREFETCH: (Type of Prefetch)
                       NONE
               ROWLOCK : (Row Lock intent)
                       NEXT KEY SHARE
               SCANDIR : (Scan Direction)
                       FORWARD
               SCANGRAN: (Intra-Partition Parallelism Scan
Granularity)
                       1
               SCANTYPE: (Intra-Partition Parallelism Scan Type)
                       LOCAL PARALLEL
               TABLOCK : (Table Lock intent)
                       INTENT SHARE
               VOLATILE: (Volatile type)
                       CARDINALITY

               Predicates:
               ----------
               2) Start Key Predicate
                       Relational Operator:            Equal (=)
                       Subquery Input Required:        No
                       Filter Factor:                  0.0013369

                       Predicate Text:
                       --------------
                       (Q1.SENSOR_TIME_ID = 1485)

               2) Stop Key Predicate
                       Relational Operator:            Equal (=)
                       Subquery Input Required:        No
                       Filter Factor:                  0.0013369

                       Predicate Text:
                       --------------
                       (Q1.SENSOR_TIME_ID = 1485)

               Input Streams:
               -------------
                       1) From Object SYSIBM.SQL040225180724200

                               Estimated number of rows:
6.38425e+08
                               Number of columns:              2
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q1.$BLOCKID$+Q1.SENSOR_TIME_ID

               Output Streams:
               --------------
                       2) To Operator #3

                               Estimated number of rows:       236.513
                               Number of columns:              0
                               Subquery predicate ID:          Not
Applicable

Objects Used in Access Plan:
---------------------------

       Schema: SYSIBM
       Name:   SQL040225180724200
       Type:   Index
                       Time of creation:
2004-02-25-18.07.24.131547
                       Last statistics update:
2006-01-17-19.15.49.725655
                       Number of columns:              1
                       Number of rows:                 638424927
                       Width of rows:                  -1
                       Number of buffer pool pages:    5661216
                       Distinct row values:            No
                       Tablespace name:                TS_INDEXES
                       Tablespace overhead:            24.100000
                       Tablespace transfer rate:       0.900000
                       Source for statistics:          Single Node
                       Prefetch page count:            192
                       Container extent page count:    32
                       Index clustering statistic:     100.000000
                       Index leaf pages:               40
                       Index tree levels:              2
                       Index full key cardinality:     747
                       Index first key cardinality:    747
                       Index first 2 keys cardinality: -1
                       Index first 3 keys cardinality: -1
                       Index first 4 keys cardinality: -1
                       Index sequential pages:         17
                       Index page density:             47
                       Index avg sequential pages:     17
                       Index avg gap between sequences:0
                       Index avg random pages:         20
                       Fetch avg sequential pages:     -1
                       Fetch avg gap between sequences:-1
                       Fetch avg random pages:         -1
                       Index RID count:                176914
                       Index deleted RID count:        2
                       Index empty leaf pages:         0
                       Base Table Schema:              SWARE
                       Base Table Name:                MF_TRM_NID_3
                       Columns in index:
                               SENSOR_TIME_ID

       Schema: SWARE
       Name:   MF_TRM_NID_3
       Type:   Table
                       Time of creation:
2004-02-25-18.07.24.131547
                       Last statistics update:
2006-01-17-19.15.49.725655
                       Number of columns:              38
                       Number of rows:                 638424927
                       Width of rows:                  20
                       Number of buffer pool pages:    5661216
                       Distinct row values:            No
                       Tablespace name:                TS_FACT1
                       Tablespace overhead:            24.100000
                       Tablespace transfer rate:       0.900000
                       Source for statistics:          Single Node
                       Prefetch page count:            128
                       Container extent page count:    32
                       Table overflow record count:    374897346
                       Table Active Blocks:            176912

******************************************************************
and here's the bad one
******************** EXPLAIN INSTANCE ********************

DB2_VERSION:            08.02.1
SOURCE_NAME:            SQLC2E06
SOURCE_SCHEMA:          NULLID
SOURCE_VERSION:
EXPLAIN_TIME:           2006-01-26-17.40.19.135670
EXPLAIN_REQUESTER:      DBA

Database Context:
----------------
       Parallelism:            Intra-Partition Parallelism
       CPU Speed:              5.668131e-07
       Comm Speed:             2
       Buffer Pool size:       17000
       Sort Heap size:         10000
       Database Heap size:     8000
       Lock List size:         100
       Maximum Lock List:      10
       Average Applications:   1
       Locks Available:        640

Package Context:
---------------
       SQL Type:               Dynamic
       Optimization Level:     5
       Blocking:               Block All Cursors
       Isolation Level:        Cursor Stability

---------------- STATEMENT 1  SECTION 203 ----------------
       QUERYNO:                1
       QUERYTAG:
       Statement Type:         Select
       Updatable:              No
       Deletable:              No
       Query Degree:           -1

Original Statement:
------------------
SELECT 1
FROM sware.mf_trm_nid_3
WHERE sensor_time_id = 1485

Optimized Statement:
-------------------
SELECT 1
FROM SWARE.MF_TRM_NID_3 AS Q1
WHERE (Q1.SENSOR_TIME_ID = 1485)

Access Plan:
-----------
       Total Cost:             9.53175e+06
       Query Degree:           4

     Rows
    RETURN
    (   1)
     Cost
      I/O
      |
    853509
    LTQ
    (   2)
  9.53175e+06
  5.66119e+06
      |
    853509
    TBSCAN
    (   3)
  9.53141e+06
  5.66119e+06
      |
  6.38425e+08
TABLE: SWARE
 MF_TRM_NID_3

       1) RETURN: (Return Result)
               Cumulative Total Cost:          9.53175e+06
               Cumulative CPU Cost:            3.05352e+11
               Cumulative I/O Cost:            5.66119e+06
               Cumulative Re-Total Cost:       9.53141e+06
               Cumulative Re-CPU Cost:         3.0476e+11
               Cumulative Re-I/O Cost:         5.66119e+06
               Cumulative First Row Cost:      64.8634
               Estimated Bufferpool Buffers:   5.66118e+06

               Arguments:
               ---------
               BLDLEVEL: (Build level)
                       DB2 v8.1.1.80 : s041221
               STMTHEAP: (Statement heap size)
                       4096

               Input Streams:
               -------------
                       3) From Operator #2

                               Estimated number of rows:       853509
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q2.$C0

       2) TQ    : (Table Queue)
               Cumulative Total Cost:          9.53175e+06
               Cumulative CPU Cost:            3.05352e+11
               Cumulative I/O Cost:            5.66119e+06
               Cumulative Re-Total Cost:       9.53141e+06
               Cumulative Re-CPU Cost:         3.0476e+11
               Cumulative Re-I/O Cost:         5.66119e+06
               Cumulative First Row Cost:      64.8634
               Estimated Bufferpool Buffers:   5.66118e+06

               Arguments:
               ---------
               LISTENER: (Listener Table Queue type)
                       FALSE
               TQ TYPE : (Table queue type)
                       LOCAL
               TQDEGREE: (Degree of Intra-Partition parallelism)
                       4
               TQMERGE : (Merging Table Queue flag)
                       FALSE
               TQREAD  : (Table Queue Read type)
                       READ AHEAD
               UNIQUE  : (Uniqueness required flag)
                       FALSE

               Input Streams:
               -------------
                       2) From Operator #3

                               Estimated number of rows:       853509
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q2.$C0

               Output Streams:
               --------------
                       3) To Operator #1

                               Estimated number of rows:       853509
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q2.$C0

       3) TBSCAN: (Table Scan)
               Cumulative Total Cost:          9.53141e+06
               Cumulative CPU Cost:            3.0476e+11
               Cumulative I/O Cost:            5.66119e+06
               Cumulative Re-Total Cost:       9.53141e+06
               Cumulative Re-CPU Cost:         3.0476e+11
               Cumulative Re-I/O Cost:         5.66119e+06
               Cumulative First Row Cost:      64.7709
               Estimated Bufferpool Buffers:   5.66118e+06

               Arguments:
               ---------
               BLKLOCK : (Block Lock intent)
                       INTENT SHARE
               MAXPAGES: (Maximum pages for prefetch)
                       ALL
               PREFETCH: (Type of Prefetch)
                       SEQUENTIAL
               ROWLOCK : (Row Lock intent)
                       NEXT KEY SHARE
               SCANDIR : (Scan Direction)
                       FORWARD
               SCANGRAN: (Intra-Partition Parallelism Scan
Granularity)
                       32
               SCANTYPE: (Intra-Partition Parallelism Scan Type)
                       LOCAL PARALLEL
               SCANUNIT: (Intra-Partition Parallelism Scan Unit)
                       PAGE
               TABLOCK : (Table Lock intent)
                       INTENT SHARE
               TBISOLVL: (Table access Isolation Level)
                       CURSOR STABILITY

               Predicates:
               ----------
               2) Block Sarg Predicate
                       Relational Operator:            Equal (=)
                       Subquery Input Required:        No
                       Filter Factor:                  0.0013369

                       Predicate Text:
                       --------------
                       (Q1.SENSOR_TIME_ID = 1485)

               Input Streams:
               -------------
                       1) From Object SWARE.MF_TRM_NID_3

                               Estimated number of rows:
6.38425e+08
                               Number of columns:              2
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q1.$RID$+Q1.SENSOR_TIME_ID

               Output Streams:
               --------------
                       2) To Operator #2

                               Estimated number of rows:       853509
                               Number of columns:              1
                               Subquery predicate ID:          Not
Applicable

                               Column Names:
                               ------------
                               +Q2.$C0

Objects Used in Access Plan:
---------------------------

       Schema: SWARE
       Name:   MF_TRM_NID_3
       Type:   Table
                       Time of creation:
2004-02-25-18.07.24.131547
                       Last statistics update:
2006-01-17-19.15.49.725655
                       Number of columns:              38
                       Number of rows:                 638424927
                       Width of rows:                  20
                       Number of buffer pool pages:    5661216
                       Distinct row values:            No
                       Tablespace name:                TS_FACT1
                       Tablespace overhead:            24.100000
                       Tablespace transfer rate:       0.900000
                       Source for statistics:          Single Node
                       Prefetch page count:            128
                       Container extent page count:    32
                       Table overflow record count:    374897346
                       Table Active Blocks:            176912
Phil Sherman - 28 Jan 2006 14:42 GMT
I can see a major difference just looking at the access plans. (This
doesn't include a full analysis of the detail information available.)
Your "good" run used an index to locate rows that satisfied the
predicate. The "bad" one used a tablespace scan to read the entire table
looking for qualifying rows.

This difference should have been easily seen in a visual explain. If you
still have the two statements explained, then I'd go back to the visual
tool and see if you can spot the difference.

The optimizer's decision to scan or use an index uses, as one of its
factors, its estimate of the total number of I/Os needed to satisfy the
query. Scanning always reads multiple pages with each I/O; indexes can
be assumed to read one data page per row. The optimizer incorrectly
(based on your actual results) assumed around 850k rows would be
retrieved. I'd guess that the partial statistics you gathered tipped the
optimizer into deciding that the tablespace scan would be faster. The
VOLATILE table setting will cause the optimizer to use the index even
though it would prefer to use a tablespace scan. (See ALTER TABLE in SQL
Reference V2.)

My experience with large tables and this type of query is that index
access becomes almost a certainty when the table is clustered to match
the index. The statistics must also indicate that the table is currently
clustered and not in need of reorganization.

Phil Sherman

> I've got two scenarios then for this simple query:
>    good:
[quoted text clipped - 79 lines]
>  TABLE: SWARE
>   MF_TRM_NID_3
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.