Database Forum / DB2 Topics / September 2006
Problem with DB2_USE_ALTERNATE_PAGE_CLEANING
|
|
Thread rating:  |
Hemant Shah - 07 Sep 2006 22:22 GMT Folks,
I spent better part of morning debugging a problem and it turned out to be DB2_USE_ALTERNATE_PAGE_CLEANING registry variable.
I am running DB2 8.2 on AIX 5.3 system:
# oslevel 5.3.0.0
# db2level DB21085I Instance "db2test" uses "32" bits and DB2 code release "SQL08023" with level identifier "03040106". Informational tokens are "DB2 v8.1.1.96", "s050811", "U803920", and FixPak "10". Product is installed at "/usr/opt/db2_08_01".
I had set DB2_USE_ALTERNATE_PAGE_CLEANING to see if it made any difference in performance, and forgot about it because aour test environment was not being used for several weeks. Today I tried to load large amount of data into the database using a C program. It reads a file in propriety format and INSERTs data into the table. It COMMITS data every 5000 rows and if it gets error it will do ROLLBACK and exit.
Some of the smaller tables loaded without any problem, but larger tables got SQL -964 error (transaction log full).
Couple things happened when I had set DB2_USE_ALTERNATE_PAGE_CLEANING to YES:
1) I would get -964 after inserting 30000 rows, I only tested this with one file and it always got -964 somewhere between 30000 and 35000 rows.
2) My last commit is done at 30000th row, then after that I get -964 my program does ROLLBACK and exits. When I run SELECT COUNT(*) on the table I have little more than 32000 rows. How did the extra rows got commited? The table should only have 30000 rows. I start out with an empty table.
When I unset DB2_USE_ALTERNATE_PAGE_CLEANING registry variable, everything works fine.
Is this a bug or a feature?
The DB2 documentation on DB2_USE_ALTERNATE_PAGE_CLEANING does not clarify this.
Here is what I found on IBM site:
DB2_USE_ALTERNATE_PAGE_CLEANING: Default=not set: Values: ON, OFF
Specifies whether DB2 uses the alternate method of page cleaning algorithms instead of the default method of page cleaning. When this variable is set to "ON," DB2 uses a proactive method of page cleaning, writing changed pages to disk, keeping ahead of LSN_GAP, and proactively finding victims. Doing this allows the page cleaners to better utilize available disk I/O bandwidth.
When this variable is set to "ON," the chngpgs_thresh database configuration parameter is no longer relevant because it does not control page cleaner activity.
Thanks for you help.
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
Mark A - 07 Sep 2006 23:29 GMT > Folks, > [quoted text clipped - 69 lines] > > Thanks for you help. You would be much better off if you commit every 1000 rows instead of 30,000. Many people over-estimate the cost of a commit, and DB2 will write the Log Buffer to disk anyway when the log buffer is full, or every one second, even if you don't take a commit. So putting off the commit for 30,000 inserts does not really help you, and could actually slow things down.
You also should make sure that your LOGBUFSZ is set to at least 128 pages, and maybe a bit larger (default is a pitiful 8 pages). But don't make it too large because it could actually slow things down a bit if more than 512 pages.
With a more frequent commit interval you will not run out of log space and you will improve performance, so the question you posted becomes moot and you can spend your time on more productive matters.
Hemant Shah - 08 Sep 2006 03:59 GMT >> Folks, >> [quoted text clipped - 76 lines] > 30,000 inserts does not really help you, and could actually slow things > down. I commit every 5000 rows, I even tried commit at 1000 rows with -964 error. If I unset DB2_USE_ALTERNATE_PAGE_CLEANING then I do not have problem committing every 5000 rows.
> You also should make sure that your LOGBUFSZ is set to at least 128 pages, > and maybe a bit larger (default is a pitiful 8 pages). But don't make it too > large because it could actually slow things down a bit if more than 512 > pages. LOGBUFSZ is set to 128.
> With a more frequent commit interval you will not run out of log space and > you will improve performance, so the question you posted becomes moot and > you can spend your time on more productive matters. I only run out of log space if DB2_USE_ALTERNATE_PAGE_CLEANING is set to yes.
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
Hemant Shah - 08 Sep 2006 04:48 GMT > Folks, > [quoted text clipped - 59 lines] > > Thanks for you help. I saw following message several times in my db2diag.log file. I have 4 CPU system and NUM_IOCLEANERS and NUM_IOSERVERS are set to 5, and SOFTMAX is set
2006-09-07-12.35.56.188413-240 E7527178C666 LEVEL: Warning PID : 971142 TID : 1 PROC : db2agent (CFG) 0 INSTANCE: db2prod NODE : 000 DB : CFG APPHDL : 0-18 APPID: *LOCAL.db2prod.060907163548 FUNCTION: DB2 UDB, data protection, sqlpgResSpace, probe:1660 MESSAGE : ADM1822W The active log is being held by dirty pages. This is not an error, but database performance may be impacted. If possible, reduce the database work load. If this problem persists, either decrease the SOFTMAX and/or increase the NUM_IOCLEANERS DB configuration parameters.
2006-09-07-12.35.56.189330-240 E7527845C501 LEVEL: Error PID : 971142 TID : 1 PROC : db2agent (CFG) 0 INSTANCE: db2prod NODE : 000 DB : CFG APPHDL : 0-18 APPID: *LOCAL.db2prod.060907163548 FUNCTION: DB2 UDB, data protection, sqlpgResSpace, probe:2860 MESSAGE : ADM1823E The active log is full and is held by application handle "18". Terminate this application by COMMIT, ROLLBACK or FORCE APPLICATION.
2006-09-07-12.35.56.189587-240 I7528347C466 LEVEL: Error PID : 971142 TID : 1 PROC : db2agent (CFG) 0 INSTANCE: db2prod NODE : 000 DB : CFG APPHDL : 0-18 APPID: *LOCAL.db2prod.060907163548 FUNCTION: DB2 UDB, data protection, sqlpWriteLR, probe:6680 RETCODE : ZRC=0x85100009=-2062548983=SQLP_NOSPACE "Log File has reached its saturation point" DIA8309C Log file was full.
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
Mark A - 08 Sep 2006 05:49 GMT > I saw following message several times in my db2diag.log file. I have 4 CPU > system and NUM_IOCLEANERS and NUM_IOSERVERS are set to 5, and SOFTMAX is > set I would change the NUM_IOCLEANERS and NUM_IOSERVERS to 4.
The default for SOFTMAX is 100. If it is not set to 100, I would try changing it back to the default.
Make sure MINCOMMIT is 1.
If you are still having problems, then change the DB2_USE_ALTERNATE_PAGE_CLEANING to OFF (db2set) and change CHNGPGS_THRESH to 20 (default is 60) in the database cfg, which will also speed up dirty page cleaning.
What size are your bufferpools? (Select * from syscat.bufferpools).
Hemant Shah - 10 Sep 2006 22:10 GMT >> I saw following message several times in my db2diag.log file. I have 4 CPU >> system and NUM_IOCLEANERS and NUM_IOSERVERS are set to 5, and SOFTMAX is [quoted text clipped - 4 lines] > The default for SOFTMAX is 100. If it is not set to 100, I would try > changing it back to the default. Percent log file reclaimed before soft chckpt (SOFTMAX) = 100
> Make sure MINCOMMIT is 1. Group commit count (MINCOMMIT) = 1
> If you are still having problems, then change the > DB2_USE_ALTERNATE_PAGE_CLEANING to OFF (db2set) and change CHNGPGS_THRESH to I unset DB2_USE_ALTERNATE_PAGE_CLEANING and the problem was gone.
I am trying to find out what DB2_USE_ALTERNATE_PAGE_CLEANING does and why does it run out of log space.
> 20 (default is 60) in the database cfg, which will also speed up dirty page > cleaning. Changed pages threshold (CHNGPGS_THRESH) = 60
I changed it to 20.
> What size are your bufferpools? (Select * from syscat.bufferpools). BPNAME NPAGES PAGESIZE -------------- ----------- ----------- IBMDEFAULTBP 1000 4096 PLANFILEPOOL 16384 32768 TBDSP1POOL 3276 32768 TBDSP2POOL 3276 32768 TBDSP3POOL 3276 32768 TBDSP4POOL 3276 32768 TBDSP5POOL 3276 32768 CFG32KPOOL 200 32768 TBDSPOLPOOL 3276 32768 TBDSPBTHPOOL 3276 32768
One of the table Mentioned above) that was getting -964 error is using TBDSP5POOL bufferpool.
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
Mark A - 10 Sep 2006 22:35 GMT >>> I saw following message several times in my db2diag.log file. I have 4 >>> CPU [quoted text clipped - 46 lines] > One of the table Mentioned above) that was getting -964 error is using > TBDSP5POOL bufferpool. Basically DB2_USE_ALTERNATE_PAGE_CLEANING starts page cleaning (writing dirty "updated" pages to disk) sooner than the default page cleaning algorithm. Lowering CHNGPGS_THRESH has a similar effect, although not exactly the same formula is used. But setting CHNGPGS_THRESH to 20% should be fine (DB2 will start cleaning pages to disk when 20% of the pages in a bufferpool are dirty).
You have way too many bufferpools, and unless you have a data warehouse, or the row size is too large, you should be using 4K pages for most tablespaces and bufferpools.
Assuming that it will be too difficult to move the tables to new tablespaces, at the very least you need to consolidate the existing 32K bufferpools. If you have an OLTP application you would be better off with one large bufferpool than what you have now.
It is possible that 2 (or 3 at the very most) bufferpools would be optimum, but based on what has been so far I guarantee that you will not be able to figure out the optimum 2-3 bufferpool configuration, so just create 1 large 32K bufferpool of size 39516 pages (the sum of the existing 32K bufferpools). This is a very easy alter bufferpool alter tablespace operation that you can do in a few minutes. Restart DB2 when you are finished.
Assuming that SYSCATSPACE is using the default bufferpool, then I would increase the size to 5000 pages.
Hemant Shah - 11 Sep 2006 16:44 GMT >>>> I saw following message several times in my db2diag.log file. I have 4 >>>> CPU [quoted text clipped - 62 lines] > bufferpools. If you have an OLTP application you would be better off with > one large bufferpool than what you have now. This is an OLTP environment, but I have tables in different tablespaces so that they can be spread across different disks for performance reasons.
Can multiple tablespaces share same bufferpool?
> It is possible that 2 (or 3 at the very most) bufferpools would be optimum, > but based on what has been so far I guarantee that you will not be able to [quoted text clipped - 6 lines] > Assuming that SYSCATSPACE is using the default bufferpool, then I would > increase the size to 5000 pages.
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
Mark A - 11 Sep 2006 18:02 GMT > This is an OLTP environment, but I have tables in different tablespaces > so > that they can be spread across different disks for performance reasons. > > Can multiple tablespaces share same bufferpool? Yes, multiple tablespaces can share same bufferpool, if the page sizes match.
If you have an OLTP system, put them all in one large 32K bufferpool that is the sum of the all the smaller 32K bufferpools.
For OLTP, it would be preferable if the pages size was 4K (unless the row will not fit in 4K), but I realize that it would be a little bit of work to change it.
Hemant Shah - 12 Sep 2006 15:02 GMT >> This is an OLTP environment, but I have tables in different tablespaces >> so [quoted text clipped - 11 lines] > will not fit in 4K), but I realize that it would be a little bit of work to > change it. Mark,
Thanks for all your help, I will tyr to make the recommended changes, but I am still not clear about the 2 initial problems:
1) Why do I get -964 when DB2_USE_ALTERNATE_PAGE_CLEANING is set to YES.
2) If my last commit was at 30000th row how did the extra rows get commited?
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
Mark A - 13 Sep 2006 06:19 GMT > Mark, > > Thanks for all your help, I will tyr to make the recommended changes, but > I am still not clear about the 2 initial problems: > > 1) Why do I get -964 when DB2_USE_ALTERNATE_PAGE_CLEANING is set to YES. I don't know for sure, but it probably has something to do with having a small number of 32K pages in each 32K bufferpool. You probably have some indexes and small tables where the etire object fits in one 4K page, and you are wasting a lot of resources.
> 2) If my last commit was at 30000th row how did the extra rows get > commited? I don't remember the exact scenario that you are describing, nor do I wish to revisit that issue. When I see someone using a large number of very poor configuration parameters in DB2, then I think that should be addressed first, and then if everything works after fixing them, I don't worry about exactly whey it did not work previously.
memmerto@yahoo.com - 18 Sep 2006 20:21 GMT Hemant,
> Thanks for all your help, I will tyr to make the recommended changes, but > I am still not clear about the 2 initial problems: > > 1) Why do I get -964 when DB2_USE_ALTERNATE_PAGE_CLEANING is set to YES. Using DB2_USE_ALTERNATE_PAGE_CLEANING (APC) changes how dirty pages are written to disk. Generally, APC is more intelligent about which pages it writes to disk, and is more aggresive in it's actions. However, it is more sensitive to the configuration of the system, and can impact logging in certain cases.
Generally speaking, the page cleaners write dirty (modified) pages to disk after the associated transaction has committed. If you have a lot of pages being modified (such as when you are inserting many new rows into a table), this consumes a lot of log space and increases the amount of pages that need to be written to disk by the cleaners.
With circular logging enabled, once the transaction log fills up, DB2 starts over with the first log file. However, if there are dirty pages in bufferpool that are associated with the transaction from the first log file, DB2 cannot re-use the log file and you will get a -964.
Because you are modifying a lot of pages, and you are committing infrequently (COMMITCOUNT 5000, although DB2 may be committing internally more frequently), the page cleaners are unable to write out the modified pages quick enough. This means that when DB2 attempts to re-use the first log file, it cannot because there are dirty pages associated with that log file and you get a -964 error.
To rectify this, you can do one of the following: 1) Disable APC and use "normal" page cleaners 2) Increase LOGPRIMARY and/or LOGFILSIZ, which will increase the amount of transaction log available to your application 3) Decrease SOFTMAX, which will make the page cleaners write pages to disk more quickly
> 2) If my last commit was at 30000th row how did the extra rows get commited? Take note that the explicit commits done by the utilities (as specified by COMMITCOUNT) are not the only commits that will be done. DB2 will issue internal commits in certain situations.
This is why you get 32,000 rows in the table instead of the 30,000 that you expect. Note that when DB2 commits internally a message is not displayed -- the messages you see are from the utilities doing the explicit commits every 5000 rows.
-- Matt Emmerton
Hemant Shah - 19 Sep 2006 14:52 GMT > Hemant, Matt,
Thanks for the detailed explaination.
>> 2) If my last commit was at 30000th row how did the extra rows get commited? > > Take note that the explicit commits done by the utilities (as specified > by COMMITCOUNT) are not the only commits that will be done. DB2 will > issue internal commits in certain situations. This does not make sense. Why would DB2 commit the data without explicit commit, what happens when a process encounters error and does a roll back (which my application did when it received -964)?
In my case I end up with extra rows that I did not expect. In this case it did not make difference because I can start loading the table again, but it could be disasterous in OLTP environment.
I have never encountered this problem with DB2 before, I have been working with DB2 UDB since V2.
> This is why you get 32,000 rows in the table instead of the 30,000 that > you expect. Note that when DB2 commits internally a message is not [quoted text clipped - 3 lines] > -- > Matt Emmerton
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
memmerto@yahoo.com - 20 Sep 2006 06:01 GMT > >> 2) If my last commit was at 30000th row how did the extra rows get commited? > > [quoted text clipped - 9 lines] > it did not make difference because I can start loading the table again, > but it could be disasterous in OLTP environment. Please show me the full LOAD command that you are using, and I can explain more.
-- Matt Emmerton
Hemant Shah - 20 Sep 2006 19:08 GMT >> >> 2) If my last commit was at 30000th row how did the extra rows get commited? >> > [quoted text clipped - 12 lines] > Please show me the full LOAD command that you are using, and I can > explain more. I am not using LOAD command. My application loops through a file and reads data in proprietory format, decodes it and runs INSERT command.
It executes COMMIT every 5000th row. If it encounters error, it does ROLLBACK and exits. Last commit was on 30000th row, and it gets -964 after that, application does ROLLBACK and exits. I should not have more than 30000 rows in the table.
Again, this only happens if DB2_USE_ALTERNATE_PAGE_CLEANING is set to YES.
> -- > Matt Emmerton
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
memmerto@yahoo.com - 27 Sep 2006 16:12 GMT > >> >> 2) If my last commit was at 30000th row how did the extra rows get commited? > >> > [quoted text clipped - 23 lines] > > Again, this only happens if DB2_USE_ALTERNATE_PAGE_CLEANING is set to YES. What type of interface does your application use to talk to DB2? CLI? Static C? JDBC?
-- Matt Emmerton
Hemant Shah - 27 Sep 2006 18:31 GMT >> >> >> 2) If my last commit was at 30000th row how did the extra rows get commited? >> >> > [quoted text clipped - 26 lines] > What type of interface does your application use to talk to DB2? CLI? > Static C? JDBC? Static C (EXEC SQL).
> -- > Matt Emmerton
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
Bernard Dhooghe - 14 Sep 2006 09:59 GMT In the DB2 online support (http://www-306.ibm.com/software/data/db2/udb/support/) searching on DB2_USE_ALTERNATE_PAGE_CLEANING gives a number of APAR's, one is:
" IY58576: ADM1822W messages in the db2diag.log when using DB2_USE_ALTERNATE_PAGE_CLEANING
Problem has been fixed in V8.2 FP8 (s041221) "
But FP10 looks to be installed.
If the FP is indeed on this machone, seems the problem is still not cured in all cases.
Bernard Dhooghe
> Folks, > [quoted text clipped - 67 lines] > I haven't lost my mind, Above opinions are mine only. > it's backed up on tape somewhere. Others can have their own. Hemant Shah - 14 Sep 2006 18:19 GMT Yes, FP10 is installed on the system but instfix says that APAR is not installed:
# instfix -i -k IY58576 There was no data for IY58576 in the fix database.
> In the DB2 online support > (http://www-306.ibm.com/software/data/db2/udb/support/) searching on [quoted text clipped - 85 lines] >> I haven't lost my mind, Above opinions are mine only. >> it's backed up on tape somewhere. Others can have their own.
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
Mark A - 14 Sep 2006 19:02 GMT > Yes, FP10 is installed on the system but instfix says that APAR is not > installed: > > # instfix -i -k IY58576 > There was no data for IY58576 in the fix database. As I previously mentioned, if you set CHNGPGS_THRESH to 20% or below, that will do approximately the same thing (speed up page cleaning) as using DB2_USE_ALTERNATE_PAGE_CLEANING.
When setting CHNGPGS_THRESH to 20% that means is that DB2 will start cleaning pages (writing them to disk) as soon as 20% of them are dirty (updated), instead of waiting until 60% of them are dirty (60% is the default).
So forget about DB2_USE_ALTERNATE_PAGE_CLEANING and any bugs it may have.
Hemant Shah - 15 Sep 2006 15:57 GMT >> Yes, FP10 is installed on the system but instfix says that APAR is not >> installed: [quoted text clipped - 12 lines] > > So forget about DB2_USE_ALTERNATE_PAGE_CLEANING and any bugs it may have. I have change CHNGPGS_THRESH to 20%, but if there is a bug in DB2_USE_ALTERNATE_PAGE_CLEANING then I was thinking of opening PMR to get the problem fixed.
Thanks for all you help on this.
 Signature Hemant Shah /"\ ASCII ribbon campaign E-mail: NoJunkMailshah@xnet.com \ / --------------------- X against HTML mail TO REPLY, REMOVE NoJunkMail / \ and postings FROM MY E-MAIL ADDRESS. -----------------[DO NOT SEND UNSOLICITED BULK E-MAIL]------------------ I haven't lost my mind, Above opinions are mine only. it's backed up on tape somewhere. Others can have their own.
|
|
|