Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / DB2 Topics / June 2007

Tip: Looking for answers? Try searching our database.

problem with db2 commit on aix (lock and cuncurrency)

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
m.gelosa@gmail.com - 01 Jun 2007 17:24 GMT
Dear all,

I got a problem on db2 for aix running a high workload messaging
system with more than 5,000,000 of deliveries per day. During high
peak hours it happens frequently that the application,
the application threads, are all waiting for a commit to complete.
This seems a typical case of a
missing or locked shared resource. The lock anyway is not at row or
table level during an SQL operation, but it is at commit time. In
fact, different applications at the same time are affected by
this problem. The db2 is v8.1 fixpak 14 and aix is 5.3 fixpak 05 with
5 processor (power5+ 1.6Ghz)
and 64 GB of memory. While the commit hang the db2 machine has no I/O
wait (no disk usage),
plenty of memory and processing power and plenty of network resources
(that's why I suspect
a lock).

To complete the scenario, the databases (same hw, same sw) are two in
continuos replication. The apply programs are running at the target
database. MIN_COMMIT is 5

The question is which monitoring parameter should I check to
understand what happen to
the db. Is there any operating-system resource that I can check?

As last thing, why the rootvg disks are working a lot? iostat says
that they write 8/10 MB per
second... I suspect something related to authentication due to
continuos connect/disconnect of replication, but I don't understand
what it matters with commit.

Thanks in advance
Matteo Gelosa
Lew - 01 Jun 2007 17:44 GMT
I'm not sure I understand all that you are saying but I would get a
snapshot for locks  and also look in the diag log to see if there are
lock escalations.  Even though you have plenty of memory on the box if
you misconfigure your locklist and maxlocks parms you may still
encounter lock escalations.  As far as the high i/o rates to rootvg ,
typically I see that when a system is paging since most systems have
their paging space in rootvg.  It could be other obscure things like
maybe someone put database temp on rootvg or something like that.

On Jun 1, 12:24 pm, m.gel...@gmail.com wrote:
> Dear all,
>
[quoted text clipped - 30 lines]
> Thanks in advance
> Matteo Gelosa
m.gelosa@gmail.com - 04 Jun 2007 15:16 GMT
Well, the db2diag.log is empty and doesn't report any kind of error.
The suspected lock doesn't seem to be at row or table level. What I'm
sure
of is that the client applications are executing the commit (the CLI
api) and they
are waiting for an answer from the database server.

About paging, the machine is not paging. I tried with filemon but I
couldn't find
any relevant info. Anyway, thanks for your answer. I will try with
snapshot.
About the snapshots I was wondering if there is any COMMIT related
parameter
I can explore in order to solve my load problem.

Tnx
Matteo

> I'm not sure I understand all that you are saying but I would get a
> snapshot for locks  and also look in the diag log to see if there are
[quoted text clipped - 6 lines]
>
> On Jun 1, 12:24 pm, m.gel...@gmail.com wrote:
m.gelosa@gmail.com - 04 Jun 2007 15:45 GMT
To add something: the application status with db2 list application
says
"Commit Active" for around the 400 transactions blocked. The I/O load
of the
log related disks (with iostat) is none... That's why I suspected a
lock
in a different place than rows/tables.

Bye,
Matteo

> I'm not sure I understand all that you are saying but I would get a
> snapshot for locks  and also look in the diag log to see if there are
[quoted text clipped - 4 lines]
> their paging space in rootvg.  It could be other obscure things like
> maybe someone put database temp on rootvg or something like that.
stlhd93@gmail.com - 01 Jun 2007 18:10 GMT
On Jun 1, 1:24 pm, m.gel...@gmail.com wrote:
> Dear all,
>
[quoted text clipped - 30 lines]
> Thanks in advance
> Matteo Gelosa

Are you by chance running SDD version 1.6.1.2 or 1.6.2.0 on this
machine? There is  a bug in these versions that caused the exact same
symptoms on our db2 instance. Our solution was to back level to
1.6.0.8 but i believe 1.6.2.1 has since been released to fix this
issue.

Paul
m.gelosa@gmail.com - 04 Jun 2007 15:18 GMT
On 1 Giu, 19:10, stlh...@gmail.com wrote:
> On Jun 1, 1:24 pm, m.gel...@gmail.com wrote:
>
[quoted text clipped - 42 lines]
>
> - Mostra testo tra virgolette -

Unfortunately, we're not using SDD...

Tnx
Matteo
Ian - 04 Jun 2007 18:09 GMT
> To complete the scenario, the databases (same hw, same sw) are two in
> continuos replication. The apply programs are running at the target
> database. MIN_COMMIT is 5

Have you read the documentation on MINCOMMIT?  Setting it to > 1 means
that transactions must wait for either MINCOMMIT other transactions to
issue a commit, OR 1 second has passed.

Even on very busy systems (> 200 trxn/s), MINCOMMIT=1 is often sufficient.

I would suggest tuning MINCOMMIT way down -- perhaps back to 1 to check
performance. *Maybe* increase it to 2, if absolutely necessary.
m.gelosa@gmail.com - 05 Jun 2007 12:11 GMT
I tried, but the behavior is the same. Anyway, I suspect something
related to replication. When the replication is down (no apply program
running) the db is working well.
As just as I turn on the replication, it starts to have the behavior I
mentioned, by blocking 400 tnxs in a Commit Active state. I googled
for "Commit Active" and I found one message of mine of some years ago
related to db2 replication hang in a "Commit Active" state, solved
with a fixpak downgrade. I don't want to say it's a db2 replication
bug, but I suspect something wrong in the configuration of this area.
Besides, the control database for replication resides is the source
database (so the replication control tables reside on it), that has a
much higher frequence of the hang I experience.
As last info I discovered, even the "get snapshot for locks on <db>"
is blocked until the "commit active" finished,
while other snapshots (bufferpools, applications, ...) work.

Tnx,
Matteo

> m.gel...@gmail.com wrote:
>
[quoted text clipped - 10 lines]
> I would suggest tuning MINCOMMIT way down -- perhaps back to 1 to check
> performance. *Maybe* increase it to 2, if absolutely necessary.
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.