I wrote code that parses the db2diag.log to look for errors that would
require us to generate a message to a service center indicating
something is wrong. My problem is trying to figure out what the errors
are. From studying the log, I noticed the codes on the first line after
the date/time are all the same for the same message. For example:
2005-11-17-09.55.31.988098-420 E5537C330 LEVEL: Error (OS)
PID : 21700 TID : 1 PROC : db2pclnr 0
INSTANCE: hdbuser NODE : 000
FUNCTION: DB2 UDB, oper system services, sqloDispatchNBlocks, probe:30
CALLED : OS, -, unspecified_system_function
OSERR : EFAULT (14) "Bad address"
2005-11-17-09.55.31.989606-420 I5868C367 LEVEL: Severe
PID : 21700 TID : 1 PROC : db2pclnr 0
INSTANCE: hdbuser NODE : 000
FUNCTION: DB2 UDB, buffer pool services, sqlbClnrDispatchSomeAIO,
probe:100
MESSAGE : writeStatus =
DATA #1 : Hexdump, 8 bytes
0x2FF21270 : 0000 0000 0000 0000 ........
These entries are from a db2diag.log file on our systems. The first one
has a "code" of C330 and the second a "code" of C367. The C330 and C367
are the same for identical log messages. The only thing that changes is
the date/time and the numbers before the C330 and C367. So, I assumed I
could check for these types of errors. My problem is I can't relate that
number to anything I can find in the documentation. Are these numbers
valid to check? We already have code to check the SQL errors from APIs,
so I am not sure I need to parse for those.
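One thing the sample itself hints at: 5537 + 330 = 5867, which is one short of the 5868 in the next entry. So the trailing number may track the size of the entry rather than identify an error, which would also explain why identical messages carry the same value. That is only an inference from two entries, not something confirmed by documentation. A minimal Python sketch to check it against a longer log, assuming the header layout seen above (the field names `first`, `mid`, and `second` are my own labels):

```python
import re

# Header layout assumed from the two sample entries only:
# <timestamp> <letter><number><letter><number> LEVEL: <text>
HEADER = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+)\s+"
    r"(?P<kind>[A-Z])(?P<first>\d+)(?P<mid>[A-Z])(?P<second>\d+)\s+"
    r"LEVEL:\s*(?P<level>.+?)\s*$"
)


def parse_header(line):
    """Split an entry's first line into its parts, or return None."""
    m = HEADER.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["first"] = int(fields["first"])
    fields["second"] = int(fields["second"])
    return fields


a = parse_header("2005-11-17-09.55.31.988098-420 E5537C330 LEVEL: Error (OS)")
b = parse_header("2005-11-17-09.55.31.989606-420 I5868C367 LEVEL: Severe")

# If the second number were a length, consecutive entries should line up.
gap = b["first"] - (a["first"] + a["second"])
print("gap between entries:", gap)
```

If the gap stays small and constant across a whole log, the number is probably not an error code and is a poor thing to match on.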
I looked at the db2diag tool but it seemed quite cumbersome for what I
needed and is confusing to use in my opinion.
Has anyone done anything like this, or can anyone help identify what the
codes are or offer any suggestions? I am on AIX 5.3.
Eugene F - 26 Jan 2006 19:35 GMT
I would suggest you reconsider using db2diag because, as far as I know,
most of the codes/keywords/messages in the db2diag.log are not
documented and can be changed by IBM at any time, which would break
your parser.
-Eugene
soccertl - 26 Jan 2006 20:22 GMT
If that is the case then I am in trouble no matter what I do since I
need to check for specific errors. If they change them, then my code
has to change to look for the new codes.
I can change to use db2diag so I don't get in trouble with placement of
codes, but that doesn't help me if they actually change. The only
thing I can do in this case is be more generic, which defeats the
purpose. I don't want to send a message for every error or severe level
message in the log. I want to be more precise in what we look at.
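One way to stay precise without matching everything is to key on fields that are visible in each entry, such as LEVEL plus the function name and probe number from the FUNCTION line. A rough Python sketch, assuming the entry layout in the sample above; the `WATCH` list is hypothetical, and these fields could still change between releases just like any other part of the log:

```python
import re

# Entry header as it appears in the sample: timestamp, record ID, LEVEL.
HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}-\S+\s+\S+\s+LEVEL:\s*(?P<level>.+?)\s*$")
# FUNCTION may wrap onto the next line, so it is matched against joined text.
FUNCTION = re.compile(r"FUNCTION:\s*(?P<path>.*?),\s*probe:(?P<probe>\d+)")

# Hypothetical watch list: (level, function name, probe) worth reporting.
WATCH = {("Severe", "sqlbClnrDispatchSomeAIO", 100)}


def split_entries(text):
    """Group log lines into entries, starting a new one at each header."""
    entries = []
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            entries.append({"level": m.group("level"), "body": []})
        elif entries:
            entries[-1]["body"].append(line.strip())
    return entries


def signature(entry):
    """Return (level, function name, probe), or None if FUNCTION is missing."""
    m = FUNCTION.search(" ".join(entry["body"]))
    if m is None:
        return None
    name = m.group("path").split(",")[-1].strip()
    return (entry["level"], name, int(m.group("probe")))


def alerts(text, watch=WATCH):
    """List the signatures in the log text that are on the watch list."""
    sigs = (signature(e) for e in split_entries(text))
    return [s for s in sigs if s in watch]


SAMPLE = """\
2005-11-17-09.55.31.988098-420 E5537C330 LEVEL: Error (OS)
PID : 21700 TID : 1 PROC : db2pclnr 0
INSTANCE: hdbuser NODE : 000
FUNCTION: DB2 UDB, oper system services, sqloDispatchNBlocks, probe:30
CALLED : OS, -, unspecified_system_function
OSERR : EFAULT (14) "Bad address"
2005-11-17-09.55.31.989606-420 I5868C367 LEVEL: Severe
PID : 21700 TID : 1 PROC : db2pclnr 0
INSTANCE: hdbuser NODE : 000
FUNCTION: DB2 UDB, buffer pool services, sqlbClnrDispatchSomeAIO,
probe:100
MESSAGE : writeStatus =
DATA #1 : Hexdump, 8 bytes
0x2FF21270 : 0000 0000 0000 0000 ........
"""

print(alerts(SAMPLE))
```

Keeping the watch list as data means a changed function name or probe number only requires editing the list, not the parsing code.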