Database Forum / DB2 Topics / September 2006
low overhead select to determine current server load
|
|
Thread rating:  |
natG - 22 Aug 2006 23:59 GMT On a warehouse app, our Java clients [can] constantly load/insert rows into the db. I would like to throttle these inserts (1.5 million rows per hr) from the Java app, based on current 'busy state' of the server. It would be nice if I can get the server CPU utilization (and other key factors) from db2 via jdbc. However, it needs to be low cost, since it would likely run every x seconds. System: db2 v 9.1 64bit on Centos 4.3 64 bit.
Thank you. nat
Knut Stolze - 28 Aug 2006 11:19 GMT > On a warehouse app, our Java clients [can] constantly load/insert rows > into the db. I would like to throttle these inserts (1.5 million rows per > hr) from the Java app, based on current 'busy state' of the server. > It would be nice if I can get the server CPU utilization (and other key > factors) from db2 via jdbc. However, it needs to be low cost, since it > would likely run every x seconds. I don't think that DB2 provides an interface to retrieve that kind of information. But you could resort to a user-defined function (UDF). The function can be called using "VALUES <fct>()", which is as light-weight as it can possible get.
 Signature Knut Stolze DB2 Information Integration Development IBM Germany
natG - 28 Aug 2006 17:57 GMT >> On a warehouse app, our Java clients [can] constantly load/insert rows >> into the db. I would like to throttle these inserts (1.5 million rows per [quoted text clipped - 7 lines] > function can be called using "VALUES <fct>()", which is as light-weight as > it can possible get. hmmm... But I have no idea what the function needs to do. Is there a db2 function that reports server load? Or how many inserts per second its doing *now*, or how many tps its doing.
Thanks nat
Knut Stolze - 28 Aug 2006 23:34 GMT >>> On a warehouse app, our Java clients [can] constantly load/insert rows >>> into the db. I would like to throttle these inserts (1.5 million rows [quoted text clipped - 11 lines] > function that reports server load? Or how many inserts per second its > doing *now*, or how many tps its doing. I thought you were interested in the CPU utilization. For the inserts/transactions per second thing, you could employ the DB2 snapshot and event monitor facilities.
If it is the CPU load you're after, we are really talking about a system call. The question is then which OS platform your DB2 server is running on.
 Signature Knut Stolze DB2 Information Integration Development IBM Germany
natG - 29 Aug 2006 17:35 GMT >>>> On a warehouse app, our Java clients [can] constantly load/insert rows >>>> into the db. I would like to throttle these inserts (1.5 million rows [quoted text clipped - 19 lines] > call. The question is then which OS platform your DB2 server is running > on. Indeed, it's the cpu utilization I'm after. (But if that's too tough or costly to get, inserts /ps or tps will do.) The server is a 64bit dual-core Linux-Centos 4.3 running db2 9.1-64bit. Thank you much. nat
Knut Stolze - 29 Aug 2006 23:51 GMT > Indeed, it's the cpu utilization I'm after. (But if that's too tough or > costly to get, inserts /ps or tps will do.) Getting the CPU load would be _much_ easier than the other metrics you mentioned. The CPU load can be retrieved with a system call; the others require some accounting work to be done by DB2.
> The server is a 64bit dual-core Linux-Centos 4.3 running db2 9.1-64bit. So you're on Linux. I'll see if I can figure out how the CPU load can be collected on that platform and get back to you in a couple days...
 Signature Knut Stolze DB2 Information Integration Development IBM Germany
natG - 30 Aug 2006 00:41 GMT >> Indeed, it's the cpu utilization I'm after. (But if that's too tough or >> costly to get, inserts /ps or tps will do.) [quoted text clipped - 7 lines] > So you're on Linux. I'll see if I can figure out how the CPU load can be > collected on that platform and get back to you in a couple days... Thats very nice of you. Thank you much. nat
Ian - 30 Aug 2006 04:00 GMT >> Indeed, it's the cpu utilization I'm after. (But if that's too tough or >> costly to get, inserts /ps or tps will do.) [quoted text clipped - 7 lines] > So you're on Linux. I'll see if I can figure out how the CPU load can be > collected on that platform and get back to you in a couple days... You can just read /proc/loadavg.
[ianbjor@kwaiken ~]$ cat /proc/loadavg 0.18 0.07 0.02 1/133 20942
The first 3 columns are the 1, 5 and 15 minute load averages.
natG - 30 Aug 2006 19:00 GMT >>> Indeed, it's the cpu utilization I'm after. (But if that's too tough or >>> costly to get, inserts /ps or tps will do.) [quoted text clipped - 14 lines] > > The first 3 columns are the 1, 5 and 15 minute load averages. Aye. Thanks for the tip. Three questions.
1. Is there something more fine-grained, closer to the last few seconds or preferably at the exact moment it was called. 2. How do I wrap it in a UDF in a way that I don't need to echo the cat to a temp file? Writing to a temp file every few seconds will in itself cause overhead. 3. Would you know, if in a dual-processor system, the loadavg is a combination of both processors?
Thank you much. -nat
Knut Stolze - 31 Aug 2006 10:40 GMT > 1. Is there something more fine-grained, closer to the last few seconds or > preferably at the exact moment it was called. That is an operating system thing. So if the OS doesn't provide something more fine-grained, we can't do anything about it.
> 2. How do I wrap it in a UDF in a way that I don't need to echo the cat to > a temp file? Writing to a temp file every few seconds will in itself cause > overhead. /proc/loadavg is a file by itself. You don't need a temp file in the first place. Have a look at my other post that shows how you can access the file and get the information you'd like.
> 3. Would you know, if in a dual-processor system, the loadavg is a > combination of both processors? Yes, it is. And given that DB2 will happily use all processors, I think this would be quite suitable for you?
 Signature Knut Stolze DB2 Information Integration Development IBM Germany
natG - 31 Aug 2006 20:42 GMT >> 1. Is there something more fine-grained, closer to the last few seconds or >> preferably at the exact moment it was called. [quoted text clipped - 15 lines] > Yes, it is. And given that DB2 will happily use all processors, I think > this would be quite suitable for you? You bet. Thank you very much! nat
jacques - 31 Aug 2006 22:18 GMT You can try this as well if you want more up to the second information
CREATE FUNCTION cpuload() RETURNS DOUBLE EXTERNAL NAME 'cpuload!cpuload' LANGUAGE C PARAMETER STYLE SQL SPECIFIC cpuload DETERMINISTIC NO EXTERNAL ACTION NOT FENCED NO FINAL CALL RETURNS NULL ON NULL INPUT ALLOW PARALLEL NO SQL ; #include <sqludf.h> #include <stdio.h> #include <sys/time.h>
#define DELAY 100000
void SQL_API_FN cpuload( SQLUDF_DOUBLE *load, SQLUDF_NULLIND *load_ind, SQLUDF_TRAIL_ARGS) { long delay; struct timeval before,after; *load_ind=0; gettimeofday(&before,(struct timezone *) NULL); usleep(DELAY); gettimeofday(&after,(struct timezone *) NULL); delay=after.tv_usec-before.tv_usec; if (delay<DELAY) *load=0.0f; else *load=100.0-(((float)DELAY*100.0)/(float)delay); }
natG a écrit :
> >> 1. Is there something more fine-grained, closer to the last few seconds or > >> preferably at the exact moment it was called. [quoted text clipped - 17 lines] > You bet. Thank you very much! > nat natG - 01 Sep 2006 17:26 GMT > You can try this as well if you want more up to the second information > [quoted text clipped - 35 lines] > *load=100.0-(((float)DELAY*100.0)/(float)delay); > } Thank you much. Although I don't know 'c' I will try to figure out how the function works. I still need to get a handle as how to compile these. I did not yet dig into KS' function for the lack of understanding the basics of compiling c programs. (I am a Java guy and use Eclipse.) And rather then troubling him to detail his instructions, I figured I'll hack at it over the weekend.
Thanks again nat
> natG a écrit : > [quoted text clipped - 19 lines] >> You bet. Thank you very much! >> nat Knut Stolze - 01 Sep 2006 21:21 GMT > I still need to get a handle as how to compile these. That's actually quite simple. Put the code in a file named "load_udf.c". The you copy the file ~/sqllib/samples/c/bldrtn" in the same directory and make it executable and run it with "load_udf" as parameter.
$ cp ~/sqllib/samples/c/bldrtn . $ chmod 750 bldrtn $ ./bldrtn load_udf
This will create the library and install it in the ~/sqllib/function/ directory. Finally, connect to the database and run the CREATE FUNCTION statement that I put into a comment.
> I did not yet dig into KS' function for the lack of understanding the > basics of compiling c programs. (I am a Java guy and use Eclipse.) And > rather then troubling him to detail his instructions, I figured I'll hack > at it over the weekend. You can also do this in Java, of course. Just open the file and read from it. However, this will be a bit more "heavy weight" because of it being Java. And DB2 may have to start a JVM for the function invocation.
 Signature Knut Stolze DB2 Information Integration Development IBM Germany
natG - 03 Sep 2006 21:01 GMT First off, the good news. It works! Now..
>> I still need to get a handle as how to compile these. > [quoted text clipped - 5 lines] > $ chmod 750 bldrtn > $ ./bldrtn load_udf True. BUT I first had to get gcc!
> This will create the library and install it in the ~/sqllib/function/ > directory. Finally, connect to the database and run the CREATE FUNCTION > statement that I put into a comment. Yes. Went ok. (Had to drop the trailing @ sign though.) Now, calling it as a column FROM a target table, select getCPULoad() FROM MYTABLE, works great; except it repeats for each row. How do I call it so it should get called once per SELECT? Thanks.
>> I did not yet dig into KS' function for the lack of understanding the >> basics of compiling c programs. (I am a Java guy and use Eclipse.) And [quoted text clipped - 4 lines] > it. However, this will be a bit more "heavy weight" because of it being > Java. And DB2 may have to start a JVM for the function invocation. Definitely not the type of job for a jvm. Now you've got me aching for c.<g> But does db2 actually load/unload the internal jvm per call? -nat
Knut Stolze - 05 Sep 2006 03:49 GMT >> This will create the library and install it in the ~/sqllib/function/ >> directory. Finally, connect to the database and run the CREATE FUNCTION >> statement that I put into a comment. > > Yes. Went ok. (Had to drop the trailing @ sign though.) The @ is the statement terminator that I'm usually using.
> Now, calling it as a column FROM a target table, select getCPULoad() FROM > MYTABLE, works great; except it repeats for each row. How do I call it so > it should get called once per SELECT? You need a table with just a single row. For that, you can either use the SYSIBM.SYSDUMMY1 view like this:
SELECT getCPULoad() FROM sysibm.sysdummy1
or you use a VALUES statement:
VALUES getCPULoad()
>> You can also do this in Java, of course. Just open the file and read >> from >> it. However, this will be a bit more "heavy weight" because of it being >> Java. And DB2 may have to start a JVM for the function invocation. > Definitely not the type of job for a jvm. If this is the only Java UDF, I agree with you. If you have other UDFs running, things look a bit different.
> But does db2 actually load/unload the internal jvm per call? No. The JVM is started and kept alive. (I don't know for how long.) In any case, the overhead to start the JVM is not attached to each and every call to your function, which means that this impact amortizes over time.
 Signature Knut Stolze DB2 Information Integration Development IBM Germany
natG - 03 Sep 2006 21:20 GMT > You can try this as well if you want more up to the second information > [quoted text clipped - 35 lines] > *load=100.0-(((float)DELAY*100.0)/(float)delay); > } OK. This works great. But, can you please explain what this code is doing? Also, is the delay millis or nanos? Why am I forcing a sleep of x? Is there an overhead in the loop? Can I benefit if DELAY was passed as a parameter? Thank you much, nat
Knut Stolze - 05 Sep 2006 04:01 GMT >> void SQL_API_FN cpuload( >> SQLUDF_DOUBLE *load, [quoted text clipped - 8 lines] >> gettimeofday(&after,(struct timezone *) NULL); >> delay=after.tv_usec-before.tv_usec; This won't work reliably if there is an overflow and "after.tv_usec" is actually smaller than "before.tv_usec". The result would be a load of 0.0 below. Not good.
>> if (delay<DELAY) >> *load=0.0f; [quoted text clipped - 3 lines] > > OK. This works great. But, can you please explain what this code is doing? It sleeps for at least 100 milliseconds. And then measures how much time has really gone by. If there is some load on the system, it may take longer than those 100 millis. And the percentage is calculated on how much longer it took.
The man page of "usleep" says:
The usleep() function suspends execution of the calling process for (at least) usec microseconds. The sleep may be lengthened slightly by any system activity or by the time spent processing the call or by the granularity of system timers.
I guess Jacques has to explain this, but I don't see how this would work reliably. If the 100 millis are expired and the time slice of the UDF continues, the sleep ends and the calculated load may be close to zero - regardless of what is really going on at the system. If you have a non-interuptable process blocking the CPU, the load can become quite high even if nothing or not much is going on. So the UDF would have to wait longer to continue after the sleep. It really depends on the scheduling of the processor, which is pretty much just happenstance. But maybe I'm just missing something...
> Also, is the delay millis or nanos? Microseconds.
> Why am I forcing a sleep of x? Is there an overhead in the loop? There is no loop.
> Can I benefit if DELAY was passed as a parameter? You could adjust the time window in which the load is measured.
 Signature Knut Stolze DB2 Information Integration Development IBM Germany
natG - 05 Sep 2006 18:40 GMT >>> void SQL_API_FN cpuload( >>> SQLUDF_DOUBLE *load, [quoted text clipped - 54 lines] > > You could adjust the time window in which the load is measured. I was just thinking...Originally I thought that db2 has this info on its own, internally. If I am using a c function anyhow, there must be many of these type of functions already out there. Let me google....oh my! I guess I need not trouble you further! You have really taught me alot. Much more than reading /proc/loadavg. I now understand how to take any c function, compile it, and incorporate it into db2! Thank you much. nat
Knut Stolze - 31 Aug 2006 10:32 GMT >>> Indeed, it's the cpu utilization I'm after. (But if that's too tough or >>> costly to get, inserts /ps or tps will do.) [quoted text clipped - 9 lines] > > You can just read /proc/loadavg. Exactly...
So here is the code to do that: -------------------------------------------------------------------- #include <sqludf.h> #include <stdlib.h> #include <stdio.h> #include <string.h> #include <errno.h>
/* * Function to extract the CPU load for the past 1 minute (Linux-specific).
CREATE FUNCTION getCPULoad() RETURNS DOUBLE EXTERNAL NAME 'load_udf!getCPULoad' LANGUAGE C PARAMETER STYLE SQL SPECIFIC getCPULoad DETERMINISTIC NO EXTERNAL ACTION NOT FENCED NO FINAL CALL RETURNS NULL ON NULL INPUT ALLOW PARALLEL@ */ void SQL_API_FN getCPULoad( SQLUDF_DOUBLE *load, SQLUDF_NULLIND *load_ind, SQLUDF_TRAIL_ARGS) { char str[100] = { '\0' }; char *endPtr = NULL; FILE *f = NULL;
/* initialize output parameters */ *load = 0.0; *load_ind = -1;
/* open file and read its contents */ f = fopen("/proc/loadavg", "r"); if (!f) { memcpy(SQLUDF_STATE, "38L01", SQLUDF_SQLSTATE_LEN); snprintf(SQLUDF_MSGTX, SQLUDF_MSGTEXT_LEN, "Could not open file '/proc/loadavg"); SQLUDF_MSGTX[SQLUDF_MSGTEXT_LEN-1] = '\0'; goto exit; } fread(str, 1, sizeof str, f); if (ferror(f)) { memcpy(SQLUDF_STATE, "38L02", SQLUDF_SQLSTATE_LEN); snprintf(SQLUDF_MSGTX, SQLUDF_MSGTEXT_LEN, "Failure reading file '/proc/loadavg"); SQLUDF_MSGTX[SQLUDF_MSGTEXT_LEN-1] = '\0'; goto exit; }
/* extract the CPU load information */ *load = strtod(str, &endPtr); if (endPtr == str || errno == ERANGE) { memcpy(SQLUDF_STATE, "38L03", SQLUDF_SQLSTATE_LEN); snprintf(SQLUDF_MSGTX, SQLUDF_MSGTEXT_LEN, "Failure extracting CPU load from file '/proc/loadavg"); SQLUDF_MSGTX[SQLUDF_MSGTEXT_LEN-1] = '\0'; goto exit; } *load_ind = 0;
exit: if (f != NULL) { fclose(f); } } --------------------------------------------------------------------
This can be compiled with the "bldrtn" script shipped with the DB2 samples and then the above CREATE FUNCTION statement registers the UDF in the database. Then you just have to call:
VALUES getCPULoad()
and process the output like any other result set.
 Signature Knut Stolze DB2 Information Integration Development IBM Germany
|
|
|