
Database Forum / DB2 Topics / April 2007


IBM 595 dual core CPU DB partitioning question

Woody Ling - 30 Mar 2007 15:32 GMT
I am starting to configure 64-bit DB2 on an IBM 595 AIX box with 2 dual-core
CPUs, and I would like to assign one 'processor' to each db
partition. Should I configure it as a 4-node or a 2-node instance? What
about other settings such as I/O cleaners, default degree, etc.?
james_dey@hotmail.com - 30 Mar 2007 16:12 GMT
> I am starting to configure 64-bit DB2 on an IBM 595 AIX box with 2 dual-core
> CPUs, and I would like to assign one 'processor' to each db
> partition. Should I configure it as a 4-node or a 2-node instance? What
> about other settings such as I/O cleaners, default degree, etc.?

There are no rules of thumb. It depends on the expected volumetrics and the
nature of the database that you're working with.

There's a "configuration advisor" in Control Center, if you get
desperate.
Liam Finnie - 30 Mar 2007 17:56 GMT
On Mar 30, 11:12 am, james_...@hotmail.com wrote:

> > I am starting to configure 64-bit DB2 on an IBM 595 AIX box with 2 dual-core
> > CPUs, and I would like to assign one 'processor' to each db
[quoted text clipped - 6 lines]
> There's a "configuration advisor" in Control Center, if you get
> desperate

Hi Woody,

Any particular reason why you want a partitioned instance here?  Each
database partition will use up some fixed resources (CPU and memory) -
the more partitions you have on a machine, the fewer resources are
available for your actual workload.  As far as machine sizes go, I
probably wouldn't recommend using multiple partitions for a 4-CPU
machine, unless you're planning to scale out and add more machines in
the future.  A single partition can fully utilize all the CPUs on your
machine.

Cheers,
Liam.
Mark A - 30 Mar 2007 18:20 GMT
> Hi Woody,
>
[quoted text clipped - 9 lines]
> Cheers,
> Liam.

Partitioning a table enables query parallelism. If this is a data warehouse
application with a lot of table scans, then it will help performance. If it
is an OLTP application or access is usually returning a small answer set via
index access, then it is probably not a good idea.

The number of CPUs is not a determinant as to whether to partition, but you
generally should have at least one CPU per partition (dual core can count as
2 CPUs). If you have a data warehouse application using a lot of table
scans and the server has one dual core CPU, partitioning the table is
probably a good idea.
Woody Ling - 30 Mar 2007 19:50 GMT
> > Hi Woody,
>
[quoted text clipped - 20 lines]
> scans and the server has one dual core CPU, partitioning the table is
> probably a good idea.

The idea of 1 CPU for 1 partition can be shown by the following
example:

Suppose we have a complex query selecting 10000 rows that takes a total of
580s to return its result (with 315s CPU time and 332s sort time). When we
turn on the intra-parallel option and use 2 CPUs to process the query,
the total time becomes 533s (with 508s CPU time and 493s sort time,
summed over the 2 CPUs). Although both CPUs are fully utilized with
intra-partition parallelism and the elapsed time is shorter, the efficiency
is not as good as with 1 CPU per partition.

Assuming a 30% overhead for using multiple partitions, the
elapsed time becomes (580 / 2) * 1.3 = 377s, because each partition
handles only 5000 rows.
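
To make the arithmetic in this estimate easy to check, here is a minimal Python sketch. It assumes, as the post does, a perfectly even split of the work across partitions plus a flat 30% overhead; that overhead is an assumption, not a measured DB2 figure, and the function name is made up for illustration.

```python
# Figures taken from the post; the 30% overhead is an assumption, not a measurement.
SERIAL_ELAPSED_S = 580.0      # 1 CPU, no DPF, intra-parallel OFF
ASSUMED_DPF_OVERHEAD = 0.30   # assumed flat overhead for multiple partitions


def estimated_dpf_elapsed(serial_elapsed_s, partitions, overhead):
    """Split the serial time evenly across partitions, then add a flat overhead."""
    return serial_elapsed_s / partitions * (1.0 + overhead)


estimate = estimated_dpf_elapsed(SERIAL_ELAPSED_S, 2, ASSUMED_DPF_OVERHEAD)
print(f"estimated elapsed time with 2 partitions: {estimate:.0f}s")
```

Changing ASSUMED_DPF_OVERHEAD shows how sensitive the conclusion is to that single guessed number.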

Do you agree with this example?
Mark A - 30 Mar 2007 22:14 GMT
> The idea of 1 CPU for 1 partition can be shown by the following
> example:
[quoted text clipped - 12 lines]
>
> Do you agree with this example?

I don't understand what you are saying.

Just because you turn on intra-partition parallelism (which does not need DPF)
does not mean that the query is actually running in parallel mode. And
intra-partition parallelism is significantly different from inter-partition
parallelism (DPF), where a single table is physically partitioned (within a
single server or across servers). With a table split across two DPF partitions
and a query that does a table scan, the total elapsed time will be reasonably
close to 50% of that for a non-partitioned table (unless it is a very fast query).

With DPF, parallel overhead is not anywhere near 30%, although the overhead
can be noticed if all the partitions are on the same physical node and you
don't have enough bandwidth on your disk sub-system to run both partitions
at full speed on a table scan. Preferably, each partition should have its
own disk controller, and each partition must have its own disks to scale in
a linear (or near linear) fashion on a large table scan.
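
The disk point can be illustrated with a toy model. This is only a sketch with made-up throughput numbers, not a DB2 sizing formula: it assumes a table scan is limited either by per-partition CPU work or by the aggregate bandwidth of the independent disk sets, whichever is slower.

```python
def scan_elapsed_s(table_gb, partitions, cpu_gb_per_s, disk_gb_per_s, disk_sets):
    """Toy estimate of table-scan time for an evenly partitioned table.

    CPU work is split across partitions (one CPU each). Disk throughput is
    capped by how many independent disk sets actually serve the partitions.
    All rates are hypothetical illustration values.
    """
    cpu_time = (table_gb / partitions) / cpu_gb_per_s
    disk_time = table_gb / (disk_gb_per_s * min(disk_sets, partitions))
    return max(cpu_time, disk_time)


# Hypothetical 100 GB table, 0.5 GB/s per CPU, 0.25 GB/s per disk set.
one_partition = scan_elapsed_s(100, 1, 0.5, 0.25, disk_sets=1)
two_shared_disks = scan_elapsed_s(100, 2, 0.5, 0.25, disk_sets=1)
two_own_disks = scan_elapsed_s(100, 2, 0.5, 0.25, disk_sets=2)

print(one_partition, two_shared_disks, two_own_disks)
```

In this model the second partition gains nothing while both partitions share one disk set, and the scan time halves only once each partition has its own disks.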
Woody Ling - 31 Mar 2007 05:51 GMT
> > The idea of 1 CPU for 1 partition can be shown by the following
> > example:
[quoted text clipped - 31 lines]

Yes, I agree with you. The elapsed times for handling the
complex query are:

a. 1 CPU without DPF and intra-parallel OFF: 580s (with 315s CPU
time and 332s sort time).   {figure from IBM}
b. 2 CPUs without DPF and intra-parallel ON: 533s (with 508s CPU
time and 493s sort time, summed over the 2 CPUs).  {figure from IBM}
c. 2 CPUs with DPF and intra-parallel OFF: 377s (each partition
handles 50% of the workload, assuming 30% overhead for cross joins,
combining results, etc.).

The 30% overhead for DPF is an assumption. I just want to show that
even if there is overhead for using DPF, it is still much better than
using intra-parallel with the same number of CPUs.

Suppose we have 4 CPUs. Which configuration is better?
a. 2 CPUs per partition, for a total of 2 db partitions, with
intra-parallel ON
b. 1 CPU per partition, for a total of 4 db partitions, with
intra-parallel OFF
Mark A - 31 Mar 2007 14:28 GMT
> Yes, I agree with you. The elapsed times for handling the
> complex query are:
[quoted text clipped - 16 lines]
> b. 1 CPU per partition, for a total of 4 db partitions, with
> intra-parallel OFF

You are still confused. Let me repeat:

Just because you turn on intra-partition parallelism (without DPF) does not
mean that the query is actually running in parallel mode.
Woody Ling - 31 Mar 2007 18:45 GMT
> > Yes, I agree with you. The elapsed times for handling the
> > complex query are:
[quoted text clipped - 23 lines]

I totally agree with you again.

The example already shows that the overhead of using intra-partition
parallelism is very large. If intra-parallel really made the query
run in parallel mode, the elapsed time should be much shorter (maybe
1/2 of the original). In fact, it is not (533s vs 580s). So if I have 2
CPUs, I will create 2 db partitions, assign 1 CPU to each partition,
and turn off intra-parallel to fully utilize all CPU resources.

Please correct me if I have the wrong idea.
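
For reference, the speedup and per-CPU efficiency implied by these figures can be worked out with a short Python sketch. The 580s and 533s values are the ones quoted in the thread; the 377s DPF value is only the estimate built on the assumed 30% overhead, so its row is hypothetical.

```python
SERIAL_ELAPSED_S = 580.0          # 1 CPU, no DPF, intra-parallel OFF
INTRA_PARALLEL_ELAPSED_S = 533.0  # 2 CPUs, no DPF, intra-parallel ON
DPF_ESTIMATE_S = 377.0            # 2 CPUs with DPF: an estimate, not a measurement
CPUS = 2


def speedup(serial_s, parallel_s):
    """How many times faster the parallel run is than the serial run."""
    return serial_s / parallel_s


def efficiency(serial_s, parallel_s, cpus):
    """Speedup divided by the number of CPUs used (1.0 would be ideal)."""
    return speedup(serial_s, parallel_s) / cpus


for label, elapsed in [("intra-parallel ON", INTRA_PARALLEL_ELAPSED_S),
                       ("DPF (estimated)", DPF_ESTIMATE_S)]:
    s = speedup(SERIAL_ELAPSED_S, elapsed)
    e = efficiency(SERIAL_ELAPSED_S, elapsed, CPUS)
    print(f"{label}: speedup {s:.2f}x, efficiency {e:.0%}")
```

The intra-parallel run comes out at roughly 1.09x, which is consistent with the point made earlier in the thread that the query may not actually have run in parallel.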
Ian - 02 Apr 2007 08:00 GMT
> Yes, I agree with you. The elapsed times for handling the
> complex query are:
[quoted text clipped - 10 lines]
> even if there is overhead for using DPF, it is still much better than
> using intra-parallel with the same number of CPUs.

Making an assumption of 30% overhead based on a single result is
more than a little misleading.

> Suppose we have 4 CPUs. Which configuration is better?
> a. 2 CPUs per partition, for a total of 2 db partitions, with
> intra-parallel ON
> b. 1 CPU per partition, for a total of 4 db partitions, with
> intra-parallel OFF

The decision of whether to use 1 CPU / partition or 2 CPUs / partition is
totally dependent on your workload.  Certain workloads will benefit from
having more CPU resources available per partition.  Many don't, and
benefit more from having more (smaller) partitions.  Best practices
today start with 1 CPU / partition.
Knut Stolze - 02 Apr 2007 10:57 GMT
>> Suppose we have 4 CPUs. Which configuration is better?
>> a. 2 CPUs per partition, for a total of 2 db partitions, with
[quoted text clipped - 7 lines]
> benefit more from having more (smaller) partitions.  Best practices
> today start with 1 CPU / partition.

Actually, there is a misconception in that one could assign a CPU to a
specific logical partition in DPF.  Instead, all logical partitions on the
same machine share all CPUs and compete for CPU resources.

Knut Stolze
DB2 z/OS Utilities Development
IBM Germany

Liam Finnie - 02 Apr 2007 13:19 GMT
> >> Suppose we have 4 CPUs. Which configuration is better?
> >> a. 2 CPUs per partition, for a total of 2 db partitions, with
[quoted text clipped - 16 lines]
> DB2 z/OS Utilities Development
> IBM Germany

If you're on DB2 9, you can actually "bind" a partition to a resource
set, by specifying the resource set name in your db2nodes.cfg file.
On AIX, a resource set is composed of one or more processors, along
with some amount of RAM.  You can configure your own resource sets as
needed, or you can use any predefined resource sets.  This is
particularly useful for NUMA machines (not sure if 595's are NUMA-like
or not offhand), but you should also be able to use this on non-NUMA
machines to enable CPU binding.  This will probably require a bit of
fiddling around to get right though - I think there are some kernel
tuneables that need to be configured, and some privileges you need to
grant the instance owning ID - search for "DB2 node configuration
file" in the online books for more details.
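
As a rough illustration of what such entries might look like, the Python sketch below builds db2nodes.cfg lines with a resource set name in the last column. It assumes the five-column layout (partition number, hostname, logical port, netname, resource set name) described for Linux and UNIX in the DB2 documentation; the hostname and the resource set names are hypothetical, so check the "DB2 node configuration file" topic for your release before using anything like this.

```python
def db2nodes_lines(hostname, resource_sets, netname=None):
    """Build one db2nodes.cfg line per logical partition on a single host.

    Assumed column layout: partition number, hostname, logical port,
    netname, resource set name. Logical ports count up from 0 on one host.
    """
    netname = netname or hostname  # assumption: reuse the hostname as netname
    return [
        f"{i} {hostname} {i} {netname} {rset}"
        for i, rset in enumerate(resource_sets)
    ]


# Hypothetical host and resource set names, for illustration only.
lines = db2nodes_lines("p595host", ["db2/part0", "db2/part1"])
print("\n".join(lines))
```

The resource sets themselves would still have to be defined on the AIX side first; this only shows where the name goes in the file.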

Cheers,
Liam.
Woody Ling - 02 Apr 2007 18:47 GMT
> > >> Suppose we have 4 CPUs. Which configuration is better?
> > >> a. 2 CPUs per partition, for a total of 2 db partitions, with
[quoted text clipped - 34 lines]

Thanks. I found the DB2 8.2 manual page on how to define a resource set
on AIX. The URL is:

http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/start/r0006351.htm


But I have no idea whether the agent and subagent processes are also
"bound" by this setting. Maybe they are shared by all CPUs. On the
other hand, I am afraid I cannot assign one logical CPU of a dual-core
processor to one node using a resource set, which is designed for
physical resources only.

I also agree that we cannot draw conclusions based on one simple
example. Although I can make use of DPF, I should also increase memory
and I/O to keep scaling linear.

However, I have another question. Should I turn on intra-parallel if I
assign 2 CPUs to 1 node?
Liam Finnie - 03 Apr 2007 13:55 GMT
> > > >> Suppose we have 4 CPUs. Which configuration is better?
> > > >> a. 2 CPUs per partition, for a total of 2 db partitions, with
[quoted text clipped - 52 lines]
> However, I have another question. Should I turn on intra-parallel if I
> assign 2 CPUs to 1 node?

Hi Woody,

If you have 2 dual-core chips, then you have 4 physical CPUs, and you
should be able to assign single CPUs to resource sets.  If these 4
CPUs have SMT enabled (8 CPU "threads"), then you can't assign each
CPU thread independently.

The way the binding works is that the initial system controller
process is bound to the resource set, and each process that is forked
from that system controller will inherit the same binding, so will use
the same CPU(s).
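
The inheritance behaviour can be seen with a loose analogy on Linux. The Python sketch below is not AIX resource sets and not DB2: it uses the Linux-only os.sched_setaffinity call to pin the current process to one CPU and then checks that a child process started from it reports the same CPU set. The function name is made up for illustration.

```python
import json
import os
import subprocess
import sys


def child_inherits_affinity():
    """Pin this process to one CPU, start a child, and return both CPU sets.

    Returns (parent_cpus, child_cpus), or None where the Linux-only
    affinity calls are unavailable. The original affinity is restored.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    original = os.sched_getaffinity(0)
    try:
        os.sched_setaffinity(0, {min(original)})
        parent_cpus = sorted(os.sched_getaffinity(0))
        child_code = ("import json, os; "
                      "print(json.dumps(sorted(os.sched_getaffinity(0))))")
        result = subprocess.run([sys.executable, "-c", child_code],
                                capture_output=True, text=True, check=True)
        return parent_cpus, json.loads(result.stdout)
    finally:
        os.sched_setaffinity(0, original)


print(child_inherits_affinity())
```

On Linux the two lists match, which is the same idea as the processes forked from a bound system controller staying on that controller's CPU(s).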

I'll leave that intra-parallel question for someone else :-)  My naive
take would be that if you have multiple concurrent applications,
enabling intra-parallel matters less than if you have only
a single application running at a time.

Cheers,
Liam.
Ian - 06 Apr 2007 17:12 GMT
> However, I have another question. Should I turn on intra-parallel if I
> assign 2 CPUs to 1 node?

There is no rule for this decision.  It is similar to the decision on
whether you go with a ratio of 1 or 2 CPUs per database partition.  And
as I said earlier, this is very dependent on your workload.

Fortunately, turning INTRA_PARALLEL on or off is a lot easier than
changing the number of partitions in your database, so it's easier to
evaluate. ;-)
Ian - 06 Apr 2007 17:07 GMT
> Actually, there is a misconception in that one could assign a CPU to a
> specific logical partition in DPF.  Instead, all logical partitions on the
> same machine share all CPUs and compete for CPU resources.

Sorry if I wasn't clear.  I was just talking about the ratio of CPU /
database partitions, not trying to imply that a CPU is dedicated.
 