Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / DB2 Topics / April 2006

Tip: Looking for answers? Try searching our database.

SUM(double_value) is not deterministic?

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Artur - 25 Apr 2006 22:26 GMT
Hi,

I'm using predefined SUM(() function on DOUBLE column. Surpisingly the
sum can give different results depending on the order of values in my
table.

Is that correct?

I can explain it, as adding two double values can entail rouding the
result, so the order of adding influence the final result.

Actually I cannot change my select statement to cast double column to
decimal, or to change the type in the database. I'm looking for a new
SUM(double) aggregate function, which internaly will be adding values
using decimal, f.e. decimal(31,15) and which finally return casted
double value.

Here is a code to replicate the issue:
---------------------------------------------------------------
create table table1 (value double)
@
create procedure load()
begin
 declare c int default 0;
 while (c < 100000)
 do

   set c = c + 1;
   insert into table1 values ( cast(rand() as double) );

  end while;
end
@
call load()
@
create table table2 like table1
@
insert into table2 select * from table1 order by value
@
select sum(value) from table1
@
select sum(value) from table2
@
select double(sum(cast(value as decimal(31,15)))) from table1
@
results:
+4,99359445173494E+004
+4,99359445173499E+004

The correct one in my case is:
+4,99359445173498E+004

Regards,
Artur Wronski
Serge Rielau - 26 Apr 2006 01:20 GMT
> Hi,
>
[quoted text clipped - 47 lines]
> The correct one in my case is:
>  +4,99359445173498E+004
I'm no expert in the matter, but I can't say that I'm terribly surprised.

Cheers
Serge

Signature

Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Tonkuma - 26 Apr 2006 02:28 GMT
First, It is well known that the result of sum of floating point
numbers would be different by sequence of summation of numbers, in
mathematical area.
To get more accurate result, usually summed from small numbers to large
numbers.

That might be the reason of the different results of SUM on table1 and
table2.

But, I don't know this means SUM is not deterministic (It might be
depending on DB2 specification(definition of deterministic)).

Second, You tried
select double(sum(cast(value as decimal(31,15)))) from table1
I thought that an error(drop some lowest digits) may be introduced at
the time of converting from float to decimal.
So, this way may not usefull to get correct result.

Last, my question is how do you get correct
one(+4,99359445173498E+004)?
Artur - 26 Apr 2006 08:28 GMT
The problem  is that having two different sessions, one from CLP, one
from CLI application I get different results of the sum running the
same query on the same static table.

Let's say that I have only 3 values, and in my case let's assume the
precision is only 3 digits:
1.14
2.23
7.99

I can do the sum in two ways, that give different final result:

1.14 + 2.23 -> 3.37 + 7.99 -> 11.4
2.23 + 7.99 -> 10.2 + 1.14 ->11.3

And in my case the correct sum is:
1.14 + 2.23 -> 3.37 + 7.99 -> 11.36 -> 11.4
2.23 + 7.99 -> 10.22 + 1.14 -> 11.36 -> 11.4

Correct in my definition means not order depended. I know that in
general it can be difficult to achieve, but the idea is to add values
internally using different type (which can hold the intermediate result
without truncation), or even with predefined, each time the same order
(fe. not to start adding the highest with lowest value).

I wish to have a way to writh my own sum(double) aggregate function to
replace the standard one (my customer is having badly coded report
which stores data in double field, and have no access to the source
code).

-- Artur Wronski
Tonkuma - 27 Apr 2006 02:38 GMT
> I can do the sum in two ways, that give different final result:
>
> 1.14 + 2.23 -> 3.37 + 7.99 -> 11.4
> 2.23 + 7.99 -> 10.2 + 1.14 ->11.3
How did you calculate these expressions?
Especially my question is that why interim result(10.2) in second
expression has only one decimal digit .

Here is an example of calculation of  second expression by SQL. I got
correct interim result and final sum.
SELECT interim
    , interim + 1.14 fimnal
 FROM (VALUES 2.23 + 7.99 ) Q (interim);
--------------------------------------------------------------------

INTERIM FIMNAL
------- -------
 10.22   11.36

 1 record(s) selected.
Tonkuma - 27 Apr 2006 03:32 GMT
I forgot you wrote "precision is only 3 digits".
But, how do you know scale of interim result without knowing data
values boforehand, if you want to make scale of interim result also 3?
If calculated by SQL and you didn't specify precision and scale of
interim result, precisions and scales will be followings.
1.14 + 2.23 -> 3.37 + 7.99 [ -> 11.36] -> 11.3
DEC(3,2) + DEC(3,2) -> DEC(4,2) + DEC(3,2) [ -> DEC(6,2) ] -> DEC(3,1)
2.23 + 7.99 -> 10.22 + 1.14 [ -> 11.36] ->11.3
DEC(3,2) + DEC(3,2) -> DEC(4,2) + DEC(3,2) [ -> DEC(6,2) ] -> DEC(3,1)
To get correct answer with 3 digits, it would be better to use ROUND
for final result.

SELECT interim
    , DEC(ROUND( (interim + DEC(7.99, 3 , 2)), 1), 3, 1) fimnal
 FROM (VALUES DEC(1.14, 3,2) + DEC(2.23, 3 , 2) ) Q (interim);
--------------------------------------------------------------------

INTERIM FIMNAL
------- ------
  3.37   11.4

 1 record(s) selected.

------------------------- Commands Entered -------------------------
SELECT interim
    , DEC(ROUND( (interim + DEC(1.14, 3 , 2)), 1), 3, 1) fimnal
 FROM (VALUES DEC(2.23, 3,2) + DEC(7.99, 3 , 2) ) Q (interim);
--------------------------------------------------------------------

INTERIM FIMNAL
------- ------
 10.22   11.4

 1 record(s) selected.
Mehmet Baserdem - 27 Apr 2006 04:17 GMT
Artur,

When you add the smallest double 0.000000.......000001 to all your
double numbers and then sum up and round, it is giving the right total.

I tried this

select round(sum(myfield+0.0000000000000000000001), 2)  from mytable

You can replace 2 with any number you want.

regards,

Mehmet
Knut Stolze - 28 Apr 2006 10:46 GMT
> Hi,
>
[quoted text clipped - 3 lines]
>
> Is that correct?

There is a very nicely written summary on floating point numbers here:

http://docs.sun.com/source/806-3568/ncg_goldberg.html

Signature

Knut Stolze
DB2 Information Integration Development
IBM Germany

Artur - 28 Apr 2006 23:30 GMT
Thank you guys for you prompt response.

I'm convinced to analyze value ranges in the database and suggest the
database owner to change double to some more precise type.

-- Artur Wronski
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.