
Database Forum / General DB Topics / DB Theory / January 2004


Stored fields ordered left to right

Dawn M. Wolthuis - 27 Dec 2003 02:37 GMT
Marshall agreed with me on something, so I'll dare post one of the many
questions I have related to relational database theory.

In Date's June 2003 paper entitled "What First Normal Form Really Means" he
asks questions about MultiValue systems, which he (and no one else I've
read) abbreviates as "MVS".  I am preparing answers to these questions.
Date writes "...MVS fields are ordered left to right (and so MVS files are
certainly not relations, and the system is certainly not relational)."

I've been puzzled by this for quite a while, just figuring that relational
theorists have this wrong. But the writings seem so sure of this.  I had
thought that the relations in relational database theory were mathematical
relations, but I am beginning to think that might not be the case.  My
masters in mathematics was quite some time ago, so I hauled out some books
and googled a bit and everything I find that is mathematics, rather than
database theory, indicates what I thought about relations -- a relation is a
set of ordered tuples -- right?  What am I missing here?  An element of a
relation would be of the form (a1, a2, ... an) where a1 is an element of S1
etc.  They ARE ORDERED LEFT TO RIGHT.  Am I misunderstanding something or is
there some other mathematical definition of "relation" that is the one on
which relational database theory is based?

Thanks in advance for your help.  --dawn
Joe "Nuke Me Xemu" Foster - 27 Dec 2003 03:36 GMT
> Marshall agreed with me on something, so I'll dare post one of the many
> questions I have related to relational database theory.
[quoted text clipped - 17 lines]
> there some other mathematical definition of "relation" that is the one on
> which relational database theory is based?

A tuple is a set of ordered pairs of the form (attribute, value).
Defining an ordering for the attributes would be superfluous.
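
A minimal Python sketch of the distinction being drawn here, assuming hypothetical attribute names `name` and `age` that do not appear in the thread. A positional tuple becomes a different value when its components are swapped, while a tuple modeled as a set of (attribute, value) pairs does not.

```python
# Positional (mathematical) tuples: order is part of the value.
positional_a = ("Smith", 42)
positional_b = (42, "Smith")

# Tuples as sets of (attribute, value) pairs: order carries no meaning.
# The attribute names "name" and "age" are hypothetical.
named_a = frozenset({("name", "Smith"), ("age", 42)})
named_b = frozenset({("age", 42), ("name", "Smith")})

print(positional_a == positional_b)  # False
print(named_a == named_b)            # True
```

Each (attribute, value) pair is itself ordered; only the collection of pairs is unordered.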

URL:http://en2.wikipedia.org/wiki/Ordered_pair

You have much to unlearn, Grasshopper.

--
Joe Foster <mailto:jlfoster%40znet.com>     On the cans? <http://www.xenu.net/>
WARNING: I cannot be held responsible for the above        They're   coming  to
because  my cats have  apparently  learned to type.        take me away, ha ha!
Dawn M. Wolthuis - 27 Dec 2003 04:17 GMT
> > Marshall agreed with me on something, so I'll dare post one of the many
> > questions I have related to relational database theory.
[quoted text clipped - 22 lines]
>
>  URL:http://en2.wikipedia.org/wiki/Ordered_pair

So a tuple is NOT ordered?  Why not?  Why even call an unordered set a
tuple?  Tuples imply ordering, right?  They are elements of Set1 x  Set2 x
Set3 x ... Setn.  It is fine with me if you want those sets to be sets of
ordered pairs -- they can be sets of whatever, but the relation is then a
set of tuples (s1, ... sn) where s1 is an element of S1 (and in your def,
that means it would be an ordered pair).

But a RELATION itself is a set of ORDERED TUPLES -- RIGHT?  Else please
point me to a MATHEMATICS definition that allows for relations that are not
ordered.  I'm not finding any such definitions.

> You have much to unlearn, Grasshopper.

Ditto, methinks.  Smiles.  --dawn

> --
> Joe Foster <mailto:jlfoster%40znet.com>     On the cans? <http://www.xenu.net/>
> WARNING: I cannot be held responsible for the above        They're   coming  to
> because  my cats have  apparently  learned to type.        take me away, ha ha!
Jerry Gitomer - 27 Dec 2003 04:58 GMT
>>"Dawn M. Wolthuis" <dwolt@tincat-group.com> wrote in message
>
> <news:bsira5$jlt$1@news.netins.net>...

<snip>

>>>Date writes "...MVS fields are ordered left to right (and so MVS files are
>>>certainly not relations, and the system is certainly not relational)."

    <big snip>

Allow me to play the role of the fool jumping in where angels
fear to tread.....

Two points which may clarify RDBMS implementation (as opposed to
theory).

1.  The relationships are imposed externally to the data in the
form of indexes and/or foreign keys.  The data itself is
unordered.  If I remember correctly Codd specifically stated
that the data was not ordered.  From an implementation point of
view (circa 1970 when the largest mainframes weren't much faster
than my Palm IIIc) this allowed significantly better performance
when adding rows that should logically be anyplace other than
the end of the table.

2.  Within a table row the physical order of the columns as
stored on disk need not conform to the logical order of the
columns as specified in the CREATE TABLE statement.  Again
looking at the mainframe computers of 1970 when all but the very
largest had less than 16MB of RAM, the highest capacity disk
drives only had 33MB of storage and there were arcane rules
about floats and integers starting on word boundaries while
short integers and strings could start on any byte in a word it
became desirable to store all of the floats and integers at the
beginning of the physical disk record and the shorts and strings
after them in order not to waste any space in either memory or
on disk.
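
A rough Python sketch of the space argument in point 2. It assumes a simplified alignment rule (every field must start at a multiple of its own size) and a hypothetical four-column record; real mainframe rules were more involved.

```python
def record_size(fields):
    """Bytes used when fields are stored in the given order.

    Each field is (name, size_in_bytes). Simplifying assumption: a field
    must start at an offset that is a multiple of its own size.
    """
    offset = 0
    for _name, size in fields:
        padding = (-offset) % size  # bytes skipped to reach an aligned offset
        offset += padding + size
    return offset

# Hypothetical logical column order, as it might appear in CREATE TABLE.
declared = [("flag", 1), ("price", 8), ("code", 1), ("qty", 4)]

# Physical order: widest fields first, so no padding is needed between them.
reordered = sorted(declared, key=lambda field: field[1], reverse=True)

print(record_size(declared))   # 24
print(record_size(reordered))  # 14
```

The logical column order seen by the user is unchanged; only the stored layout differs.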

HTH
Dawn M. Wolthuis - 27 Dec 2003 15:40 GMT
> >>"Dawn M. Wolthuis" <dwolt@tincat-group.com> wrote in message
> >
[quoted text clipped - 39 lines]
>
> HTH

Yes, this is most helpful.  This is PRECISELY my understanding -- that
deciding to remove the ordering from relational tuples is an implementation
issue and not about the logical theory of relations.

I work with relations that are mathematical relations and are therefore
ordered tuples.  The model behind XML documents is also one of ordered
tuples.  So, if you hear of folks who might sometimes spout that their
database model is "more relational" than RDBMS's it sometimes is due to this
particular issue.

Based on this, it sounds like a response to Date that says that mathematical
relations are ORDERED and not unordered tuples so that this particular point
is irrelevant (and, in fact, wrong) would be an accurate response, right?

Thanks a bunch!  --dawn
Jonathan Leffler - 28 Dec 2003 03:26 GMT
>>>"Joe "Nuke Me Xemu" Foster" <joe@bftsi0.UUCP> wrote:
>><snip>
[quoted text clipped - 32 lines]
> this particular point is irrelevant (and, in fact, wrong) would be
> an accurate response, right?

It depends on the premises from which you work.

One of the documented differences between mathematical relations and
relations used in database theory is precisely this one - that the
elements in a tuple of a mathematical relation are ordered (usually
ordered pairs, in fact) but database theory uses unordered tuples,
where each element logically  consists of the combination attribute
name, attribute type and attribute value.  Of course, in a system
without inheritance to complicate matters, the attribute type
associated with a given attribute name is the same for all tuples in
the relation (but the converse is not true).  The difference between
ordered mathematical relations and the unordered database equivalent
is clearly stated in Codd's original (1970) paper, incidentally:

    Accordingly, we propose that users deal, not with relations
    that are domain-ordered, but with relationships which are
    their domain-unordered counterparts.

[Note the implied distinction between what users see and what the
system manages, too.]

Many practical systems store each record (physical analogue of a
tuple) with the fields (the physical analogue of an attribute) stored
in the same order, which makes it easier to locate a given field
within a given record.  And many systems make life still easier by
storing the data for a given field in a constant width, so it is
trivially possible to pre-calculate the offset into the record for a
given attribute value.
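
As a small illustration of the fixed-width case, the Python sketch below precomputes each field's offset once and then slices any record directly. The field names, widths, and sample record are hypothetical.

```python
from itertools import accumulate

# Hypothetical fixed field widths in bytes, in stored order.
WIDTHS = {"id": 4, "name": 20, "city": 15}

# Each field's offset is the sum of the widths stored before it.
OFFSETS = dict(zip(WIDTHS, accumulate(WIDTHS.values(), initial=0)))

def read_field(record: bytes, field: str) -> bytes:
    """Slice one field out of a fixed-width record and drop the padding."""
    start = OFFSETS[field]
    return record[start:start + WIDTHS[field]].rstrip()

record = b"0001" + b"Dawn".ljust(20) + b"Ames".ljust(15)
print(OFFSETS)                     # {'id': 0, 'name': 4, 'city': 24}
print(read_field(record, "city"))  # b'Ames'
```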

To get back to the question - if you change the premises on which your
version of relational theory is based to state that your tuples are
indeed ordered, then of course Date's statements no longer apply.  The
 theory about which he is making statements states that tuples are
unordered.  Both are valid sets of premises, but they are different
sets of premises, and statements made about one are not valid for the
other.

As to which set of premises is better - that is a separate discussion.
 I strongly suspect there are a rather large number of issues that
have to be resolved when you use ordered tuples rather than unordered
tuples.  Most notably, A JOIN B is not the same as B JOIN A under the
ordered scheme - with consequences that need to be considered very
carefully.
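
The JOIN point can be sketched in Python. This is an illustration under assumptions, not any particular system's join: `join_ordered` concatenates positional tuples, `join_named` is a natural join over tuples modeled as sets of (attribute, value) pairs, and the sample data and attribute names are hypothetical.

```python
def join_ordered(r, s, r_col, s_col):
    """Join positional tuples by concatenation where the chosen columns match."""
    return {a + b for a in r for b in s if a[r_col] == b[s_col]}

def join_named(r, s):
    """Natural join of tuples modeled as frozensets of (attribute, value) pairs."""
    result = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            # Tuples combine only if they agree on every shared attribute.
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                result.add(frozenset({**da, **db}.items()))
    return result

A = {(1, "x")}
B = {(1, "y")}
print(join_ordered(A, B, 0, 0) == join_ordered(B, A, 0, 0))  # False

A_named = {frozenset({("id", 1), ("a", "x")})}
B_named = {frozenset({("id", 1), ("b", "y")})}
print(join_named(A_named, B_named) == join_named(B_named, A_named))  # True
```

With positional tuples the two results hold the same information in a different column order, so they are different sets; with named attributes they are equal.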

And, reverting to the final question again - no, Date's comments are
neither irrelevant nor wrong in the system about which the comments
were made.


Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler@earthlink.net, jleffler@us.ibm.com
Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/

Dawn M. Wolthuis - 28 Dec 2003 03:41 GMT
<big snip>
> One of the documented differences between mathematical relations and
> relations used in database theory is precisely this one - that the
> elements in a tuple of a mathematical relation are ordered (usually
> ordered pairs, in fact) but database theory uses unordered tuples,

Be careful not to equate Codd's database theory with all database theory.  I
work with alternative database theories.  The primary one that I work with
has, in the logical model, the concept of ordered tuples, aka mathematical
relations.  In fact, they are relations which are mathematical functions as
well.  It appears to me that relational database theory works with relations
that are not, by definition, mathematical relations while at least one
database theory that actually uses mathematical relations is often accused
by relational theorists of not being relational.  Am I the only one
intrigued by this?

> where each element logically  consists of the combination attribute
> name, attribute type and attribute value.  Of course, in a system
[quoted text clipped - 26 lines]
> sets of premises, and statements made about one are not valid for the
> other.

I agree, so I'm trying to get some common language.  Taking mathematical
vocabulary and then changing it (and in this particular case changing
relations to require that they be unordered when mathematically they are
ordered !!) is more than likely to cause problems when discussing database
theories (as I have seen time and again, which is why I'm trying to align my
language with that of relational theorists EXCEPT THAT I AM NOT WILLING TO
SACRIFICE ACCURATE MATHEMATICS, if we can avoid that)

> As to which set of premises is better - that is a separate discussion.
>   I strongly suspect there are a rather large number of issues that
> have to be resolved when you use ordered tuples rather than unordered
> tuples.  Most notably, A JOIN B is not the same as B JOIN A under the
> ordered scheme - with consequences that need to be considered very
> carefully.

Yes, you are absolutely right.  In fact, the model I work with is more of a
di-graph of data and instead of joins, it uses data trees (much like the
web).  This is accomplished with functions as well.  Instead of using two
different concepts -- sets/relations combined with operators, the model I
work with uses functions in both cases.  All work with computers can be seen
in terms of functions that map one "object" to another, in fact.  Both
functions and objects can be stored in memory, disk or wherever.  [sorry,
I'm digressing]

> And, reverting to the final question again - no, Date's comments are
> neither irrelevant nor wrong in the system about which the comments
> were made.

I can see that he is not using the mathematical term "relation" but a
definition that has evolved from the work of Codd and has redefined relation
so that it means something different in database theory.  Don't you hate it
when that happens!!!???  Now I need to have a term that stands for database
theory that is based on the mathematical definition of relation and
distinguish that from what is currently called relational database theory
that does not match the mathematical definition of relation.  UGH!  Please
help.  --dawn
Adrian Kubala - 09 Jan 2004 21:51 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> Yes, you are absolutely right.  In fact, the model I work with is more of a
> di-graph of data and instead of joins, it uses data trees (much like the
> web).  This is accomplished with functions as well.  Instead of using two
> different concepts -- sets/relations combined with operators, the model I
> work with uses functions in both cases.

Functions are one-way mappings. Many relationships in the world work
both ways. It certainly seems useful to distinguish these two kinds of
relationships, which relations + functions does but functions alone does
not.
Dawn M. Wolthuis - 10 Jan 2004 03:18 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > Yes, you are absolutely right.  In fact, the model I work with is more of a
[quoted text clipped - 7 lines]
> relationships, which relations + functions does but functions alone does
> not.

This must be another issue of definitions because there are no functions in
mathematics that are not relations.  Functions are a particular type of
relation.  Without looking up Codd's redefinition of a function, if he has
one, but just sticking to pure mathematical definitions, a function is
necessarily a relation, so your statement could not possibly be true in that
realm. In areas other than mathematics, I'm sure that people can redefine
these concepts any way they wish.

But are we in agreement that using mathematics terminology and working from
the standpoint of mathematics, all functions are, by definition, also
relations?  Thanks.  --dawn
Adrian Kubala - 10 Jan 2004 22:59 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> Functions are one-way mappings. Many relationships in the world work
>> both ways. It certainly seems useful to distinguish these two kinds of
[quoted text clipped - 4 lines]
> functions in mathematics that are not relations.  Functions are a
> particular type of relation.

I had assumed you were talking about representation. There is clearly
some difference between, e.g., the function y = x and the relation
{<0,0>, <1,1>, <2,2>, ...}. For example, it would be impossible to
enumerate that relation explicitly in memory. On the other hand, some
functions are such that if you express them implicitly as equations it
is harder to solve for some variables than others. That's why I say both
representations have merit, but that for the kinds of relations
represented in a database it's usually simplest to express them
explicitly.
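
A short Python sketch of the two representations, using a squaring function over a small assumed domain rather than y = x so that the backwards query is not trivial. The names `square`, `DOMAIN`, and `inputs_for` are illustrative only.

```python
def square(x):
    """Implicit representation: a rule, usable on any input, in one direction."""
    return x * x

# Explicit representation: enumerate the pairs over a finite, assumed domain.
DOMAIN = range(-3, 4)
square_relation = {(x, square(x)) for x in DOMAIN}

def inputs_for(relation, y):
    """Query the explicit relation 'backwards': which x values map to y?"""
    return {x for (x, out) in relation if out == y}

print(sorted(inputs_for(square_relation, 4)))  # [-2, 2]
```

The rule covers an unbounded domain but only runs forwards; the enumerated set can be queried in either direction but only over the values that were stored.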

Since you were not talking about relations vs functions in terms of
their representation, I don't understand your original point. All
functions are relations but not all relations are functions, therefore a
function-only database is strictly less expressive than a
fully-relational one with no benefits.

On the other hand, if a database allowed you to describe *some*
relations as functions and took advantage of algebraic reasoning when
creating derived relations from these functions, that would be really
neat. But then you'd basically have a kind of Prolog, right?
Dawn M. Wolthuis - 11 Jan 2004 00:41 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> >> Functions are one-way mappings. Many relationships in the world work
[quoted text clipped - 26 lines]
> creating derived relations from these functions, that would be really
> neat. But then you'd basically have a kind of Prolog, right?

With this discussion, I've been focussed on one specific issue, where the
database model I am using has been taken to task for not employing
relations.  I have no problem stating that it does not 100% follow a
relational database model, however, this one point -- that it does not
employ relations is entirely false.  Everything in the model is a function,
and all functions are, by definition, relations.  Chris Date asks
questions about the MultiValue database model in his papers on 1NF this
summer and he and Pascal are likely correct that nothing they have read is
precise enough to respond to.  The argument again crops up that the
Nelson-Pick/MultiValue model is not based on relations and I intend to state
that because it is based on functions and functions are, by definition,
relations, that it is most definitely based on relations.  Other arguments
that it does not abide by every relational database rule are likely
accurate, but this indictment is not.

So, that's my point -- the MultiValue model is based on the mathematical
definition of relations, and should not be accused of not being "relational"
in that regard, even if it is not based on Codd's relational database model.
Cheers!  --dawn
Adrian Kubala - 11 Jan 2004 02:30 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> With this discussion, I've been focussed on one specific issue, where the
> database model I am using has been taken to task for not employing
> relations.  I have no problem stating that it does not 100% follow a
> relational database model, however, this one point -- that it does not
> employ relations is entirely false.

That's like calling a black and white camera a "color camera" because
black and white are colors.
Dawn M. Wolthuis - 11 Jan 2004 03:45 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > With this discussion, I've been focussed on one specific issue, where the
[quoted text clipped - 5 lines]
> That's like calling a black and white camera a "color camera" because
> black and white are colors.

Most certainly not. First of all, black and white are not typically
considered colors by color professionals, I believe.

I will be the first to say that the Nelson-Pick model does not meet the
criteria of the relational database model.  But it is absolutely the case
(if you accept my analysis that it is based on functions, I think you will
agree) that it is a mathematically relational model, right?  I'm trying to
use terminology that would be agreeable to all and using mathematical terms
in order to ensure precision seems like the best place to go for definitions
of mathematical terms.

However, I don't want to use emotive language.  I do want to be true to
Codd's interest in being precise.  If using mathematical language that has
been co-opted by various "sides" in some "debate" will trip people up, then
I need to come up with new language that will not trip people's emotional
buttons.

Given that, should I completely avoid the word "relation" when referring to
the mathematics of data modeling?  I have already decided to set the word
"domain" aside since it is a completely abused term.  Relations in
mathematics are quite consistently defined (as are domains, but, ah well),
so if people are willing to put on mathematics "hats" and accept
mathematical definitions, I think I can be precise without redefining these
terms.

But I am curious -- if I prove that a database model is based on
mathematical relations, and from that perspective is a "relational model",
when both you and I would agree (even though IBM's marketing material does
not) that it is not based on THE relational database model as specified by
Codd, so that databases that implement this model should not be called
RDBMS's (for example) -- is that likely to cause relational theorists to
bit-flip (and switch from logical thinking to emotional reactions) and
disregard anything else that is said?

I'm trying to choose my terminology wisely, but if I call a five a five and
it trips emotions, well then what's a girl ta do, eh?  smiles.  --dawn
Adrian Kubala - 11 Jan 2004 22:05 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:

>> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> > With this discussion, I've been focussed on one specific issue,
[quoted text clipped - 13 lines]
> (if you accept my analysis that it is based on functions, I think you will
> agree) that it is a mathematically relational model, right?

It does not allow you to express *any* relation, therefore it is not
relational, in exactly the same way that a camera which only lets you
take pictures of pink is not a color camera. If your model let you
express mathematical relations instead of "Codd relations", then I would
agree with you, but since there are mathematical relations which your
model cannot express, it's wrong to call it relational.

Especially since it seems your intent in doing so is to imply that it is
just as expressive as the relational model, when in fact it is strictly
less expressive. (Not to imply any value judgement, but simply to convey
the fact that there are relations which your model cannot possibly
express but any relational model can, whereas any function that your
model can express, any relational model can as well.)
Dawn M. Wolthuis - 12 Jan 2004 03:06 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> >
[quoted text clipped - 22 lines]
> agree with you, but since there are mathematical relations which your
> model cannot express, it's wrong to call it relational.

Every relation where each tuple has a unique identifier or candidate key can
also be represented as a function.  Can you give an example of a set of
propositions that can be modeled as relations but not as functions?  These
functions are not limited to 1NF and, as such, typical propositions can be
modeled much more handily than in a relational model.  But if you can
produce an example that can be expressed using relations and not functions
or that is even easier to work with when represented as relations rather
than functions, I am VERY interested.

> Especially since it seems your intent in doing so is to imply that it is
> just as expressive as the relational model, when in fact it is strictly
> less expressive. (Not to imply any value judgement, but simply to convey
> the fact that there are relations which your model cannot possibly
> express but any relational model can, whereas any function that your
> model can express, any relational model can as well.)

Yes, you are right that I intend to state that it is just as expressive, but
no, it does not simply follow from the statement that it uses functions that
it is more expressive.  I will try not to jump steps like that.  I think you
are incorrect in assuming that functions are less expressive for data
modeling purposes than relations, but if I am wrong about that, I definitely
would like to be corrected.  Thanks for your help.  --dawn
Adrian Kubala - 12 Jan 2004 06:35 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> Every relation where each tuple has a unique identifier or candidate key can
> also be represented as a function.  Can you give an example of a set of
> propositions that can be modeled as relations but not as functions?

{<1, 1> <1, 2> <2, 1>}
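
Why this set of pairs is a relation but not a function can be checked mechanically. A minimal Python sketch, with `is_function` as an illustrative helper name:

```python
def is_function(pairs):
    """True if no first component is paired with two different second components."""
    seen = {}
    for x, y in pairs:
        # setdefault stores y on first sight of x, otherwise returns the earlier value.
        if seen.setdefault(x, y) != y:
            return False
    return True

relation = {(1, 1), (1, 2), (2, 1)}
print(is_function(relation))          # False: 1 is paired with both 1 and 2
print(is_function({(1, 1), (2, 1)}))  # True
```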
Dawn M. Wolthuis - 12 Jan 2004 10:28 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > Every relation where each tuple has a unique identifier or candidate key can
> > also be represented as a function.  Can you give an example of a set of
> > propositions that can be modeled as relations but not as functions?
>
> {<1, 1> <1, 2> <2, 1>}

Yeah, thanks Adrian, but I've got the mathematics part down on the
difference between a function and a relation.  I can graph such.  But when
it comes to what we once called data processing using a computer, is it
feasible to have a pure relational implementation (no, I'm not confusing
implementations with the model -- stick with me) using a database without
actually ending up with a function?  In the relational model, how do you
implement this without either a) an identifier tagged on by the database
which could be kept out of the model or b) a candidate key?

The function this would correspond to in the model I work with is one where
either the designer would implement it with a random or sequential key, such
as:

ID: 1
INFO: <1, 1>

ID: 2
INFO: <1, 2>

ID: 3
INFO: <2, 1>

Or perhaps:

ID: 1
INFO-A: 1
INFO-B: 1

ID: 2
INFO-A: 1
INFO-B: 2

ID: 3
INFO-A: 2
INFO-B: 1

You can see the function outright in the above design.
Or the designer could add it without such a key, by making the entire tuple
a "candidate key" by implementing it as

ID:<1,1>
ID:<1,2>
ID:<2,1>

In this design, the function maps each of these IDs to the null set.
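
The designs above can be sketched as Python dictionaries, which are functions in the mathematical sense because each key has exactly one value. This is only an illustration of the encodings described in this post, not of how any MultiValue product stores data.

```python
relation = {(1, 1), (1, 2), (2, 1)}

# Design 1: a sequential ID maps to the whole pair.
by_surrogate = {i: pair for i, pair in enumerate(sorted(relation), start=1)}

# Design 2: the whole tuple is the key and maps to the empty set.
by_whole_tuple = {pair: frozenset() for pair in relation}

print(by_surrogate)                                # {1: (1, 1), 2: (1, 2), 3: (2, 1)}
print(set(by_surrogate.values()) == relation)      # True
print(set(by_whole_tuple.keys()) == relation)      # True
```

In both cases the original relation can be recovered, from the values in the first design and from the keys in the second.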

Whether one chooses to reflect that it is actually a function that gets
implemented in a database or opts to leave out that information and treat
the function logically as a relation (which it also is) by removing any
unique identifier or candidate keys from the implementation (can you really
implement this in an RDBMS without any candidate keys?), another model could
just as accurately reflect the fact that this is a function within the
model.

I might have said that poorly -- did that make sense?  Thanks a bunch -- I
truly do want to understand and be corrected if I have this wrong.  --dawn
Adrian Kubala - 12 Jan 2004 22:31 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> You can see the function outright in the above design. Or the designer
> could add it without such a key, by making the entire tuple a
[quoted text clipped - 5 lines]
>
> In this design, the function maps each of these IDs to the null set.

In that case, I will grant you that this approach is equally expressive,
but I think that calling this a "function" is an abuse of the term. The
range of this function, and therefore the function itself, in its
function-ness, is logically meaningless -- you're actually using the
domain as a way to encode a list of tuples, a relation, in an obscure
way.

A modelling tool should deliver a straightforward mapping between the
parts of the model and the parts of the system we're modeling. With
relations it is easy to make such a mapping by treating relations as
logical predicates: "so-and-so LIVES AT such-and-such-a-place". I don't
see how such a mapping would work in general for functions. In our above
example, you can see that the null range of the function corresponds to
nothing in our actual system, it is just there to satisfy the formalism.
That's a bad sign.

> Whether one chooses to reflect that it is actually a function that
> gets implemented in a database or opts to leave out that information
[quoted text clipped - 3 lines]
> candidate keys?), another model could just as accurately reflect the
> fact that this is a function within the model.

So my complaint is that calling relations functions is a bad idea. I
think that if you tried to translate relational formalisms into terms of
equivalent "functions from tuples to the null set" the result would be
more complicated, harder to understand and apply. Which is why
mathematicians created the concept of relations to begin with, so I'm
surprised you disagree.
Dawn M. Wolthuis - 13 Jan 2004 03:34 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > You can see the function outright in the above design. Or the designer
[quoted text clipped - 37 lines]
> mathematicians created the concept of relations to begin with, so I'm
> surprised you disagree.

I understand your point that a mapping from a bunch of "keys" to empty sets
sounds like a tacky component of a model.  However, NULLS in this
environment can be modeled as null sets and play prominently in the model.

More importantly, I would argue that calling functions relations removes
some of the specificity, making the model more abstract than is necessary.
If you go back to my list of "input, output, processing, and storage" your
basic components are objects and functions -- subjects and predicates --
input to a function, the processing of the function, and the output of the
function, which may or may not be stored.  That's the simple version for
modeling a computer.

We have object languages and function languages, as well as languages that
walk you through operations.

Functions are one type of object and objects are the input of and output
from functions.  What needs to be stored for the purpose of retrieval are
objects and functions (handled as objects).  From these functions & objects,
one can perceive various types of functions as graphs, including trees &
di-graphs.  We can also apply predicate logic to the functions, which are
also predicates where instances are propositions.

Introducing the mathematical concept of a relation really adds nothing to
the mix that cannot be done more simply by viewing everything as an object
or a function.  The model I'm working with includes functions on functions
that are not present in the relational operators.  These include at least a)
navigation by way of a link -- analogous to what is done when one navigates
from a web page clicking on a link and getting to another web page (by the
way, notice that web pages are all functions -- they have a URL mapping to a
page) and b) unnesting and nesting of data, which TTM refers to as GROUP and
UNGROUP, IIRC.
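
A loose Python analogue of nesting and unnesting, offered as a sketch only: it is not the definition of GROUP and UNGROUP from TTM, and the sample data is hypothetical. `group` turns flat (key, value) pairs into one entry per key holding a set of values, and `ungroup` flattens it again.

```python
from collections import defaultdict

def group(pairs):
    """Nest a flat set of (key, value) pairs into key -> frozenset of values."""
    nested = defaultdict(set)
    for key, value in pairs:
        nested[key].add(value)
    return {key: frozenset(values) for key, values in nested.items()}

def ungroup(nested):
    """Flatten key -> set of values back into (key, value) pairs."""
    return {(key, value) for key, values in nested.items() for value in values}

flat = {("dawn", "math"), ("dawn", "databases"), ("adrian", "prolog")}
nested = group(flat)
print(nested["dawn"] == frozenset({"math", "databases"}))  # True
print(ungroup(nested) == flat)                             # True
```

The nested form is a function from key to a set of values, which is the non-1NF shape discussed in this thread.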

I'm still working this through, so I don't claim to have all of the answers
for what is the right way to do things -- I'm working on describing one way
that has been in production for more than 30 years, with many systems that
are 20 years old still hanging around and moving forward.  This model
accounts for an exceedingly quiet multi-billion dollar industry of players
who have not (yet?) succumbed to the relational model.  The more I learn,
the more convinced I am that they are right not to move to 1NF, at the very
least, and that is only one of the advantages they currently enjoy.

Thanks for helping me make sure that I'm not illogical in the details, even
if my conclusions seem off the mark given the details presented to
date.  --dawn
Adrian Kubala - 13 Jan 2004 07:09 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> More importantly, I would argue that calling functions relations removes
> some of the specificity, making the model more abstract than is necessary.

I'm starting to slightly get a hint of what you mean, and I think we
have covered the logical bases and it sounds like it boils down to a
religious disagreement along very similar lines to imperative versus
functional programmers, so I'm not going to pursue it more. Personally I
like abstraction and I find I make better and easier-for-me-to-
understand models with the right abstract tools but I know not everybody
feels that way. I also think people would be better off if they all
agreed with me, but who doesn't feel that way? :)
Marshall Spight - 13 Jan 2004 05:30 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > You can see the function outright in the above design. Or the designer
[quoted text clipped - 9 lines]
> In that case, I will grant you that this approach is equally expressive,
> but I think that calling this a "function" is an abuse of the term.

Au contraire; it fits the definition of function perfectly: a mapping from
a set of values to another set of values. In this case, the second set of
values happens to be the empty set.

> The
> range of this function, and therefore the function itself, in its
> function-ness, is logically meaningless -- you're actually using the
> domain as a way to encode a list of tuples, a relation, in an obscure
> way.

But Dawn didn't pick the example; you did. If the example is wonky,
why not pick a real example, and Dawn can show you how that,
too, is a function, provided you limit yourself to a single key?

Nothing in the definition of function prevents the range of a function
from being the empty set. (Nothing prevents the domain from being
the empty set, either. Neat, eh? Nullology is the most fun part of
relational theory.)

> A modelling tool should deliver a straightforward mapping between the
> parts of the model and the parts of the system we're modeling. With
> relations it is easy to make such a mapping by treating relations as
> logical predicates: "so-and-so LIVES AT such-and-such-a-place". I don't
> see how such a mapping would work in general for functions.

It's the *exact* *same*  *mapping.* The only difference is that you
are limited to a single key.

> In our above
> example, you can see that the null range of the function corresponds to
> nothing in our actual system, it is just there to satisfy the formalism.
> That's a bad sign.

Again, you picked the example. You happened to pick an example
in which every attribute was included in the key. Is this a "bad sign?"
No; that's a totally fine situation to have. Consider: your example
could be a many-to-many table for some other table, and each of
the two columns could be foreign keys. The key would be a composite
key over the two columns. Bad sign? Not at all. It's actually not even
that uncommon.
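
A minimal Python sketch of the all-key case, assuming the "empty range" is read as every key mapping to the empty tuple. The names `enrollment` and `as_function` and the data are invented for illustration and do not come from the thread.

```python
# A many-to-many relation in which every attribute is part of the key.
# Each element is a (student, course) pair; the data is made up.
enrollment = {("alice", "math"), ("alice", "physics"), ("bob", "math")}

# The same information viewed as a function: composite key -> empty tuple.
# The empty tuple () stands in for "no non-key attributes" (an assumption).
as_function = {key: () for key in enrollment}

# Nothing is lost: the function's domain is exactly the original relation.
recovered = set(as_function.keys())
print(recovered == enrollment)
```

Under this reading the function carries all of its information in its domain, which is the situation both posters are describing from opposite sides.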

Marshall
Adrian Kubala - 13 Jan 2004 07:56 GMT
Marshall Spight <mspight@dnai.com> schrieb:
>> A modelling tool should deliver a straightforward mapping between the
>> parts of the model and the parts of the system we're modeling. With
[quoted text clipped - 4 lines]
> It's the *exact* *same*  *mapping.* The only difference is that you
> are limited to a single key.

It can't be the same mapping, as proven by example; the 3-part predicate
"person LIVES AT place DURING time period" has an obvious mapping to the
columns of a relation. To which parts of this predicate do the DOMAIN
and RANGE of a function correspond?
Dawn M. Wolthuis - 13 Jan 2004 14:13 GMT
> Marshall Spight <mspight@dnai.com> schrieb:
> >> A modelling tool should deliver a straightforward mapping between the
[quoted text clipped - 10 lines]
> columns of a relation. To which parts of this predicate do the DOMAIN
> and RANGE of a function correspond?

A predicate would look like this:

person IDENTIFIED BY id LIVES AT place DURING time period

I'll name the function using the plural rather than singular (for reasons I
won't go into and of course there are differences of opinion on this) and
the function is

PEOPLE(id) = { (place, time period) }

Marshall noted that this requires a single value as a candidate key.
Functions can, however, be from a set that includes tuples as well, so even
though it is correct that it is a single value for the key, that value could
be a tuple.

Although with this model there are fewer tables and in particular
relationship tables (as in ERD relationships) are typically lists (nested
tables) within an entity file, when it makes sense to have a relationship
table separately the domain would be a set of tuples from the domains of the
entities:

STUDENT-SCHEDULE(student,term) = { (courses, days of week, start time, end
time)* }
using the XML notation of * to indicate a repeating group, i.e. nested
table.
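
The STUDENT-SCHEDULE notation above can be sketched in Python as a dictionary keyed by a tuple, with a list of tuples as the nested table. This is only an illustration: the name `student_schedule`, the helper `courses_for`, and all of the data are assumptions, not part of any MultiValue product.

```python
# Key is a (student, term) tuple; value is the nested table (repeating group).
# Each nested row is (course, days of week, start time, end time).
student_schedule = {
    ("s001", "2004-spring"): [
        ("MATH101", "MWF", "09:00", "09:50"),
        ("HIST210", "TTh", "13:00", "14:15"),
    ],
    ("s002", "2004-spring"): [
        ("MATH101", "MWF", "09:00", "09:50"),
    ],
}

def courses_for(student, term):
    """Apply the function to a key and return the course codes in its nested table."""
    return [row[0] for row in student_schedule.get((student, term), [])]

print(courses_for("s001", "2004-spring"))
```

The key is still a single value, but that value is itself a tuple, which is the point made about composite keys.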

Thank you gentlemen, I think I have the information I need on this one.
Cheers!  --dawn
Adrian Kubala - 13 Jan 2004 20:25 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> It can't be the same mapping, as proven by example; the 3-part predicate
>> "person LIVES AT place DURING time period" has an obvious mapping to the
[quoted text clipped - 10 lines]
>
> PEOPLE(id) = { (place, time period) }

That's wrong because one person can live at many different places at
different times. So you need to map each person to a list of tuples --
in which case you have hidden the information necessary to make queries
like "who all has ever lived at this place" -- or you need to use a
null range, in which case the range of the function is modelling nothing
and is useless baggage.

The point is that in most interesting relations there is no obvious way
to split the predicate into TWO parts where one is more important than
the other; you have N parts which are all equally-important.
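
A small Python sketch of the two shapes being compared, using invented data. `people` is the nested form (id mapped to a list of tuples) and `lives_at` is the flat three-column relation. It says nothing about how any real product executes such a query; it only shows where the asymmetry sits.

```python
# Nested form: person id -> list of (place, period) tuples.
people = {
    "p1": [("Ames", "1990-1995"), ("Boston", "1995-2000")],
    "p2": [("Boston", "1998-2003")],
}

# Flat relational form: a set of (person, place, period) tuples.
lives_at = {
    (pid, place, period)
    for pid, rows in people.items()
    for place, period in rows
}

def residents_nested(place):
    # The place is buried inside each person's list, so every list is walked.
    return {pid for pid, rows in people.items()
            if any(p == place for p, _ in rows)}

def residents_flat(place):
    # Every attribute is addressed the same way; none is privileged as the key.
    return {pid for pid, p, _ in lives_at if p == place}

print(residents_nested("Boston") == residents_flat("Boston"))
```

Both functions return the same answer; the difference is that the nested form treats the person id as the entry point while the flat form treats all three parts of the predicate alike.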

I still haven't heard (or overlooked) your explanation of why you think
that whatever rationalization led mathematicians to invent relations as
a separate concept from functions does not apply just as well to
databases? They're too abstract?
Dawn M. Wolthuis - 14 Jan 2004 00:03 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> >> It can't be the same mapping, as proven by example; the 3-part predicate
[quoted text clipped - 18 lines]
> null range, in which case the range of the function is modelling nothing
> and is useless baggage.

I didn't take into account an entire specification.  A typical
implementation of one person living at different places, not taking into
account past addresses might be modeled as:

PEOPLE(id) = { (place, start date, end date)* }  where a date is month/day
only if it is an annual date and is month/day/year otherwise, or as
PEOPLE(id) = { (place, start date, end date, annual start, annual end)* }

No matter how you cut it, you are mapping something to something else.

> The point is that in most interesting relations there is no obvious way
> to split the predicate into TWO parts where one is more important than
[quoted text clipped - 4 lines]
> a separate concept from functions does not apply just as well to
> databases? They're too abstract?

It is not a matter of being too abstract but the fact that with computers we
are just talking about applying functions (processing) to objects.  If we
want to go into depth on objects that are relations and are not functions
for some particular application, it is fine to define an object that is a
relation, but I have never seen any need for that with retrievable stored
data since by nature of it being accessible, it almost always (can you
think of cases where this is not true?) can be easily represented as a function
operating on a key or reference id of some sort to retrieve the data.

There are many types of propositions that are difficult to encode in
an RDBMS and easy to encode in implementations of the Nelson-Pick model.
These include, but are not limited to, multivalued attributes.  How often do
those implementing a relational model have to think hard about whether to
pull out an attribute because it is possible or likely that in the future
one might want to have more than one of 'these' for this entity?  Even
today, with cell phones having been out for over a decade, I've had someone
tell me on the phone that their system permitted one phone number for home
and one for business, but didn't have a spot for an additional phone number.
This is highly unlikely in a system where cardinality of attribute values
can be changed with the blink of an eye from 1 to a variable (not fixed
number) "many".

So, where I have seen no instances that cannot be modeled equally well with
functions as with relations, I have seen many times when a flexible function
is a much better way to model the propositions.

Take some example propositions:

Hope has a cat named Geneva and a dog named Rugby.
Shanna has no pets, but did have a dog named Monte who died in 2002.

Given only these statements, I might immediately come up with something like
this:

a function named PEOPLE and the assignment of an arbitrary (or sequentially
assigned) id for each person
PEOPLE("12345") = { "Hope", { ("cat", "Geneva", NULL) , ("dog", "Rugby",
NULL)} }
PEOPLE("12346") = { "Shanna", { ("dog", "Monte", "2002") } }

I've modeled this with a single function and I think I can even leave out
telling you the metadata (although I wouldn't typically) because this so
closely models the way we speak and think about these propositions.  Now
take these same propositions and model them with a relational model.  This
would require at least two relations and the act of splitting up these
propositions into two separate propositions adds to the complexity for both
the developer and unfortunately typically also for the end-user trying to do
queries, without any noticeable gains.
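
A rough Python sketch of the two designs side by side, so the comparison is concrete. The metadata was left out of the post, so reading each pet tuple as (species, name, year of death) is an assumption, as are the relation names `person` and `pet` and the helper functions.

```python
# Nested form: one function from id to (name, list of pets).
# Each pet tuple is assumed to mean (species, pet name, year of death).
people = {
    "12345": ("Hope", [("cat", "Geneva", None), ("dog", "Rugby", None)]),
    "12346": ("Shanna", [("dog", "Monte", "2002")]),
}

# Relational form: two relations that share the person id.
person = {("12345", "Hope"), ("12346", "Shanna")}
pet = {
    ("12345", "cat", "Geneva", None),
    ("12345", "dog", "Rugby", None),
    ("12346", "dog", "Monte", "2002"),
}

def pets_nested(pid):
    """Pet names for a person, read straight out of the nested list."""
    return sorted(name for _, name, _ in people[pid][1])

def pets_relational(pid):
    """Pet names for a person, found by restricting the pet relation."""
    return sorted(name for owner, _, name, _ in pet if owner == pid)

print(pets_nested("12345"), pets_relational("12345"))
```

Both forms hold the same facts. Whether the second relation counts as added complexity or as added symmetry is the disagreement in the thread, and the sketch does not settle it.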

Or am I wrong?  Thanks.  --dawn
Dawn M. Wolthuis - 14 Jan 2004 04:06 GMT
Goodness Gracious -- way too many typos in that previous post of mine since
I did it hastily and didn't proof it before sending.  My apologies, but I
trust that you can make your way through it.  In case you prefer to read a
cleaner version, I've cleaned it up below.  --dawn

> > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > >> It can't be the same mapping, as proven by example; the 3-part
[quoted text clipped - 22 lines]
> > null range, in which case the range of the function is modelling nothing
> > and is useless baggage.

The following is "cleaned up"

> I didn't take into account an entire specification.  A typical
> implementation of one person living at different places, not taking into
[quoted text clipped - 17 lines]
>
> It is not a matter of being too abstract but the fact that with computers we
> are just talking about applying functions (processing) to objects.  If we
> want to go into depth on objects that are relations and are not functions,
[quoted text clipped - 7 lines]
> an RDBMS and easy to encode in implementations of the Nelson-Pick model.
> These include, but are not limited to, multivalued attributes.  How often do
> those implementing a relational model have to think hard about whether to
> pull out an attribute into a separate table because it is possible or likely
> that in the future one might want to have more than one of 'these' attributes
> for this entity?

> Even
> today, with cell phones having been out for over a decade, I've had someone
> tell me on the phone that their system permitted one phone number for home
> and one for business, but didn't have a spot for an additional phone number.
> This is highly unlikely in a system where cardinality of attribute values
> can be changed with the blink of an eye from 1 to a variable number "many"
> (not fixed number).
>
[quoted text clipped - 19 lines]
> telling you the metadata (although I wouldn't typically) because this so
> closely models the way we speak and think about these propositions.

> Now
> take these same propositions and model them with a relational model.  This
> would require at least two relations.  The act of splitting up these
> propositions each into two separate propositions adds to the complexity for
> both the developer and, unfortunately, typically for the end-user trying to do
> queries.  This complexity is introduced without sufficient gains, in my
> opinion.

> Or am I wrong?  Thanks.  --dawn
Adrian Kubala - 14 Jan 2004 06:53 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> So, where I have seen no instances that cannot be modeled equally well
> with functions as with relations, I have seen many times when a
[quoted text clipped - 13 lines]
> NULL)} }
> PEOPLE("12346") = { "Shanna", { ("dog", "Monte", "2002") } }

This is not modeling, because all you've done is associate some lists
with each person, without any formal way to reason about what the lists
MEAN. I could just as well "model" the first proposition as:

CATS("Geneva") = { {("owner", "Hope")} } etc.
    or even
PEOPLE("12345") = { "has a cat named Geneva and a dog named Rugby" }

Any extra flexibility you get is by delegating more of the
semantics/interpretation to the clients of the database.
Dawn M. Wolthuis - 14 Jan 2004 17:42 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > So, where I have seen no instances that cannot be modeled equally well
[quoted text clipped - 18 lines]
> with each person, without any formal way to reason about what the lists
> MEAN. I could just as well "model" the first proposition as:

Are you saying it is not modeling because I did not show the logical steps I
took to arrive at this or is it that modeling with a function is necessarily
not a model or what?  I didn't search for a UML to text conversion utility
;-) but I could show this as an object that is a function, has methods, etc.
What would it take for this to be accepted as a model?  --dawn

> CATS("Geneva") = { {("owner", "Hope")} } etc.
> or even
> PEOPLE("12345") = { "has a cat named Geneva and a dog named Rugby" }
>
> Any extra flexibility you get is by delegating more of the
> semantics/interpretation to the clients of the database.
Adrian Kubala - 14 Jan 2004 23:36 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> > Hope has a cat named Geneva and a dog named Rugby.
[quoted text clipped - 14 lines]
> steps I took to arrive at this or is it that modeling with a function
> is necessarily not a model or what?

You are not modeling with functions, you are modeling with lists. I can
tell because there was no mention of lists in the original proposition,
and lists are not required to describe functions, but nevertheless lists
have snuck into your model. On the other hand, the above would be a
perfectly good function-based model of "Person 12345 has the list (hope,
(cat, geneva, null), (dog, rugby, null))".

I say it's not modeling because I don't believe you have a general
theory for how propositions (in this case) can be mapped to and from
lists in a general way, with a useful algebra on lists which preserves
the truth values of propositions. It is not enough to provide a post-hoc
rationalization for why you chose these particular lists for these
particular examples. But if you do have such a theory I am excited to
hear it.
Dawn M. Wolthuis - 15 Jan 2004 00:14 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> >> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
[quoted text clipped - 22 lines]
> perfectly good function-based model of "Person 12345 has the list (hope,
> (cat, geneva, null), (dog, rugby, null))".

My function is named PEOPLE and it maps the string "12345" to a set of
strings or string tuples.  I can use more precision in the future to ensure
that you can see that this function provides a model for a plausible
implementation.

> I say it's not modeling because I don't believe you have a general
> theory for how propositions (in this case) can be mapped to and from
[quoted text clipped - 3 lines]
> particular examples. But if you do have such a theory I am excited to
> hear it.

That is one of the things I have been working on, starting with studying how
the many developers who have worked with this model since 1965 have
"post-hoc" been doing this and pulling out common, repeatable processes that
are used.  Although there is no written document on how to do this (to my
knowledge) there also was no such specification for "secretaries" who set up
filing systems in days before computers -- and yet the job gets done and the
solutions seem to be quite flexible in meeting the needs of a company often
for many years (many instances of  > 20 years of such databases).  It does
seem that experience makes a difference in how well the developer (or
secretary) has been able to set up such solutions.  I'm trying to capture
the knowledge of expert developers who use the Nelson-Pick model as I
expect such information will also help with the development of XML documents
(which use a model that is very similar).

--dawn
Adrian Kubala - 15 Jan 2004 01:41 GMT
Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> "Adrian Kubala" <adrian@sixfingeredman.net> wrote in message
>> You are not modeling with functions, you are modeling with lists. I can
[quoted text clipped - 8 lines]
> that you can see that this function provides a model for a plausible
> implementation.

I must have failed to communicate my point clearly, because I'm not
contesting whether the function maps a string to a set, but whether it
accurately models the predicate it claims to.

>> I say it's not modeling because I don't believe you have a general
>> theory for how propositions (in this case) can be mapped to and from
[quoted text clipped - 8 lines]
> "post-hoc" been doing this and pulling out common, repeatable processes that
> are used.

I understand this kind of experience, and its lack is not what I'm
complaining about. If you are programming in OO, it takes experience to
decide which objects to include in your model. If you are creating a
relational database, it takes experience to decide which predicates to
include. In both these systems, once you have done that, there already
exists a theory telling you what it MEANS to be an object or predicate,
and how you can reason about them in a formal way.

> Although there is no written document on how to do this (to my
> knowledge) there also was no such specification for "secretaries" who
> set up filing systems in days before computers -- and yet the job gets
> done and the solutions seem to be quite flexible in meeting the needs
> of a company often for many years (many instances of  > 20 years of
> such databases).

Just because something gets the job done doesn't mean it's a good
model... the Babylonians, for example, could predict solar eclipses
accurately but they did not understand that the earth orbits the sun and
the moon the earth.
Bob Badour - 15 Jan 2004 02:09 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > So, where I have seen no instances that cannot be modeled equally well
[quoted text clipped - 25 lines]
> Any extra flexibility you get is by delegating more of the
> semantics/interpretation to the clients of the database.

Since the "lists" are unnamed, one wonders how one does anything with them.
I see no flexibility in Dawn's idiocy. Pick lacks flexibility. It is a
chained straightjacket with an anchor attached.
Dawn M. Wolthuis - 15 Jan 2004 02:24 GMT
> > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> > > So, where I have seen no instances that cannot be modeled equally well
[quoted text clipped - 31 lines]
> I see no flexibility in Dawn's idiocy. Pick lacks flexibility. It is a
> chained straightjacket with an anchor attached.

You might take note that I stated that I was leaving out the metadata,
although I don't usually do that.  The attributes are also unnamed in my
message, but not in a full-blown model, just like the attributes that are
associated with each other are also named as a tuple in the actual model.

You see no flexibility and continue to let me know regularly
that I am an idiot (even though your statements likely tell readers more
about you than about me -- you might consider seeing a shrink to find out
why you feel a need to belittle me and others you have never met in this
fashion), but I see no logic in your counterargument here into which I can
sink my teeth.  I do gather, however, from your comments that you have some
experience with the Nelson-Pick model or an actual PICK implementation.  If
so, please let me know what in your experience has shown inflexibility and
if not, please let me know what information you have that leads you to
believe that PICK is a "straightjacket"....

Thanks.  --dawn
Bob Badour - 12 Jan 2004 22:07 GMT
> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> >
[quoted text clipped - 29 lines]
> express but any relational model can, whereas any function that your
> model can express, any relational model can as well.)

Even if it could express them, it expresses them with greater complexity and
less facility.
Bob Badour - 28 Dec 2003 09:08 GMT
> >>>"Joe "Nuke Me Xemu" Foster" <joe@bftsi0.UUCP> wrote:
> >><snip>
[quoted text clipped - 80 lines]
> neither irrelevant nor wrong in the system about which the comments
> were made.

Mathematical relations do not rely on attribute order. The physical
representation of mathematical relations using written symbols on planar
surfaces conventionally uses attribute order for succinctness.
Marshall Spight - 28 Dec 2003 22:54 GMT
> Mathematical relations do not rely on attribute order. The physical
> representation of mathematical relations using written symbols on planar
> surfaces conventionally uses attribute order for succinctness.

This doesn't seem correct to me.

Consider the union of two sets of ordered pairs:

{ (1,2) }
union
{ (2,1) }

The union of those sets is a set with two elements, yes? How
are we distinguishing between (1,2) and (2,1) if not by order?
Is there some hidden information?

Marshall
Joe Foster - 28 Dec 2003 23:35 GMT
> > Mathematical relations do not rely on attribute order. The physical
> > representation of mathematical relations using written symbols on planar
[quoted text clipped - 11 lines]
> are we distinguishing between (1,2) and (2,1) if not by order?
> Is there some hidden information?

What if we assume these n-tuples aren't really ordered, but instead,
the attribute names just so happen to be successive ordinal numbers,
which we just so happen to omit for brevity in Usenet postings?  >=)
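
Both readings can be tried out in a few lines of Python. This is only a sketch: the attribute names `"x"` and `"y"` are invented stand-ins for whatever names (or ordinal numbers) one imagines were omitted.

```python
# Reading 1: ordered pairs. Position alone tells (1, 2) and (2, 1) apart.
a = {(1, 2)}
b = {(2, 1)}
union = a | b
print(len(union))

# Reading 2: unordered tuples written as sets of (attribute, value) pairs.
# The attribute names "x" and "y" are assumptions made for this example.
named_a = {frozenset({("x", 1), ("y", 2)})}
named_b = {frozenset({("x", 2), ("y", 1)})}
named_union = named_a | named_b
print(len(named_union))

# With names attached, the order in which the pairs are listed is irrelevant.
same_tuple = frozenset({("y", 2), ("x", 1)}) == frozenset({("x", 1), ("y", 2)})
print(same_tuple)
```

Either way the union has two elements. What differs is whether the distinguishing information lives in the position or in an attribute name.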

--
Joe Foster <mailto:jlfoster%40znet.com>  DC8s in Spaace: <http://www.xenu.net/>
WARNING: I cannot be held responsible for the above        They're   coming  to
because  my cats have  apparently  learned to type.        take me away, ha ha!
Bob Badour - 29 Dec 2003 05:25 GMT
> > Mathematical relations do not rely on attribute order. The physical
> > representation of mathematical relations using written symbols on planar
[quoted text clipped - 10 lines]
> The union of those sets is a set with two elements, yes? How
> are we distinguishing between (1,2) and (2,1) if not by order?

The order is implicit in the physical representation.

> Is there some hidden information?

Yes. The semantics of the pairs. Why is the first one { (1,2) } and not {
(2,1) } ?

The order in which we choose to list the dimensions is entirely arbitrary.
Joe Foster - 27 Dec 2003 06:02 GMT
> > A tuple is a set of ordered pairs of the form (attribute, value).
> > Defining an ordering for the attributes would be superfluous.
[quoted text clipped - 11 lines]
> point me to a MATHEMATICS definition that allows for relations that are not
> ordered.  I'm not finding any such definitions.

Yes, a "relation" built from ordered n-tuples would be quite broken!
Consider an ordered 4-tuple, (a, b, c, d), or, { {a}, {a, b}, {a, b, c},
{a, b, c, d} }.  What would happen if any two attributes were *equal*?
A set of ordered pairs, OTOH, wouldn't lose information if any or all of
its ordered pairs collapsed to sets containing just a single element.
Even if {{{a}, {a, a0}}, {{b}, {b, b0}}, {{c}, {c, c0}}, {{d}, {d, d0}}}
collapses to just { {a}, {b}, {c}, {d} } you could still figure out what
was what.  It's a good thing relational tuples aren't ordered n-tuples!
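
The collapse described above can be checked with a short Python sketch. It implements the flat encoding exactly as written in this post, {{a}, {a, b}, {a, b, c}, ...}; the usual set-theoretic construction nests pairs instead of flattening them, so the sketch says something about this particular encoding rather than about ordered tuples in general. The helper names are made up.

```python
def flat_encoding(*values):
    """Encode (v1, ..., vn) as {{v1}, {v1, v2}, ..., {v1, ..., vn}}."""
    return frozenset(frozenset(values[:i]) for i in range(1, len(values) + 1))

# Two different ordered triples with repeated values...
t1 = flat_encoding(1, 1, 2)
t2 = flat_encoding(1, 2, 2)
# ...both collapse to {{1}, {1, 2}} and can no longer be told apart.
print(t1 == t2)

def named_encoding(**attributes):
    """Encode a tuple as a set of (attribute, value) pairs."""
    return frozenset(attributes.items())

# The same values with attribute names attached stay distinct.
n1 = named_encoding(a=1, b=1, c=2)
n2 = named_encoding(a=1, b=2, c=2)
print(n1 == n2)
```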

--
Joe Foster <mailto:jlfoster%40znet.com>  Sign the Check! <http://www.xenu.net/>
WARNING: I cannot be held responsible for the above        They're   coming  to
because  my cats have  apparently  learned to type.        take me away, ha ha!
Marshall Spight - 27 Dec 2003 15:32 GMT
> I've been puzzled by this for quite a while, just figuring that relational
> theorists have this wrong. But the writings seem so sure of this.  I had
[quoted text clipped - 4 lines]
> database theory, indicates what I thought about relations -- a relation is a
> set of ordered tuples -- right?  What am I missing here?

I'll give you my perspective, but I don't know how much it'll help.

The question is, how does one distinguish the attributes in the relation?
There are two choices: numerically/positionally, or by name. That is,
one either has a mapping from 1, 2, ... n to attribute, or one has
a mapping from name1, name2, ... namen to attribute. To me, it's
not all that great a difference. It's just a question of what the application
is.

If one is writing a page of equations, the convenience of using
positional identification is high. One is likely working with a single
relation at a time, and it is relatively simple to keep the ordering
straight in the author's and the reader's head.

If one is part of a team building an enormous software system,
then by-name is the better way to go, because of the mnemonic
value of the names. There are likely a lot of relations with a lot
of attributes.

It doesn't affect the semantics of relations or relational operators;
it just affects how attributes are identified.
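
A brief Python sketch of the two ways of identifying an attribute, using `typing.NamedTuple` so that the same row can be addressed either way. The `Person` type, its field names, and the data are invented for illustration.

```python
from typing import NamedTuple

class Person(NamedTuple):
    person_id: int
    person_name: str

people = {Person(1, "Jill"), Person(2, "Joe")}

def project_by_position(relation, index):
    """Projection where the attribute is picked out by ordinal position."""
    return {row[index] for row in relation}

def project_by_name(relation, name):
    """Projection where the attribute is picked out by name."""
    return {getattr(row, name) for row in relation}

names_by_position = project_by_position(people, 1)
names_by_name = project_by_name(people, "person_name")
print(names_by_position == names_by_name)
```

The projection gives the same result either way. The by-name version keeps working if the fields are later declared in a different order, while the by-position version silently depends on that order.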

Marshall
Dawn M. Wolthuis - 27 Dec 2003 16:21 GMT
<snip>
> It doesn't affect the semantics of relations or relational operators;
> it just affects how attributes are identified.
>
> Marshall

OK.  Then a model in which there are relations where the tuples are ordered
but where values can be retrieved either by name or by position would be
different from a relational model where they are not accessible by position.
Then Date's point is not that the MultiValue model is not made up of
mathematical relations, but that it is a different model based on relations
than what Codd and company call the relational model.

Then we are in agreement.  Now I just need words to make it clear that a
particular model is based on mathematical relations (ORDERED tuples) even
though it is not based on all of the other rules imposed within a theory
that names itself the "relational theory" (even though it is based on not
perceiving the relation's tuples as ordered).

The developers of MultiValue databases (such as IBM) promote them as
relational databases and they are mathematically correct.  These databases
stem from the Nelson-Pick data model, however, and not from the work of Codd.
If any company is permitted to decide what is relational and what is not, it
would be IBM, I would think, so I'll figure they have as much a right as
Date does to define what is and is not relational.

So, it seems to me we have a problem with our vocabulary.  One group of
database theorists have taken a mathematical theory -- that of relations --
and have named their theory after it, even though they do not stick to the
mathematical definition and they extend the mathematics with many other
rules.  They then tell people working with other models that are equally
mathematically relational that they do not conform to relational database
theory.

To try to straighten this out, I have referred to "the relational database
theory" instead as the RDBMS theory or the SQL database theory, but because
those are both implementation-based terms they do not sit well with
"relational database theorists".  Is there a term for the "relational
database theory" that we (at least I) could use to indicate that it is the
relational database theory that does not include all databases that conform
to the mathematical theory of relations?

I want to be able to agree with IBM that both DB2 and U2 are relational
databases (since I DO agree with them) and also agree with Date that the U2
databases are not based on his version of a "relational database theory".
Thanks in advance for any help with this vocabulary issue.  --dawn
Bob Badour - 27 Dec 2003 22:39 GMT
> > I've been puzzled by this for quite a while, just figuring that relational
> > theorists have this wrong. But the writings seem so sure of this.  I had
[quoted text clipped - 13 lines]
> not all that great a difference. It's just a question of what the application
> is.

Physical dependence vs. physical independence is not that big a
difference?!?

> If one is writing a page of equations, the convenience of using
> positional identification is high.

You are confusing an external physical representation with a logical
representation.

> It doesn't affect the semantics of relations or relational operators;
> it just affects how attributes are identified.

Huh? Of course it affects the semantics if positional ordering has meaning!
Dawn M. Wolthuis - 27 Dec 2003 22:46 GMT
Do you agree, Bob, that relational database theory seems to require
constructs that are NOT mathematical relations (because they have no logical
ordering among attributes)?

Do you also agree that some of the data models that are not considered by
relational theorists to be relational are, in this way, actually MORE
relational in that they do require logical constructs (ordered tuples) that
correspond to mathematical relations?

This is one of several issues I'm trying to square away or at least get a
vocabulary that is not so contradictory.  Thanks  --dawn
Lauri Pietarinen - 28 Dec 2003 07:56 GMT
> Do you agree, Bob, that relational database theory seems to require
> constructs that are NOT mathematical relations (because they have no logical
[quoted text clipped - 7 lines]
> This is one of several issues I'm trying to square away or at least get a
> vocabulary that is not so contradictory.  Thanks  --dawn

To clear this very issue Date & Darwen define a relation in a
relational database as a set of sets, each of which is a set of
triplets.  For example, we could have the relation value

Person =
{
  { (person_id, integer, 1), (person_name, string, 'Jill') },
  { (person_id, integer, 2), (person_name, string, 'Joe') }
}

(the second component is there to allow for subtyping)

So, I guess, strictly speaking, you could argue that this is not
really a relation.  However, the distinction is superficial, since
there is a simple mapping from this structure to a mathematical
relation by just mapping each attribute name to an ordinal number,
e.g.

Person_mapping = {(person_id, 1), (person_name, 2)} and simply obtain
the "true" mathematical relation

Person_math =
 { (1, 'Jill'), (2, 'Joe') }

or, if you want to include type information,

Person_math =
 { (1, integer, 'Jill', string), (2, integer, 'Joe', string) }

Given a certain ordering there is a trivial 1:1 mapping from the value
of Person to the value of Person_math.
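
The mapping can be written out as a short Python sketch. The structures mirror the Person example above; the function names `to_math` and `from_math` and the `types` dictionary are assumptions added for illustration and are not taken from Date & Darwen.

```python
# The named relation: a set of tuples, each a set of (name, type, value) triplets.
person = {
    frozenset({("person_id", "integer", 1), ("person_name", "string", "Jill")}),
    frozenset({("person_id", "integer", 2), ("person_name", "string", "Joe")}),
}

# Chosen ordering of attributes: name -> ordinal position.
person_mapping = {"person_id": 1, "person_name": 2}

def to_math(relation, mapping):
    """Turn each set of triplets into an ordered tuple of values."""
    result = set()
    for tup in relation:
        ordered = sorted(tup, key=lambda triplet: mapping[triplet[0]])
        result.add(tuple(value for _, _, value in ordered))
    return result

def from_math(math_relation, mapping, types):
    """Inverse direction: reattach names and types by ordinal position."""
    names = sorted(mapping, key=mapping.get)
    return {
        frozenset((name, types[name], value) for name, value in zip(names, row))
        for row in math_relation
    }

person_math = to_math(person, person_mapping)
types = {"person_id": "integer", "person_name": "string"}
print(person_math == {(1, "Jill"), (2, "Joe")})
print(from_math(person_math, person_mapping, types) == person)
```

Going there and back returns the original value, which is what "1:1" means here. Choosing a different `person_mapping` gives a different but equally valid ordered form.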

Since the ordinal position is immaterial for the user (or it should
be) we can assume any order of attributes, and, hence, just forget
about the whole issue.

The conversion from the "Codd" or "D&D" "relation" to the mathematical
relation is (logically) done by the system.  Note, that all relational
operators (UNION, PROJECTION, RESTRICTION, etc..) are expressed very
simply even with these "relational database relations" and the mapping
to the corresponding mathematical operations is trivial.

The reasons (in my view) why Codd got rid of ordinal positions in his
definition of a relation in a relational database were

1) so as not to burden the user with ordinal positions
(e.g. column number 87, instead of, say, column named 'discount')

2) to not imply that the system actually stores the values in a
certain order
(=the system is free to map the columns to physical storage as it
wishes)

However, there is no easy mapping between SQL-tables and mathematical
relations.
Take the SQL-table

SQL-Person

person_id   person_name
---------   -----------
1           Joe
1           Joe

In order to get a mathematical relation out of this we have to number
the duplicates, or include some hidden values in the original table.
Whichever way this is done, it complicates the mapping.
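
A minimal Python sketch of the duplicate-row problem, modelling the SQL table as a list (which keeps duplicates) and a relation as a set (which cannot). Numbering the duplicates is one of the workarounds mentioned above; the variable names are invented.

```python
from collections import Counter

# The SQL table as a bag of rows: the duplicate is kept.
sql_person = [(1, "Joe"), (1, "Joe")]

# A relation is a set, so converting directly loses the duplicate.
as_relation = set(sql_person)
print(len(sql_person), len(as_relation))

# One workaround: number the duplicates so that every row becomes distinct.
numbered = {
    (row, n)
    for row, count in Counter(sql_person).items()
    for n in range(1, count + 1)
}
print(len(numbered))
```

The extra sequence number is information that was not in the original table, which is the complication being pointed out.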

best regards,
Lauri Pietarinen
Bob Badour - 28 Dec 2003 09:04 GMT
> > Do you agree, Bob, that relational database theory seems to require
> > constructs that are NOT mathematical relations (because they have no logical
[quoted text clipped - 7 lines]
> > This is one of several issues I'm trying to square away or at least get a
> > vocabulary that is not so contradictory.  Thanks  --dawn

Apparently, Dawn does not realise she has been in my twit-filter for months
now.

For anyone who doesn't know, ignore Dawn--she's an idiot. Mathematical
relations do not have any particular attribute order. The physical
representation of mathematical relations using written symbols on planar
surfaces relies on physical order for succinctness.
Dawn M. Wolthuis - 28 Dec 2003 14:53 GMT
<snip>
> For anyone who doesn't know, ignore Dawn--she's an idiot. Mathematical
> relations do not have any particular attribute order. The physical
> representation of mathematical relations using written symbols on planar
> surfaces relies on physical order for succinctness.

Your opinions about relations are welcome, but please refrain from telling
me again what an idiot I am as I know your opinion already.  If you want
others to join you in thinking I'm an idiot, use logic to indicate the error
of my thinking rather than telling people they ought to ignore me.
Thankfully, I can summon up the self-esteem to continue discussing in this
forum.  I wonder how many people you have bullied or offended enough to
leave OR decided this list was not about rational thinking, but rather
personal attacks.

As for mathematical relations, they ARE sets of ORDERED TUPLES, whether on
paper or in concept.  The ordering is important for mapping to the domains,
however, it is the case as one person pointed out that with database
relations as laid out by Codd, there is a simple mapping from those
relations to mathematical relations.  There is still a vocabulary problem
when working with database models that did not stem from Codd and that DO
have actual mathematical relations in their model.  "Relational Database" is
not a useful designation when both DB2 and U2 have this applied to them.
They are based on two very different models.

As someone once said (googling it indicates it was George Box?) "All models
are flawed, but some are useful".  The "relational database model" has pros
and cons.  The Nelson-Pick model has pros and cons.  The XML model (very
similar to the Nelson-Pick model) has pros and cons. One of the advantages
of the relational model is that it has a wealth of documentation spelling it
out as a logical theory.  Almost everything written about the Nelson-Pick
model is directed to the implementations (PICK) rather than the abstracted
logical model.  I'm attempting to provide more of an abstracted model so
that the pros and cons of the model and not just the implementations can be
discussed.  I am guessing that most on this list are big fans of the
relational database model, while I prefer other models.  I think that is
what leads Bob to his false conclusion.

--dawn
Joe \ - 28 Dec 2003 18:30 GMT
> For anyone who doesn't know, ignore Dawn--she's an idiot. Mathematical
> relations do not have any particular attribute order. The physical
> representation of mathematical relations using written symbols on planar
> surfaces relies on physical order for succinctness.

Heh.  If she thinks the relational tuple vs. ordered n-tuple difference
is insurmountable, she ought to try getting a mathematician and a
theoretical physicist to communicate.  They use different definitions
for so many of the same terms that they'd have to invent an entirely
new language to talk about much of anything besides last night's game!

--
Joe Foster <mailto:jlfoster%40znet.com>  Sacrament R2-45 <http://www.xenu.net/>
WARNING: I cannot be held responsible for the above        They're   coming  to
because  my cats have  apparently  learned to type.        take me away, ha ha!
Dawn M. Wolthuis - 28 Dec 2003 19:06 GMT
<snip>
> Heh.  If she thinks the relational tuple vs. ordered n-tuple difference
> is insurmountable, she ought to try getting a mathematician and a
[quoted text clipped - 6 lines]
> WARNING: I cannot be held responsible for the above        They're   coming  to
> because  my cats have  apparently  learned to type.        take me away, ha ha!

Nope -- nothing insurmountable about it.  It is a matter of language.  I
have been told that a particular model does not include relations because
the tuples are ordered. I am also told that the relational model is based on
mathematical relations.  However, the model I'm trying to describe is based
on mathematical relations and is not, by my calculations, at all based on
the relational model as understood by Codd and company.

So, I need some new language in order to communicate this.  This has nothing
to do with insurmountable differences, but I am hopeful someone can help me
come up with a way to state this that is fair to both models (doesn't make
the Nelson-Pick model sound holier just because it is based on mathematical
relations, nor the Codd model sound better because it is based on a
definition of relations that it created which has become the database
industry standard language).  Are you able to understand the question I'm
raising?

Once I understood why relational theorists think that mathematical relations
are not relations, I was able to narrow this down to an issue of vocabulary.
Would it sit OK with relational theorists if I refer to their def of
relation as "unordered relations" or "Codd relations"?  I don't want to call
them database relations because I'll be talking about databases that are
using mathematical (ordered) tuples as well.  I'm sure I can make something
up, but I don't want the language to obscure the information.

--dawn
Marshall Spight - 28 Dec 2003 22:30 GMT
> Nope -- nothing insurmountable about it.  It is a matter of language.

I am sympathetic to your cause, but I am not optimistic about its
chances for success. Still, nothing to lose, eh?

> Would it sit OK with relational theorists if I refer to their def of
> relation as "unordered relations" or "Codd relations"?  I don't want to call
> them database relations because I'll be talking about databases that are
> using mathematical (ordered) tuples as well.  I'm sure I can make something
> up, but I don't want the language to obscure the information.

Since the current issue under discussion is how attributes are
logically identified, something more like "relations with named
attributes" might be useful. Probably that's too long, and "Codd
relations" will have to do, though.

Maybe it is useful to consider this from the standpoint of the tuple?

Marshall
Bob Badour - 29 Dec 2003 05:12 GMT
> > Nope -- nothing insurmountable about it.  It is a matter of language.
>
[quoted text clipped - 13 lines]
>
> Maybe it is useful to consider this from the standpoint of the tuple?

Dawn is an idiot. I suggest you ignore her.

Relations are relations. She is talking about physical representations of
relations and not about relations themselves. She is too stupid to
understand the difference between a thing and its picture.
Paul G. Brown - 28 Dec 2003 20:30 GMT
> For anyone who doesn't know, ignore Dawn--she's an idiot. Mathematical
> relations do not have any particular attribute order. The physical
> representation of mathematical relations using written symbols on planar
> surfaces relies on physical order for succinctness.

Not quite right, Bob.

Mathematical relations *do* have an attribute order[1] (or else the
term 'mathematical relation' is used in another context entirely: to
refer to relationships between maps [can't find a cite]). One of the
ways in which Codd's relational model distinguishes itself is that it
names relation attributes and thereby does away with the need for
ordering. Some interpretations of the relational model retain the
attribute order property (Datalog, for example[2]) or require the use
of an index offset as an attribute identifier[3].

Mind you, multi-value data management systems don't comply even in
spirit with any of these models.

See:

 [1] http://en.wikipedia.org/wiki/Mathematical_relation

 [2] http://www.cs.buffalo.edu/~chomicki/635/datalog-h.pdf

 [3] Abiteboul et al. _Foundations_of_Databases_ Addison-Wesley
Publishing Company.  1995. (Specifically comments on the 'named' versus
'unnamed' perspectives in Section 3.2)
Joe \ - 27 Dec 2003 22:58 GMT
> > It doesn't affect the semantics of relations or relational operators;
> > it just affects how attributes are identified.
>
> Huh? Of course it affects the semantics if positional ordering has meaning!

SELECT * really ought to rearrange the attributes each time it's
used! >=)   However, alignment can still matter in modern system
architectures.  Intel CPUs might still be happiest when floating
point values are 8-byte-aligned, but this is an implementation,
not a logical, detail.

--
Joe Foster <mailto:jlfoster%40znet.com>  DC8s in Spaace: <http://www.xenu.net/>
WARNING: I cannot be held responsible for the above        They're   coming  to
because  my cats have  apparently  learned to type.        take me away, ha ha!
Marshall Spight - 28 Dec 2003 22:24 GMT
> > The question is, how does one distinguish the attributes in the relation?
> > There are two choices: numerically/positionally, or by name. That is,
[quoted text clipped - 5 lines]
> Physical dependence vs. physical independence is not that big a
> difference?!?

I was speaking of logically distinguishing attributes. I don't
see how the physical level is even relevant here.

> > If one is writing a page of equations, the convenience of using
> > positional identification is high.
>
> You are confusing an external physical representation with a logical
> representation.

Interesting distinction, but not one that I can follow without further
information. Do you have a reference for further reading?

> > It doesn't affect the semantics of relations or relational operators;
> > it just affects how attributes are identified.
>
> Huh? Of course it affects the semantics if positional ordering has meaning!

Mumble. Operations like union, intersection, difference, are identical
either way. Join needs some work, but it's not what I'd call a huge issue.
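
As a rough illustration of why join is the operator that "needs some work", here is a Python sketch of both styles. The function names, the sample `emp` and `dept` rows, and the choice to drop the duplicated join column are all assumptions made for the example, not anything taken from a particular system.

```python
def natural_join(r, s):
    """Named attributes: join on whatever attribute names the tuples share."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[k] == u[k] for k in common):
                out.append({**t, **u})
    return out


def positional_join(r, s, i, j):
    """Ordered attributes: the caller must say which positions to equate.

    Position i of each tuple in r is matched against position j of each
    tuple in s; the matched column from s is dropped from the result.
    """
    return [t + u[:j] + u[j + 1:] for t in r for u in s if t[i] == u[j]]


# Hypothetical sample data, once with names and once with positions only.
emp_named = [{"emp": "Joe", "dept": 10}]
dept_named = [{"dept": 10, "city": "Oslo"}]
emp_pos = [("Joe", 10)]
dept_pos = [(10, "Oslo")]

joined_named = natural_join(emp_named, dept_named)
joined_pos = positional_join(emp_pos, dept_pos, 1, 0)
print(joined_named)
print(joined_pos)
```

The named version finds the join attribute itself, while the positional version needs the two indices spelled out; otherwise the results carry the same information.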

Marshall
Bob Badour - 29 Dec 2003 05:11 GMT
> > > The question is, how does one distinguish the attributes in the relation?
> > > There are two choices: numerically/positionally, or by name. That is,
[quoted text clipped - 8 lines]
> I was speaking of logically distinguishing attributes. I don't
> see how the physical level is even relevant here.

You don't see how logically distinguishing attributes by physical position
violates physical independence and confuses logical and physical issues?!?

> > > If one is writing a page of equations, the convenience of using
> > > positional identification is high.
[quoted text clipped - 4 lines]
> Interesting distinction, but not one that I can follow without further
> information. Do you have a reference for further reading?

Um, everything that has ever been written on logical data models and the
relational model in particular. What exactly do you not understand? Do you
understand external vs. internal? Do you understand physical vs. logical?
Actually, you don't have to answer the last question because it is clear you
do not.

> > > It doesn't affect the semantics of relations or relational operators;
> > > it just affects how attributes are identified.
[quoted text clipped - 3 lines]
> Mumble. Operations like union, intersection, difference, are identical
> either way. Join needs some work, but it's not what I'd call a huge issue.

No, they are not identical. Consider the following:

R1 = { { A=1, B=2 } }
and
R2 = { { B=2, A=1 } }

What is R3 = R1 union R2?
What is R4 = R1 intersect R2?
What is R5 = R1 minus R2?

If position matters, the answers are:
R3 = { { A=1, B=2 }, { A=2, B=1 } }
R4 = { }
R5 = { { A=1, B=2 } }

If position does not matter, the answers are:
R3 = { { A=1, B=2 } }
R4 = { { A=1, B=2 } }
R5 = { }
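
The two readings of this example can be checked with a short Python sketch. It assumes a named tuple is modelled as a frozenset of (attribute, value) pairs, and that the positional reading keeps only the values in the order they were written; both modelling choices are assumptions for illustration.

```python
# Named reading: a tuple is a set of (attribute, value) pairs,
# so the order in which the pairs are written is irrelevant.
r1_named = {frozenset([("A", 1), ("B", 2)])}
r2_named = {frozenset([("B", 2), ("A", 1)])}

# Positional reading: keep only the values, in the order written.
r1_pos = {(1, 2)}
r2_pos = {(2, 1)}

# (union, intersection, difference) under each reading.
named = (r1_named | r2_named, r1_named & r2_named, r1_named - r2_named)
positional = (r1_pos | r2_pos, r1_pos & r2_pos, r1_pos - r2_pos)

print(named)
print(positional)
```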
Marshall Spight - 29 Dec 2003 17:20 GMT
> > I was speaking of logically distinguishing attributes. I don't
> > see how the physical level is even relevant here.
>
> You don't see how logically distinguishing attributes by physical position
> violates physical independence and confuses logical and physical issues?!?

Again, I don't see the physical level being discussed here. I don't
see that position or order are necessarily physical; they can be
logical. In this case, they are.

> > > You are confusing an external physical representation with a logical
> > > representation.
[quoted text clipped - 4 lines]
> Um, everything that has ever been written on logical data models and the
> relational model in particular.

Your citation lacks a certain hoped-for specificity.

> What exactly do you not understand? Do you
> understand external vs. internal? Do you understand physical vs. logical?
> Actually, you don't have to answer the last question because it is clear you
> do not.

I don't understand why you believe that order is necessarily physical.

> > > > It doesn't affect the semantics of relations or relational operators;
> > > > it just affects how attributes are identified.
[quoted text clipped - 19 lines]
> R4 = { }
> R5 = { { A=1, B=2 } }

That's not correct. If we are to do an apples-to-apples comparison
of relations with named attributes vs. ordered attributes, the
corresponding ordered relation would be:

R1 = { (1,2) }
R2 = { (1,2) }

(I chose attribute A to map to first position, and attribute B to map to
second position.)

In which case:

R3 = { (1,2) }
R4 = { (1,2) }
R5 = {}

Which, using A -> first, B -> second, exactly corresponds
to your answers for named attributes:

> If position does not matter, the answers are:
> R3 = { { A=1, B=2 } }
> R4 = { { A=1, B=2 } }
> R5 = { }

As I said, it is simply a question of how one identifies the attributes.
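
A small Python sketch of this mapping, assuming a fixed convention A -> first, B -> second. The names `ATTRIBUTE_ORDER`, `to_positional` and `to_named` are made up for the example, and named tuples are modelled as frozensets of (attribute, value) pairs.

```python
ATTRIBUTE_ORDER = ("A", "B")  # assumed convention: A -> first, B -> second


def to_positional(relation):
    """Map each named tuple to an ordered tuple using ATTRIBUTE_ORDER."""
    return {tuple(dict(t)[name] for name in ATTRIBUTE_ORDER) for t in relation}


def to_named(relation):
    """Inverse mapping: reattach the attribute names by position."""
    return {frozenset(zip(ATTRIBUTE_ORDER, t)) for t in relation}


r1 = {frozenset([("A", 1), ("B", 2)])}
r2 = {frozenset([("B", 2), ("A", 1)])}

p1, p2 = to_positional(r1), to_positional(r2)
print(p1 | p2, p1 & p2, p1 - p2)
```

Under this convention both relations become { (1,2) }, and applying the set operators before or after the mapping gives corresponding results.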

Marshall
Bob Badour - 29 Dec 2003 18:51 GMT
> > > I was speaking of logically distinguishing attributes. I don't
> > > see how the physical level is even relevant here.
[quoted text clipped - 3 lines]
>
> Again, I don't see the physical level being discussed here.

If implicit order is not physical, where does it come from?

> I don't
> see that position or order are necessarily physical; they can be
> logical. In this case, they are.

Nothing is more physical than position, i.e. location.

> > > > You are confusing an external physical representation with a logical
> > > > representation.
[quoted text clipped - 6 lines]
>
> Your citation lacks a certain hoped-for specificity.

As does your claimed lack of comprehension.

> > What exactly do you not understand? Do you
> > understand external vs. internal? Do you understand physical vs. logical?
> > Actually, you don't have to answer the last question because it is clear you
> > do not.
>
> I don't understand why you believe that order is necessarily physical.

The only logical orders are conventional collating sequences of domains. All
other order is physical. It implies physical location whether absolute or
relative. If not physical, where does the order come from?

> > > > > It doesn't affect the semantics of relations or relational operators;
> > > > > it just affects how attributes are identified.
[quoted text clipped - 23 lines]
> of relations with named attributes vs. ordered attributes, the
> corresponding ordered relation would be:

If order has meaning, stick with the order I gave you for the operations.

> (I chose attribute A to map to first position, and attribute B to map to
> second position.)

It is not your choice. I already gave you the order.

> > If position does not matter, the answers are:
> > R3 = { { A=1, B=2 } }
> > R4 = { { A=1, B=2 } }
> > R5 = { }
>
> As I said, it is simply a question of how one identifies the attributes.

You argue for implicit meaning encoded in order. I correctly identified the
attributes in my example, and I showed that if order has meaning the results
differ. Order requires an additional step and greater user knowledge to
achieve correct results, which means the operations differ.

Omitting the semantic identifiers for the attributes only confuses matters.
Dawn M. Wolthuis - 30 Dec 2003 04:12 GMT
> > "Bob Badour" <bbadour@golden.net> wrote in message
> news:Ws6dnWXlao4FKnKiRVn-tA@golden.net...
[quoted text clipped - 10 lines]
>
> If implicit order is not physical, where does it come from?

<snip>

Just so it is clear, Bob, since you are filtering out my responses anyway,
Marshall is correct and you are wrong on this one, darlin'

Ordering of values is logically a function (=operator=map=procedure=method)
from (a subset of) the positive integers to a set of values (and likewise
ordering of attributes is a function from the positive integers to a set of
attributes). One can talk about ordering logically if one can talk about
numbers logically (and most of us can!).
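
A minimal Python sketch of ordering as a function, assuming the function is written out as a dict from positions 1..n to attribute names or values. The variable names are hypothetical and nothing here refers to storage.

```python
# An ordering of attributes: a function from positions 1..n to attribute names.
attribute_at = {1: "A", 2: "B"}

# An ordered tuple: a function from the same positions to values.
value_at = {1: 1, 2: 2}

# Composing the two recovers the tuple with named attributes.
named_tuple = {attribute_at[i]: value_at[i] for i in attribute_at}

# A different ordering function gives a different positional form
# of the same named tuple.
other_order = {1: "B", 2: "A"}
other_values = {i: named_tuple[other_order[i]] for i in other_order}

print(named_tuple)
print(other_values)
```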

--dawn
Mike Preece - 09 Jan 2004 03:06 GMT
> > > "Bob Badour" <bbadour@golden.net> wrote in message
>  news:Ws6dnWXlao4FKnKiRVn-tA@golden.net...
[quoted text clipped - 23 lines]
>
> --dawn

Well - that's easy for you to say! ;)

I don't get it - but then, I'm not a mathematician. Make it easy for
me, please. Call me an idiot if you like - or let Bob do it. I'd like
to learn though.

If I'm following this discussion correctly, you're saying the word
"relational", in the generally accepted context of "relational
databases", has a different meaning when used in a strictly
mathematical context.

The difference has something to do with "ordering". I don't
understand. Sorry.

Is it important? and if so, why?

Mike.
Jonathan Leffler - 09 Jan 2004 04:39 GMT
> [...various non-illuminating diatribes omitted...]
> If I'm following this discussion correctly, you're saying the word
[quoted text clipped - 6 lines]
>
> Is it important? and if so, why?

Possibly the simplest thing to do is look at the original paper, which
is available online at:

http://www.acm.org/classics/nov95/toc.html

As I pointed out earlier in one of these threads (possibly even this
one), there is a section in this about the difference between ordered
mathematical relations and unordered 'relationships' used in RDBMS,
and why that is important.  My take on it is that the primary issue is
usability - people have a harder time using numbers to identify
columns than using names.


Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler@earthlink.net, jleffler@us.ibm.com
Guardian of DBD: