Database Forum / General DB Topics / DB Theory / September 2007

Multiple-Attribute Keys and 1NF

JOG - 28 Aug 2007 13:26 GMT
I am still fighting with the theoretical underpinning for 1NF. As
such, any comments would be gratefully accepted. The reason for my
concern is that there /seem/ to be instances where 1NF is insufficient. An
example occurred to me while I was wiring up a dimmer switch (at the
behest of Mrs. JOG, to whom there may only be obedience). Now I don't
know the situation in the US, but in the UK a while back the colour
codes for domestic main circuit wiring changed. Naturally the two
schemes exist in tandem, as exhibited in every house I've had the joy
of doing some DIY in:

Brown -> live.
Red -> live
Blue -> neutral.
Black -> neutral.
Green and yellow -> earth.

The issue with encoding these propositions is that the candidate key
for each proposition may consist of one _or_ two colours. Now I have a
couple of options, none of which seem satisfactory. I could leave
green & yellow as some sort of set-value composite, but obviously this
would affect my querying capabilities, so that's out straight off the
bat. Similarly adding attributes Colour1 and a nullable Colour2 is
simply so hideous it isn't worth consideration. So, I could ungroup to
give me:

Colour    Type
-----------------
Brown     live
Red       live
Black     neutral
Blue      neutral
Green     earth
Yellow    earth
-----------------

But again this is unsatisfactory as I have lost the information that
one wire is green and yellow, but none is brown /and/ red.

I could introduce a surrogate to give me:

Id  Colour   Type
-----------------
1   Brown    live
2   Red      live
3   Black    neutral
4   Blue     neutral
5   Green    earth
5   Yellow   earth
-----------------

But this seems wholly artificial given that all the information I
required for identification was available in the original
propositions, and that did not require some artificial id. A [shudder]
non 1NF variation such as:

Id  Colour   Type
-----------------
1   Brown    live
2   Red      live
3   Black    neutral
4   Blue     neutral
5   Green,   earth
    Yellow
-----------------

is clearly hideous as it denies the fundamental mathematical principle
that one attribute should take one value from one domain,
never mind the fact that it introduces query bias.

I could of course introduce nested relations, but I am uncertain as to
the theoretical consequences of having a nested relation as a key (I
guess it would be fine, if adding seemingly unnecessary complexity to
subsequent queries). But moreover it again seems unintuitive, given
that in this case it would be indicating that the original propositions
contained, as a value for one of their attributes, a further
proposition, and this was not the case.

I am having a crisis of faith with the way 1NF is currently viewed.
Any ideas to solve my dilemma? Am I on my own in being perturbed?

Regards, Jim.
Kevin Kirkpatrick - 28 Aug 2007 13:51 GMT
> I am still fighting with the theoretical underpinning for 1NF. As
> such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 77 lines]
>
> Regards, Jim.

Sounds like there are not "wire colors" with a meaning so much as
"wire patterns", with a meaning:

Wire Pattern | Meaning
Brown | live
Red | live
Black | neutral
Blue | neutral
Green and Yellow | earth
Rainbow | telephone

Wire Pattern | color
Brown | Brown
Red | Red
Black | Black
Blue | Blue
Green and Yellow | Green
Green and Yellow | Yellow
Rainbow | Red
Rainbow | Yellow
Rainbow | Blue
Rainbow | Green

Would this clear up the issue?
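
A minimal sketch of the two relations above, assuming Python's built-in
sqlite3 module. The table and column names (pattern_meaning,
pattern_colour) and the helper functions are hypothetical and are not
part of the original post. It suggests that the one-colour-per-row table
still lets one ask which patterns feature a given colour, without losing
the fact that green and yellow belong together.

```python
import sqlite3

# Data copied from the two tables in the post; all names are assumptions.
MEANINGS = [
    ("Brown", "live"), ("Red", "live"), ("Black", "neutral"),
    ("Blue", "neutral"), ("Green and Yellow", "earth"),
    ("Rainbow", "telephone"),
]
COLOURS = [
    ("Brown", "Brown"), ("Red", "Red"), ("Black", "Black"), ("Blue", "Blue"),
    ("Green and Yellow", "Green"), ("Green and Yellow", "Yellow"),
    ("Rainbow", "Red"), ("Rainbow", "Yellow"),
    ("Rainbow", "Blue"), ("Rainbow", "Green"),
]


def build_wire_db():
    """Create an in-memory database holding both relations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pattern_meaning (
            pattern TEXT PRIMARY KEY,
            meaning TEXT NOT NULL
        );
        CREATE TABLE pattern_colour (
            pattern TEXT NOT NULL REFERENCES pattern_meaning(pattern),
            colour  TEXT NOT NULL,
            PRIMARY KEY (pattern, colour)
        );
    """)
    conn.executemany("INSERT INTO pattern_meaning VALUES (?, ?)", MEANINGS)
    conn.executemany("INSERT INTO pattern_colour VALUES (?, ?)", COLOURS)
    return conn


def meanings_featuring(conn, colour):
    """Return (pattern, meaning) rows for patterns that contain `colour`."""
    rows = conn.execute(
        "SELECT m.pattern, m.meaning "
        "FROM pattern_meaning m "
        "JOIN pattern_colour c ON c.pattern = m.pattern "
        "WHERE c.colour = ? ORDER BY m.pattern",
        (colour,),
    )
    return rows.fetchall()
```

Calling meanings_featuring(build_wire_db(), "Green") returns the rows for
"Green and Yellow" and "Rainbow" only, so no solid green wire is implied.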
David Cressey - 28 Aug 2007 14:35 GMT
> I am still fighting with the theoretical underpinning for 1NF. As
> such, any comments would be gratefully accepted. The reason for my
> concern is that there /seem/ to be instances where 1NF is insufficient.

Insufficient for what?  I wasn't able to infer this from your example.

> An
> example occurred to me while I was wiring up a dimmer switch (at the
[quoted text clipped - 9 lines]
> Black -> neutral.
> Green and yellow -> earth.

In the US,  house current is typically at a nominal 120V,  except for a few
circuits, like stoves that are driven at a nominal 240V.  Nominal 120V can
vary all the way down to 110V.  At some point below that,  "brown out"
begins.

Where the coded meaning of the wires gets to be "interesting" is where you
have an overhead light controlled by a wall switch.  If there are two double
pole switches controlling the same light it gets more interesting.

In general,  the meaning is:

Black  -- live
Red -- live (out of phase with black)
White -- neutral
Green -- ground
bare -- ground.

However,  in many homes, the wire from the appliance to the controlling
switch has been
The stove in my house has a clock/timer on it that is driven by 120V and is
wired with the standard 3 connector wire consisting of a white wire, a black
wire, and a bare wire.

In this case,  the black wire is used to carry (unswitched) power from the
overhead junction box  to the switch.  The companion white wire is used to
carry (switched) power from the switch to the power side of the light
circuit,  which is a black wire.

The above results in a white wire being connected to a black wire.  This
looks "wrong" to a DIY neophyte.  The official code uses bits of colored
tape to indicate such things as "white coded as black",  but that's over my
head.

The electrical wiring in some homes dates back about a century,  before the
wires had colors.  Things get really interesting then.

All of this is a digression from 1NF.   Again,  1NF is insufficient for
what?
JOG - 28 Aug 2007 15:06 GMT
> > I am still fighting with the theoretical underpinning for 1NF. As
> > such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 51 lines]
> The electrical wiring in some homes dates back about a century,  before the
> wires had colors.  Things get really interesting then.

Ah, when men were real men, and wired electrics up with their teeth ;)

> All of this is a digression from 1NF.   Again,  1NF is insufficient for
> what?

Insufficient is probably the wrong word. I'm having trouble finding a
direct 1NF encoding of propositions such as:

Brown -> live.
Red -> live
Blue -> neutral.
Black -> neutral.
Green ^ yellow -> earth.

that doesn't require the addition of a surrogate identifier or the use
of nested relations, neither of which seem to exist in the original
propositions.

It's perhaps a symptom of a concern with propositions where there are two
components that play the same role - such as a friendship relation:
E.g. if Fred and Barney are friends we have to encode them under
different attribute names in RM, whereas they are actually playing
exactly the same role (of "friend"). Using "Friend1" and "Friend2"
doesn't seem wholly elegant. If I took Kevin's approach (which I am
still chewing over) for this latter example I'd end up with:

Id Friend
--------------
1 Fred
1 Barney
2 Wilma
2 Betty

again with the use of some sort of surrogate identifier, to assuage
the fact that a friendship (just like a pattern) is identified by more
than one attribute playing equal roles. This addition seems to be
adding complexity where it did not originally exist. (The caveat to
all this being that historically any misgivings with RM I've had have
turned out to be down to a weakness in my grasp of it). J.
Bob Badour - 28 Aug 2007 17:05 GMT
> I am still fighting with the theoretical underpinning for 1NF. As
> such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 77 lines]
>
> Regards, Jim.

There is one obvious way to represent that in 1NF:

Create a color domain where a single value represents green and yellow,
another value represents green, and a third represents yellow etc. The
domain could even represent thick green/thin yellow as a separate value
from thick yellow/thin green if one chooses.

Regardless whether one creates only the domain or also uses it as a
candidate key for some sort of lookup table, the resulting relation is
simply:

Colour     Type
=======    -------
...

Your ID above is one example of such a domain. However, the domain need
not be numeric or have any external numeric representations. It need
only exist with a distinct value for green and yellow.
JOG - 28 Aug 2007 17:22 GMT
> > I am still fighting with the theoretical underpinning for 1NF. As
> > such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 96 lines]
> not be numeric or have any external numeric representations. It need
> only exist with a distinct value for green and yellow.

Well, practically, the surrogate key is the way that I would go. My
question is rather whether this corresponds naturally to the original
propositions, which don't require a new domain in order to be
manipulated in FOL.  I seem to remember a while back there was a
discussion involving Marshall and a few others considering situations
where a nested relation was /necessary/ (I need to have a dig for it),
and it didn't sit comfortably then.
paul c - 28 Aug 2007 17:37 GMT
> ... I seem to remember a while back there was a
> discussion involving Marshall and a few others considering situations
> where a nested relation was /necessary/ (I need to have a dig for it),
> and it didn't sit comfortably then.
> ...

My old favourite is the relation that shows combinations, say with two
attributes that make a composite key.  It's a bit obscure maybe, but an
example query is "combinations of parts that have ever been shipped".  I
think the only way to answer it *verbatim* is to group {shipment#,
part#} on part# and then project away shipment#.  I seem to recall that
Date uses a "catalog" example involving different sets of candidate keys
for a single relation.
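
A rough sketch of that group-then-project step, in Python. The function
name and the sample shipment data in the note below are made up for
illustration; only the {shipment#, part#} shape comes from the post.

```python
from collections import defaultdict


def part_combinations(shipments):
    """Group (shipment_no, part_no) pairs on part_no, then drop shipment_no.

    Returns the set of distinct part combinations that have ever been
    shipped together; each combination is a frozenset of part numbers.
    """
    grouped = defaultdict(set)
    for shipment_no, part_no in shipments:
        grouped[shipment_no].add(part_no)      # GROUP: one part set per shipment
    # PROJECT away shipment_no: identical part sets collapse to one value.
    return {frozenset(parts) for parts in grouped.values()}
```

With the hypothetical input [(1, "P1"), (1, "P2"), (2, "P1"), (2, "P2"),
(3, "P3")], shipments 1 and 2 collapse into the single combination
{P1, P2}, and {P3} is the only other combination.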

p
Bob Badour - 28 Aug 2007 17:43 GMT
>>>I am still fighting with the theoretical underpinning for 1NF. As
>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 101 lines]
> propositions, which don't require a new domain in order to be
> manipulated in FOL.

You assume a color domain so imagining a different color domain changes
the design without adding anything new.

>   I seem to remember a while back there was a
> discussion involving Marshall and a few others considering situations
> where a nested relation was /necessary/ (I need to have a dig for it),
> and it didn't sit comfortably then.

I would argue that nested relations are never necessary; although, they
are certainly handy at times. I would choose the discipline of base
relations having no nested relations. In fact, the principle of
cautious design suggests--as tools evolve toward nested relations--to
allow them only in derived relations.
JOG - 28 Aug 2007 18:34 GMT
> >>>I am still fighting with the theoretical underpinning for 1NF. As
> >>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 104 lines]
> You assume a color domain so imagining a different color domain changes
> the design without adding anything new.

Okay, you're right - not a new domain, just a different one. If I had
started with domain of all colours C (clearly containing the colour
"grey" given the presence of the u there), I read you as proposing
that it be replaced with a labelled powerset of C. Howwwever, would
Occam's razor not suggest that we should prefer a domain made up of
atomic individuals, as opposed to aliased sets, which will require an
extra step to decompose?

I guess my question is heading towards what is theoretically wrong
about having:

wires = {
{ (Colour, green), (Colour,yellow), (Type, earth) },
{ (Colour, black), (Type, neutral) }
}

as opposed to:

wires = {
{ (Pattern, greenAndYellow), (Type, earth) },
{ (Pattern, solidBlack), (Type, neutral) }
}

patterns = {
{ (Pattern, greenAndYellow), (Contains, green) },
{ (Pattern, greenAndYellow), (Contains, yellow) },
{ (Pattern, solidBlack), (Contains, black) }
}

The first version seems so much simpler, and while it doesn't accord with
the traditional view of 1NF, I am unclear as to how it would harm
manipulation.

>    I seem to remember a while back there was a
>
[quoted text clipped - 7 lines]
> cautious design suggests--as tools evolve toward nested relations--to
> allow them only in derived relations.

I would agree with this.
Bob Badour - 28 Aug 2007 19:41 GMT
>>>>>I am still fighting with the theoretical underpinning for 1NF. As
>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 113 lines]
> atomic individuals, as opposed to aliased sets, which will require an
> extra step to decompose?

I don't recall suggesting anything about sets--just a domain that has a
distinct value that means "green and yellow".
JOG - 28 Aug 2007 19:53 GMT
> >>>>>I am still fighting with the theoretical underpinning for 1NF. As
> >>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 116 lines]
> I don't recall suggesting anything about sets--just a domain that has a
> distinct value that means "green and yellow".

Okay, sure. But then to be able to query for green and yellow
individually one must employ a further relation encoding two more
propositions that state "'Green and yellow' contains 'Green'" and that
"'Green and yellow' contains 'Yellow'" respectively. One then has a
schema with two domains - one for the composites and one for
individual colours (which is what I was inferring when I initially
said a new one was being added).
paul c - 28 Aug 2007 20:05 GMT
...
>> I don't recall suggesting anything about sets--just a domain that has a
>> distinct value that means "green and yellow".
[quoted text clipped - 6 lines]
> individual colours (which is what I was inferring when I initially
> said a new one was being added).

I took Bob B to mean something else, e.g., allowing the colour purple
doesn't require an app to record what primary colours purple is usually
fashioned from (maybe they're green and blue, I forget, but which they
are doesn't change the point).

p
Bob Badour - 28 Aug 2007 20:23 GMT
>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
>>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 124 lines]
> individual colours (which is what I was inferring when I initially
> said a new one was being added).

Assuming one has a need to query for green separately, I suppose one can
define an operator on the domain to that effect. If one invents a
requirement that requires a second domain, then one will need a second
domain regardless.
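
One way to picture "a domain with a distinct value for green and yellow,
plus an operator on it" is the Python sketch below. Everything in it
(ColourCode, features, types_featuring) is a hypothetical name chosen
for illustration; the thread does not prescribe any particular operator.

```python
from enum import Enum


class ColourCode(Enum):
    """A single domain; 'green and yellow' is one value, not a set."""
    BROWN = "brown"
    RED = "red"
    BLUE = "blue"
    BLACK = "black"
    GREEN_AND_YELLOW = "green and yellow"


# Assumed operator data: only needed if one wants to query on simple colours.
_CONSTITUENTS = {
    ColourCode.GREEN_AND_YELLOW: frozenset({"green", "yellow"}),
}


def features(code, colour):
    """Operator on the domain: does this colour code feature `colour`?"""
    return colour in _CONSTITUENTS.get(code, frozenset({code.value}))


# The relation itself stays {Colour, Type} with Colour as the key.
WIRES = {
    ColourCode.BROWN: "live",
    ColourCode.RED: "live",
    ColourCode.BLUE: "neutral",
    ColourCode.BLACK: "neutral",
    ColourCode.GREEN_AND_YELLOW: "earth",
}


def types_featuring(colour):
    """Wire types whose colour code features the given simple colour."""
    return sorted({t for code, t in WIRES.items() if features(code, colour)})
```

Here the relation needs no second table; the ability to ask about green
on its own lives entirely in the features operator, which can be omitted
if that requirement does not exist.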
JOG - 28 Aug 2007 23:43 GMT
> >>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
> >>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 129 lines]
> requirement that requires a second domain, then one will need a second
> domain regardless.

Well that sort of brings us full circle back to my query as to
whether a structure that doesn't require that second domain, such as a
set where elements themselves are pure mathematical relations
containing attribute/value pairs:

Wires = { {(Color, Yellow), (Color, Green), (Type, earth)}, {(Color,
blue), (Type, neutral)} }

has any negative theoretical impacts. I can see immediately that this
would affect WHERE and ON clauses in the algebra, and one would get
more use out of GROUP/UNGROUP statements, but I see nothing
inherently /bad/.

Yet that is.
Bob Badour - 29 Aug 2007 00:12 GMT
>>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
>>>>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 132 lines]
> Well that sort of brings us full circle back to my query as to
> whether a structure that doesn't require that second domain,

Let me be clear: Unless you invent a requirement that requires a second
domain, no second domain is required. If one invents such a requirement,
one is.

> such as a
> set where elements themselves are pure mathematical relations
> containing attribute/value pairs:
>
> Wires = { {(Color, Yellow), (Color, Green), (Type, earth)}, {(Color,
> blue), (Type, neutral)} }

But a set is a second domain. You have 1) colors and 2) sets of colors.
Actually, you have sets of some supertype of color and type?!? Yuck!

> has any negative theoretical impacts. I can see immediately that this
> would affect WHERE and ON clauses in the algebra, and one would get
> more use out of GROUP/UNGROUP statements, but I see nothing
> inherently /bad/.
>
> Yet that is.

Supposing you have a requirement that you must be able to use the green
and the yellow separately and supposing you choose to use a set of
values (i.e. an RVA), you have already identified the problem that when
one ungroups, one loses the information that green and yellow belong as
a pair.

To preserve this information, the dbms would have to have some facility
to generate an artificial identifier for the pair. However, if one
normalized the base relations, one would already have the identifier,
and it would be rather simple to construct the RVA in a derived relation.
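
The point about ungrouping can be sketched in Python as follows. The
function names and the "GY"/"BK" identifiers are assumptions made for
the example; relations are modelled as plain sets of Python tuples.

```python
def ungroup(rva_relation):
    """Flatten {(set_of_colours, type)} into {(colour, type)}."""
    return {(colour, wire_type)
            for colours, wire_type in rva_relation
            for colour in colours}


def group_by_code(code_type, code_colour):
    """Derive the set-valued relation from two normalized base relations.

    code_type:   {(code, type)}
    code_colour: {(code, colour)}
    """
    colours_of = {}
    for code, colour in code_colour:
        colours_of.setdefault(code, set()).add(colour)
    return {(frozenset(colours_of[code]), wire_type)
            for code, wire_type in code_type}


# Two different set-valued relations ...
paired = {(frozenset({"green", "yellow"}), "earth")}
separate = {(frozenset({"green"}), "earth"), (frozenset({"yellow"}), "earth")}

# ... and a normalized design that carries an identifier for the pairing.
code_type = {("GY", "earth"), ("BK", "neutral")}
code_colour = {("GY", "green"), ("GY", "yellow"), ("BK", "black")}
```

ungroup(paired) and ungroup(separate) give the same result, which is the
information loss described above, whereas group_by_code(code_type,
code_colour) rebuilds the set-valued form from the normalized relations
whenever it is wanted.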
JOG - 29 Aug 2007 00:24 GMT
> >>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
> >>>>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 147 lines]
> But a set is a second domain. You have 1) colors and 2) sets of colors.
> Actually, you have sets of some supertype of color and type?!? Yuck!

I'm worrying that you have misinterpreted what I have sketched out
there. I haven't specified any domain sets at all in the  above - it
is just a set of propositions, and as with RM, each element is a
mapping from attribute names onto values, that's all. I have no idea
why you think I have supertypes, etc, in there. (which I agree would
be yuck)

> > has any negative theoretical impacts. I can see immediately that this
> > would affect WHERE and ON clauses in the algebra, and one would get
[quoted text clipped - 13 lines]
> normalized the base relations, one would already have the identifier,
> and it would be rather simple to construct the RVA in a derived relation.

Much food for thought. Thanks for the responses Bob.
Bob Badour - 29 Aug 2007 00:42 GMT
>>>>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
>>>>>>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 154 lines]
> why you think I have supertypes, etc, in there. (which I agree would
> be yuck)

Based on this set: {(Color, Yellow), (Color, Green), (Type, earth)}

That is not a tuple. A tuple would be:

{(Color, {Yellow, Green}), (Type, earth)}

The names must be unique within a tuple.

>>>has any negative theoretical impacts. I can see immediately that this
>>>would affect WHERE and ON clauses in the algebra, and one would get
[quoted text clipped - 15 lines]
>
> Much food for thought. Thanks for the responses Bob.

You are very welcome. Some of this stuff seems obvious to me now, but at
one time it was anything but.
JOG - 29 Aug 2007 01:05 GMT
> >>>>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
> >>>>>>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 156 lines]
>
> Based on this set: {(Color, Yellow), (Color, Green), (Type, earth)}

Still not seeing where you get supertypes from. I just see a mapping
of roles in a proposition to corresponding values.

> That is not a tuple. A tuple would be:
>
> {(Color, {Yellow, Green}), (Type, earth)}

Yes, I realize it is not a db-tuple, because if one relaxes 1NF then
one doesn't have a db-relation at all. That set-valued element still
represents a proposition however, and is in fact a relation in the
true mathematical sense. I find this representation interesting
because a JOIN becomes a union of these elements, and a natural join
is generated by default as one would expect.

> The names must be unique within a tuple.
>
[quoted text clipped - 20 lines]
> You are very welcome. Some of this stuff seems obvious to me now, but at
> one time it was anything but.
JOG - 29 Aug 2007 01:11 GMT
> > >>>>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
> > >>>>>>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 170 lines]
> because a JOIN becomes a union of these elements, and a natural join
> is generated by default as one would expect.

or perhaps I am doing the opposite and keeping 1NF and relaxing the
use of finite partial functions to represent tuples. I find the
definition of 1NF to be a pretty nebulous beast. The "NFNF" mob for
example seem to produce relations with set-values which seem entirely
in 1NF to me.

> > The names must be unique within a tuple.
>
[quoted text clipped - 20 lines]
> > You are very welcome. Some of this stuff seems obvious to me now, but at
> > one time it was anything but.
David Cressey - 29 Aug 2007 06:23 GMT
> or perhaps I am doing the opposite and keeping 1NF and relaxing the
> use of finite partial functions to represent tuples. I find the
> definition of 1NF to be a pretty nebulous beast. The "NFNF" mob for
> example seem to produce relations with set-values which seem entirely
> in 1NF to me.

If you model your data using relations,  and if you accept the proposition
that all relations are inherently in 1NF,  then the definition of 1NF
becomes moot, for your purposes.  Maybe that's why it's so nebulous.

I still work with the older definition of 1NF,  and I model my data into SQL
tables rather than relations.  Given this starting place,  the question "is
the table under discussion in 1NF or not"  is still a relevant one,  and it
has a clear answer.  Nothing nebulous about it.

(At the conceptual level, I don't model my data into anything but
attributes,  with associated entities and relationships.  That's a different
discussion, and 1NF need not enter that discussion).
Jan Hidders - 29 Aug 2007 06:39 GMT
> > >>>>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
> > >>>>>>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 166 lines]
> Yes, I realize it is not a db-tuple, because if one relaxes 1NF then
> one doesn't have a db-relation at all.

That is irrelevant. In a NFNF setting tuples are still defined as a
certain kind of function, and what you gave is not a function.

> That set-valued element still
> represents a proposition however, and is in fact a relation in the
> true mathematical sense. I find this representation interesting
> because a JOIN becomes a union of these elements, and a natural join
> is generated by default as one would expect.

That depends a little on what one would expect. ;-) One elegant
definition of the natural join of two relations R and S is for example
{ t1 + t2 | t1 in R, t2 in S, t1 + t2 is a tuple }. If you change the
definition of tuple as you propose this doesn't work anymore. Of course
it is not hard to come up with a definition that does generalize the
natural join correctly.

-- Jan Hidders
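
A small Python sketch of that definition, under the assumption that a
tuple is modelled as a frozenset of (name, value) pairs and that
"t1 + t2" is set union. The helper names are hypothetical.

```python
def is_tuple(pairs):
    """A set of (name, value) pairs is a tuple iff it is a function,
    i.e. no attribute name appears with two different values."""
    names = [name for name, _ in pairs]
    return len(names) == len(set(names))


def natural_join(r, s):
    """{ t1 + t2 | t1 in R, t2 in S, t1 + t2 is a tuple }"""
    return {t1 | t2 for t1 in r for t2 in s if is_tuple(t1 | t2)}


wires = {
    frozenset({("Pattern", "greenAndYellow"), ("Type", "earth")}),
    frozenset({("Pattern", "solidBlack"), ("Type", "neutral")}),
}
patterns = {
    frozenset({("Pattern", "greenAndYellow"), ("Contains", "green")}),
    frozenset({("Pattern", "greenAndYellow"), ("Contains", "yellow")}),
    frozenset({("Pattern", "solidBlack"), ("Contains", "black")}),
}
joined = natural_join(wires, patterns)

# An element with a repeated attribute name, as in the proposed relaxation.
multi = frozenset({("Colour", "green"), ("Colour", "yellow"), ("Type", "earth")})
```

Tuples that disagree on Pattern produce a union with two Pattern pairs
and are filtered out, so joined holds three tuples. The element multi is
itself rejected by is_tuple, so under this definition a relation made of
such elements joins to nothing, even with itself; a generalized join
would need a different test.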
JOG - 29 Aug 2007 12:05 GMT
> > > >>>>>>>>>>>I am still fighting with the theoretical underpinning for 1NF. As
> > > >>>>>>>>>>>such, any comments would be gratefully accepted. The reason for my
[quoted text clipped - 169 lines]
> That is irrelevant. In a NFNF setting tuples are still defined as a
> certain kind of function, and what you gave is not a function.

Aye, I am aware I have dropped the correspondence between a tuple and
a finite partial function, and left an unrestricted mathematical
relation in its place. What I really wanted to explore were the
connotations of doing this.

> > That set-valued element still
> > represents a proposition however, and is in fact a relation in the
[quoted text clipped - 6 lines]
> { t1 + t2 | t1 in R, t2 in S, t1 + t2 is a tuple }. If you change the
> definition of tuple as you propose this doesn't work anymore.

Well hey, the definition of a tuple in terms of RM is a bit quirky
anyhow (never mind cross-product). Why let Codd have all the fun ;)

> Of course it is not hard to come up with a definition that does generalize the
> natural join correctly.
>
> -- Jan Hidders
paul c - 29 Aug 2007 17:23 GMT
...
>>>> That is not a tuple. A tuple would be:
>>>> {(Color, {Yellow, Green}), (Type, earth)}
[quoted text clipped - 23 lines]
>> Of course it is not hard to come up with a definition that does generalize the
>> natural join correctly.

I too am interested in seeing it explored even though it might require
discarding some conventional ideas.  How "natural join" might work does
seem a basic question since inferencing is a main attraction of Codd's
model that most people understand fairly readily, just as they dig his
table presentation idea.  Maybe a different kind of group/ungroup and it
would be important to define "insert" as well.  I'm not pre-supposing
that if it is feasible that it wouldn't end up being a way to implement
a physical layer rather than the logical view a user sees.

p
David Cressey - 29 Aug 2007 06:10 GMT
> Okay, sure. But then to be able to query for green and yellow
> individually one must employ a further relation encoding two more
[quoted text clipped - 3 lines]
> individual colours (which is what I was inferring when I initially
> said a new one was being added).

It took me a while to realize that what you meant from your original
description was that
"a green and yellow wire means earth".  I had thought you meant "a green
wire means earth" and "a yellow wire means earth".   Pardon me for being
dense.

Clearly what we have here is not a domain of colors,  but a domain of color
codes,  where a color code contains one or more colors,  and maybe a "thick
or thin" qualifier on each color.

It's not clear to me why you need to be able to query on simple colors, unless
you need to decompose the color coding scheme into its constituent parts for
some reason.

There are a lot of code domains where each code is made up of a set of more
primitive elements.
Perhaps a very relevant one might be "character code".  If I have the
following primitive elements:

B1, B2, B4, B8, B16, B32, B64, B128
(which might be an odd way of labelling bits 0 through 7 of a byte),  I can
think of the character code for 'A' as being B64+B1.  Now I could query on
all the character codes without necessarily having an operator that would
yield "all the codes that include B1".

I think that the colors,  as constituents of color codes, play the same role
as bits, as constituents of character codes.  Do you agree?
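
To make the analogy concrete, here is a short Python sketch. The B1..B128
labels follow the post; the function names are assumptions. Character
codes can be compared and queried as whole values, and the decomposition
into bits is a separate operator that is only needed if someone asks for it.

```python
# Labels B1, B2, B4, ... B128 keyed by their bit weight.
BIT_NAMES = {1 << i: f"B{1 << i}" for i in range(8)}


def constituents(char):
    """Return the set of primitive elements making up a character code."""
    code = ord(char)
    return {name for weight, name in BIT_NAMES.items() if code & weight}


def codes_including(bit_name, chars):
    """The optional operator: all characters whose code includes `bit_name`."""
    return [c for c in chars if bit_name in constituents(c)]
```

For example, constituents("A") is {"B64", "B1"}, matching the B64+B1
reading in the post, and codes_including("B1", "ABCD") picks out the
characters with odd codes, "A" and "C".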
JOG - 29 Aug 2007 11:55 GMT
> > Okay, sure. But then to be able to query for green and yellow
> > individually one must employ a further relation encoding two more
[quoted text clipped - 31 lines]
> I think that the colors,  as constituents of color codes, play the same role
> as bits, as constituents of character codes.  Do you agree?

Yes. I mean no. No, yes. Gnngh ;)

Ok, of course I understand your point - a wire can be viewed as having
a colour code, which itself has constituent parts. But it's just one
interpretation, right? I am still seeing a difference between the
propositions:
* There is a colour-code "yellow and green" that denotes "earth".
* The casing of an earth wire features the colour yellow and the
colour green.

It's just like the difference between the propositions:
* My office is B42
* My office is on floor B, room 42.

There are instances where I may well want to encode as the second
proposition form. And /if/ that were the case (iff), well 1NF is
precluding me from doing this in terms of the wire example.
Bob Badour - 29 Aug 2007 12:49 GMT
>>>Okay, sure. But then to be able to query for green and yellow
>>>individually one must employ a further relation encoding two more
[quoted text clipped - 49 lines]
> proposition form. And /if/ that were the case (iff), well 1NF is
> precluding me from doing this in terms of the wire example.

I disagree. You have already noted that 1NF allows this with exactly 2
relations (or with 1 relation and one or more operations on the color
code domain.)
JOG - 29 Aug 2007 14:16 GMT
> >>"JOG" <j...@cs.nott.ac.uk> wrote in message
>
[quoted text clipped - 57 lines]
> relations (or with 1 relation and one or more operations on the color
> code domain.)

True, I do see that, but it does so by requiring the invention of a
colour-code concept which isn't in the proposition "The casing of an
earth wire features the colour yellow and the colour green".
Brian Selzer - 29 Aug 2007 19:03 GMT
>> >>"JOG" <j...@cs.nott.ac.uk> wrote in message
>>
[quoted text clipped - 72 lines]
> colour-code concept which isn't in the proposition "The casing of an
> earth wire features the colour yellow and the colour green".

You have to consider the entire relation value: what about the propositions
(treating "or" as exclusive, of course):

"The casing of a live wire features the colour brown or the colour red."

"The casing of a neutral wire features the colour blue or the colour black."

Write a predicate for the relation schema that when existentially quantified
and extended yields a set of atomic formulae that implies all three of the
propositions above. I think you'll find that the colour-code concept is in
that predicate.
JOG - 29 Aug 2007 22:21 GMT
> >> >>"JOG" <j...@cs.nott.ac.uk> wrote in message
>
[quoted text clipped - 84 lines]
> propositions above. I think you'll find that the colour-code concept is in
> that predicate.

I agree. I put little stock in set based values so in RM I would go
for the addition of a colour-code foreign key.

But what if we weren't tied to a traditional relational schema and
tweaked the system so it could allow propositions with more than one
role of the same name without decomposing them? As Jan pointed out,
'tuples' would no longer be functions - they would be unrestricted binary
relations (subsets of attribute x values). We could produce a
comparatively simpler and less cluttered schema, predicate in a very
similar manner as before, and with a few simple alterations could have
an equally effective WHERE mechanism. My concern however would be the
consequences to JOIN.
Brian Selzer - 30 Aug 2007 01:41 GMT
>> >> >>"JOG" <j...@cs.nott.ac.uk> wrote in message
>>
[quoted text clipped - 109 lines]
> an equally effective WHERE mechanism. My concern however would be the
> consequences to JOIN.

I'm not sure I understand what you are driving at.  In the example you
provided, it is the combinations of values from a simple domain that have
significance, regardless of whether they're wrapped in a single attribute or
not.  To me it doesn't make sense to have multiple attributes with the same
name--the attribute names correspond to free variables in a predicate: how
could you assign multiple values to the same variable?  But you can
certainly assign a set of values to a variable that expects a set of values,
since a set is a value!  And you can certainly have a predicate with free
variables that range over relations and free variables that range over
individuals--it's just that the predicate is no longer first order.
JOG - 30 Aug 2007 12:27 GMT
> >> "JOG" <j...@cs.nott.ac.uk> wrote in message
>
[quoted text clipped - 120 lines]
> name--the attribute names correspond to free variables in a predicate: how
> could you assign multiple values to the same variable?

Well consider it this way. If I have the propositions:

The person named Jim speaks the language English
The person named Jim speaks the language German
The person named Brian speaks the language English

I have three propositions, and hopefully we'd agree there are two
roles in these propositions: name and speaks_language. So in FOL I
could write these propositions as:
[P1] Name(x, Jim) -> speaks_language(x, English)
[P2] Name(x, Jim) -> speaks_language(x, English)
[P3] Name(x, Brian) -> speaks_language(x, English)

Are we agreed up to there? If so then [P1] ^ [P2] gives us (via
composition):
Name(x, Jim) -> speaks_language(x, English) ^ speaks_language(x,
English)

and we are left with a sentence that has two distinct roles, one of
which appears twice. All of this sort of thinking has been driven by a
distaste for having to add a magic 'header' component to a relation
(probably as a consequence of reading Pascal's work), and the desire
to bring roles back into the equation.

> But you can
> certainly assign a set of values to a variable that expects a set of values,
> since a set is a value!  And you can certainly have a predicate with free
> variables that range over relations and free variables that range over
> individuals--it's just that the predicate is no longer first order.
Bob Badour - 30 Aug 2007 13:46 GMT
>>>>"JOG" <j...@cs.nott.ac.uk> wrote in message
>>
[quoted text clipped - 150 lines]
>>variables that range over relations and free variables that range over
>>individuals--it's just that the predicate is no longer first order.

Where did Germany go?
JOG - 30 Aug 2007 14:30 GMT
> >>"JOG" <j...@cs.nott.ac.uk> wrote in message
>
[quoted text clipped - 156 lines]
>
> Where did Germany go?

Good grief, the perils of cut and paste. It should of course have
been:

-----------------
[P1] Name(x, Jim) -> speaks_language(x, English)
[P2] Name(x, Jim) -> speaks_language(x, German)
[P3] Name(x, Brian) -> speaks_language(x, English)

Are we agreed up to there? If so then [P1] ^ [P2] gives us (via
composition):
Name(x, Jim) -> speaks_language(x, English) ^ speaks_language(x,
German)
-----------------

Was fur ein dummkopf....
David Cressey - 30 Aug 2007 15:26 GMT
> Well consider it this way. If I have the propositions:
>
[quoted text clipped - 19 lines]
> (probably as a consequence of reading pascal's work), and the desire
> to bring roles back into the equation.

Is the subject of speakers and languages contrived?  (I'm at risk of
becoming obsessive on the subject of  "contrived").  If it is,  I'd like to
suggest that we return to a classic contrived example for this newsgroup:
the subject of pizzas and toppings.

You can, if you like extend the topic to pizzas toppings and cheeses.    You
can then go to google groups and look up the history of the discussion of
the subject of pizzas, toppings, and cheeses in this newsgroup  (c.d.t.).

You will find an extensive discussion of the questions that arise when an
attribute value can be a set,  rather than just a single value.  Some of
that discussion makes sense to me.  Some of it is just pure blather.  There
are plenty of inputs from the MV sect of the NFNF religion.

My apologies, JOG, if you were a participant in those discussions.  My
memory fails me.  If not,  the principal value of recapitulating that
example,  rather than covering speakers and languages,  or cows and colors,
or wires and color codes,  is that a lot of the responses are already neatly
collected in google groups.  Those who do not learn from on line discussions
are condemned to repeat them.  (Apologies to Santayana enthusiasts).

> > But you can
> > certainly assign a set of values to a variable that expects a set of values,
> > since a set is a value!  And you can certainly have a predicate with free
> > variables that range over relations and free variables that range over
> > individuals--it's just that the predicate is no longer first order.

See above.
Neo - 30 Aug 2007 23:08 GMT
> I'd like to suggest that we return to
> a classic contrived example for this newsgroup:
> the subject of pizzas and toppings.

The dbd script below models two orders with pizzas, the second with
multiple cheeses and toppings.

(new 'order)
(new 'pizza)
(new 'size)
(new 'crust)
(new 'cheese)
(new 'topping)
(new 'coke 'drink)

(; Create order#1 consisting of a small pizza)
(new 'order#1 'order)
(set (it) item (block (new)
                     (set  pizza instance (it))
                     (set+ (it) size 'small)
                     (set+ (it) crust 'thin)
                     (set+ (it) cheese 'mozzarella)
                     (set+ (it) topping 'veggies)
                     (it)))

(; Create order#2 consisting of a large pizza and 4 drinks)
(new 'order#2 'order)
(set (it) item (block (new)
                     (set  pizza instance (it))
                     (set+ (it) size 'large)
                     (set+ (it) crust 'thick)
                     (set+ (it) cheese 'mozzarella)
                     (set+ (it) cheese 'parmesan)
                     (set+ (it) topping 'sausage)
                     (set+ (it) topping 'pepperoni)
                     (set+ (it) topping 'olive)
                     (it)))
(set+ (it) item coke quantity '4)

(; Get orders with small pizza)
(; Gets order#1)
(and (get order instance *)
    (get * item (and (get pizza instance *)
                     (get * size small))))

(; Get orders with pizza with mozzarella cheese)
(; Gets order#1 and order#2)
(and (get order instance *)
    (get * item (and (get pizza instance *)
                     (get * cheese mozzarella))))

For details, see www.dbfordummies.com/example/ex305.asp
Brian Selzer - 30 Aug 2007 18:41 GMT
>> >> "JOG" <j...@cs.nott.ac.uk> wrote in message
>>
[quoted text clipped - 161 lines]
>
> Are we agreed up to there?

Not exactly.  What you have are the roles Name and Language which appear as
free variables in the predicate Speaks.  A sentence in FOL is a closed
formula, for example,

exists Name exists Language Speaks(Name,Language)

where each quantifier binds a free variable in Speaks.  Supposing that the
domains for Name and Language are,

Names {Jim, Brian, Sue} and
Languages {English, German, French}

respectively, an interpretation of the sentence gives,

Speaks(Jim,English) /\
Speaks(Jim,German) /\
~Speaks(Jim,French) /\
Speaks(Brian,English) /\
~Speaks(Brian,German) /\
~Speaks(Brian,French) /\
~Speaks(Sue,English) /\
~Speaks(Sue,German) /\
~Speaks(Sue,French)

Which under the closed world assumption becomes,

Speaks(Jim,English) /\
Speaks(Jim,German) /\
Speaks(Brian,English)

From this it can be deduced that Jim speaks both English and German, and
that Jim and Brian both speak English.  Under the domain closure assumption,
it can be deduced that Sue does not exist, and that the only languages that
exist are English and German.  Sue can exist and French can exist, but since
neither are referenced, neither does.  It should be noted that just because
Brian exists and German exists doesn't mean that Brian speaks German.  The
truth value for Speaks(Brian,German) was assigned false under the given
interpretation.
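
As a rough illustration of the step above (a sketch under my own assumptions, not a formal treatment), the closed world assumption can be mimicked in Python by storing only the asserted facts and treating every other pair drawn from the two domains as false. The variable names are hypothetical.

```python
from itertools import product

# Domains as given in the post.
names = {"Jim", "Brian", "Sue"}
languages = {"English", "German", "French"}

# The asserted facts: the extension of Speaks.
speaks = {("Jim", "English"), ("Jim", "German"), ("Brian", "English")}

# Closed world assumption: every pair that is not asserted is taken as false.
not_speaks = set(product(names, languages)) - speaks

# Individuals that are actually referenced by some asserted fact.
referenced_names = {name for name, _ in speaks}
referenced_languages = {language for _, language in speaks}

print(("Brian", "German") in not_speaks)  # True
print("Sue" in referenced_names)          # False
print("French" in referenced_languages)   # False
```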

> If so then [P1] ^ [P2] gives us (via
> composition):
[quoted text clipped - 13 lines]
>> variables that range over relations and free variables that range over
>> individuals--it's just that the predicate is no longer first order.
JOG - 30 Aug 2007 19:22 GMT
> >> "JOG" <j...@cs.nott.ac.uk> wrote in message
>
[quoted text clipped - 171 lines]
>
> exists Name exists Language Speaks(Name,Language)

Well that is certainly one possibility, and of course I realise that
it is how Codd prescribed encoding a proposition in his 1969 paper. I
am suggesting that:

Ex has_Name(x, persons_name) -> speaks_language(x, language)

is an equally valid, if not better option. Why? Because we can
explicitly incorporate attribute names (which remember Codd just
bolted on, redefining a mathematical relation in the process), and
secondly the key is clearly expressed (all attributes to the left of
the ->) - there is no need for a magic header.

> where each quantifier binds a free variables in Speaks.  Supposing that the
> domains for Name and Language are,
[quoted text clipped - 13 lines]
> ~Speaks(Sue,German) /\
> ~Speaks(Sue,French)

Aye, but where have those all important attribute names disappeared
to?

> Which under the closed world assumption becomes,
>
[quoted text clipped - 10 lines]
> truth value for Speaks(Brian,German) was assigned false under the given
> interpretation.

I understand this view, but all CWA (imo) should do is tell us that
no /propositions/ exist discussing sue. Surely we want a database
that, if we ask it whether sue exists, responds "not as far as I've
been told" instead of "no definitely not. no sireee. never ever"?

> > If so then [P1] ^ [P2] gives us (via
> > composition):
[quoted text clipped - 13 lines]
> >> variables that range over relations and free variables that range over
> >> individuals--it's just that the predicate is no longer first order.
Bob Badour - 30 Aug 2007 20:14 GMT
>>>>"JOG" <j...@cs.nott.ac.uk> wrote in message
>>
[quoted text clipped - 183 lines]
> secondly the key is clearly expressed (all attributes to the left of
> the ->) - there is no need for a magic header.

How does it express multiple candidate keys?
JOG - 30 Aug 2007 21:33 GMT
> >>"JOG" <j...@cs.nott.ac.uk> wrote in message
>
[quoted text clipped - 189 lines]
>
> How does it express multiple candidate keys?

Bloody good question sir. I hadn't really thought about it - there is
no notion of a key in predicate logic. In fact if one observes
multiple keys you're probably encoding more than one proposition.  I
dinked about Google for a common example and ended up with: {empID,
SSN, city, zip} where empID and SSN are both candidates. In that case
we've actually got:

empID -> SSN ^ city ^ zip
SSN -> empID ^  city ^ zip

Off the top of my head, I'd say record either format and specify in
the set's intension that SSN<-> empID.
Bob Badour - 30 Aug 2007 21:43 GMT
>>>>"JOG" <j...@cs.nott.ac.uk> wrote in message
>>
[quoted text clipped - 202 lines]
> Off the top of my head, I'd say record either format and specify in
> the set's intension that SSN<-> empID.

I was thinking more along the lines of say a schedule relation:

Teacher    Room    Time
Jim    100    4:00pm
Bob    100    3:00pm
Bob    200    4:00pm
Jim    200    3:00pm

It has two candidate keys {Teacher,Time} and {Room,Time}
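
A quick way to sanity-check the two stated keys against the four rows is sketched below in Python. This is only a check on the sample: a key is a constraint on every legal value of the relation, so sample data can refute a key but never establish one. The helper name is_unique is hypothetical.

```python
# The four rows of the schedule relation from the post.
schedule = [
    {"Teacher": "Jim", "Room": 100, "Time": "4:00pm"},
    {"Teacher": "Bob", "Room": 100, "Time": "3:00pm"},
    {"Teacher": "Bob", "Room": 200, "Time": "4:00pm"},
    {"Teacher": "Jim", "Room": 200, "Time": "3:00pm"},
]


def is_unique(rows, attributes):
    """Return True when no two rows agree on all the given attributes."""
    projected = [tuple(row[a] for a in attributes) for row in rows]
    return len(projected) == len(set(projected))


print(is_unique(schedule, ("Teacher", "Time")))  # True
print(is_unique(schedule, ("Room", "Time")))     # True
print(is_unique(schedule, ("Time",)))            # False

# {Teacher, Room} also happens to be unique in this sample, which shows why
# keys have to be declared rather than inferred from the data.
print(is_unique(schedule, ("Teacher", "Room")))  # True
```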
Neo - 30 Aug 2007 22:42 GMT
> Teacher Room    Time
> Jim     100     4:00pm
> Bob     100     3:00pm
> Bob     200     4:00pm
> Jim     200     3:00pm

The dbd script below models the above. In addition, Jim, Bob and Brian teach
in room 300 at 5:00 pm.

(new 'teacher)
(new 'room)

(new 'tuple1 'TeacherRoomTime)
(set+ (it) teacher 'jim)
(set+ (it) room '100)
(set+ (it) time '1600)

(new 'tuple2 'TeacherRoomTime)
(set+ (it) teacher 'bob)
(set+ (it) room '100)
(set+ (it) time '1500)

(new 'tuple3 'TeacherRoomTime)
(set+ (it) teacher 'bob)
(set+ (it) room '200)
(set+ (it) time '1600)

(new 'tuple4 'TeacherRoomTime)
(set+ (it) teacher 'jim)
(set+ (it) room '200)
(set+ (it) time '1500)

(new 'tuple5 'TeacherRoomTime)
(set+ (it) teacher 'jim)
(set+ (it) teacher 'bob)
(set+ (it) teacher 'brian)
(set+ (it) room '300)
(set+ (it) time '1700)

(; Get teachers for room 300 at time 1700)
(; Gets jim, bob and brian)
(get (& (get TeacherRoomTime instance *)
          (get * room 300)
          (get * time 1700))
      teacher *)
JOG - 30 Aug 2007 23:46 GMT
> >>>>"JOG" <j...@cs.nott.ac.uk> wrote in message
>
[quoted text clipped - 212 lines]
>
> It has two candidate keys {Teacher,Time} and {Room,Time}

Ahhh, the good old irreducible tuple, overlapping superkey example.
It's been too long I tell you, too long. A good example. However,
there's a much better argument against my 'left of the -> is the key'
reasoning. I like to call it the 'that makes no sense whatsoever'
complaint. Or the 'that's enough whisky for you' retort. I wrote nay
but a few posts back:

[P1] Name(x, Jim) -> speaks_language(x, English)
[P2] Name(x, Jim) -> speaks_language(x, German)

Name:Jim appears twice as an antecedent. Genius. Hardly a key then. It
is an enigmatic soul who manages to disprove himself in his own
examples :/

I'm currently standing by the fact that it's good to have attribute
names explicitly stated though.
Brian Selzer - 31 Aug 2007 00:25 GMT
[big snip]
>> > Are we agreed up to there?
>>
[quoted text clipped - 16 lines]
> secondly the key is clearly expressed (all attributes to the left of
> the ->) - there is no need for a magic header.

I won't go here, since you've already realized that Name isn't a key.  It is
true that Name multidetermines Language, however.

>> where each quantifier binds a free variables in Speaks.  Supposing that
>> the
[quoted text clipped - 17 lines]
> Aye, but where have those all important attribute names disappeared
> to?

They are encoded in the predicate Speaks.  It is not important to know the
exact composition of the predicate Speaks, other than that the only free
variables that appear in it are Name and Language.  The parameterized
notation above is easier to write yet conveys the same information as

(Speaks such that Name := Jim and Language := English) /\
(Speaks such that Name := Jim and Language := German) /\
...etc....

>> Which under the closed world assumption becomes,
>>
[quoted text clipped - 20 lines]
> that, if we ask it whether sue exists, responds "not as far as I've
> been told" instead of "no definitely not. no sireee. never ever"?

But that's more of an open world view, which shades the meaning of every
aspect of the database.  I'm not saying it is wrong, but it should be
understood, then, that the content in the database is limited to what is
known to be true instead of what is actually true.  It also provides a basis
for 3VL, since by accepting the open world interpretation, you are accepting
that there can be facts that are not known.

Also, it's not "never ever," but rather "definitely not in this picture of
reality."

>> > If so then [P1] ^ [P2] gives us (via
>> > composition):
[quoted text clipped - 14 lines]
>> >> variables that range over relations and free variables that range over
>> >> individuals--it's just that the predicate is no longer first order.
Neo - 30 Aug 2007 22:18 GMT
> I have three propositions, and hopefully we'd agree there are two
> roles in these propositions: name and speaks_language. So in FOL I
> could write these propositions as:
> [P1] Name(x, Jim) -> speaks_language(x, English)
> [P2] Name(x, Jim) -> speaks_language(x, German)
> [P3] Name(x, Brian) -> speaks_language(x, English)

In dbd, the above are expressed as:

(new 'speak 'verb)

(new 'english 'language)
(new 'german 'language)

(new 'jim 'person)
(set jim speak english)
(set jim speak german)

(new 'brian 'person)
(set brian speak english)

(; Get persons who speak english)
(; Gets jim and brian)
(get * speak english)

(; Get persons who speak english and german)
(; Gets jim)
(& (get * speak english)
   (get * speak german))
Bob Badour - 30 Aug 2007 01:42 GMT
>>>>>>"JOG" <j...@cs.nott.ac.uk> wrote in message
>>
[quoted text clipped - 97 lines]
> an equally effective WHERE mechanism. My concern however would be the
> consequences to JOIN.

What would you offer in place of the RM's logical identity?
JOG - 30 Aug 2007 12:34 GMT
> >>Write a predicate for the relation schema that when extentially quantified
> >>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 15 lines]
>
> What would you offer in place of the RM's logical identity.

Nothing. I am utterly convinced by Date et al's arguments in favour of
logical identity. (Why would I need to replace it?) I just wanna model
propositions, and they are always identified by their contents.
Bob Badour - 30 Aug 2007 13:44 GMT
>>>>Write a predicate for the relation schema that when extentially quantified
>>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 19 lines]
> logical identity. (Why would I need to replace it?) I just wanna model
> propositions, and they are always identified by their contents.

In: {{(Color: green), (Color: yellow), (Type: earth)}}

What provides logical identity?
JOG - 30 Aug 2007 14:27 GMT
> >>>>Write a predicate for the relation schema that when extentially quantified
> >>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 23 lines]
>
> What provides logical identity?

I may be misunderstanding you, but let me take a stab. The identity of
any set of course lies in its elements (i.e. in this case of a single
proposition, the ordered pairs). Given we know Colors are the
antecedents in the proposition we are modelling, this has to have been
defined in the collectivizing predicate for the whole collection of
rows. We also know therefore there may not exist another set of pairs
containing the same Colors, so we can identify the whole proposition
through examination of just those roles. All works just as per normal
in RM.  Is this what you meant?
Bob Badour - 30 Aug 2007 14:55 GMT
>>>>>>Write a predicate for the relation schema that when extentially quantified
>>>>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 33 lines]
> through examination of just those roles. All works just as per normal
> in RM.  Is this what you meant?

I haven't got a clue what you said. In the RM, every value is uniquely
identifiable by the combination of relation name, attribute name and any
candidate key value. That's logical identity as it was originally
spelled out.

Two values above have the same attribute name.
JOG - 30 Aug 2007 16:41 GMT
> >>>>>>Write a predicate for the relation schema that when extentially quantified
> >>>>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 35 lines]
>
> I haven't got a clue what you said.

I just regurgitated Leibniz identity.

> In the RM, every value is uniquely
> identifiable by the combination of relation name, attribute name and any
> candidate key value. That's logical identity as it was originally
> spelled out.
>
> Two values above have the same attribute name.

Now you've lost me. A "value" is not identifiable by its relation name
and attribute name. This makes no sense to me. Where in predicate
logic does that come from? A value is just a value. It is identifiable
in its own right as being an individual from a domain.

An individual piece of /data/ however (which is perhaps what you mean
by a value) has an identity made up of a combination of an attribute
name and a corresponding value.  One needs both to identify the data
item. A proposition in turn is identifiable by its contents, which is
a set of those data items. Regards, J.
Bob Badour - 30 Aug 2007 17:00 GMT
>>>>>>>>Write a predicate for the relation schema that when extentially quantified
>>>>>>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 49 lines]
> logic does that come from? A value is just a value. It is identifiable
> in its own right as being an individual from a domain.

I misspoke. "Any value represented in a relvar"

> An individual piece of /data/ however (which is perhaps what you mean
> by a value) has an identity made up of a combination of an attribute
> name and a corresponding value.  One needs both to identify the data
> item. A proposition in turn is identifiable by its contents, which is
> a set of those data items. Regards, J.

I repeat: two pieces of data have the same name, Color.
JOG - 30 Aug 2007 17:58 GMT
> >>>>>>>>Write a predicate for the relation schema that when extentially quantified
> >>>>>>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 51 lines]
>
> I mispoke. "Any value represented in a relvar"

Well it is still just a value whether its in a relvar or not - it
needs no extra identity. A database table is just a set of
propositions. A proposition is encoded as a set of attribute-value
pairs. That's it surely?

Any notion of identity is as defined by set theory.

> > An individual piece of /data/ however (which is perhaps what you mean
> > by a value) has an identity made up of a combination of an attribute
[quoted text clipped - 3 lines]
>
> I repeat: two pieces of data have the same name, Color.

Well no - a piece of data doesn't have a 'name' does it? It's just a
combination of attribute and value. The number-7. name-Fred. color-
red. A datum's identity is defined by the /combination/ of these two
parts, and that alone - not by a label, or an alias, or an OID (as I'm
sure you'd agree).

And if two data items share one of these parts (the attribute component),
well so what - they are still identifiably different things.   I'm
scratching my head to see the problem you are envisaging here.
Bob Badour - 30 Aug 2007 18:33 GMT
>>>>>>>>>>Write a predicate for the relation schema that when extentially quantified
>>>>>>>>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 72 lines]
> parts, and that alone - not by a label, or an alias, or an OID (as I'm
> sure you'd agree).

No, I don't agree. I suggest you see the definition of Logical Identity
in Codd's 12 rules.
JOG - 30 Aug 2007 19:00 GMT
> >>>>>>>>>>Write a predicate for the relation schema that when extentially quantified
> >>>>>>>>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 75 lines]
> No, I don't agree. I suggest you see the definition of Logical Identity
> in Codd's 12 rules.

Well, I have to contest again - you are no doubt referring to "rule
2: The guaranteed access rule", and that makes no reference to the term
identity (...and that is what you asked me about.) Rule 2 states:
"every individual value in the database must be logically addressable
by specifying the name of the table, the name of the column and the
primary key value of the containing row."

Logically "addressable" - that's a very different kettle of fish to
identity. In your original question did you mean to ask then: "What
provides logical addressability?" if one has two attributes playing
the same role? I won't respond to that in advance, because I don't
want to put words into your mouth. Regards, J.
Bob Badour - 30 Aug 2007 20:12 GMT
>>>>>>>>>>>>Write a predicate for the relation schema that when extentially quantified
>>>>>>>>>>>>and extended yields a set of atomic formulae that implies all three of the
[quoted text clipped - 88 lines]
> the same role? I won't respond to that in advance, because I don't
> want to put words into your mouth. Regards, J.

Yes. If you prefer, "logical addressability". The term I have known for
quite some time is "logical identity", but if you prefer, we can call it
"logical addressability".
JOG - 30 Aug 2007 21:13 GMT
> Yes. If you prefer, "logical addressability".

Identity has a very specific meaning in set theory, hence the
confusion. Addressability makes a lot more sense to me.

> The term I have known for
> quite some time is "logical identity", but if you prefer, we can call it
> "logical addressability".

The simple answer to your question is that allowing multiple
attributes of the same name would break the guaranteed access rule.
(It's probably worth pointing out that it's seen a lot of contention
anyhow.)

To me the point of the rule is to preclude non-logical access via
pointers and OID's, and in that I agree with it wholeheartedly. It was
also aimed at proscribing set based values, and I agree with that too.
Wouldn't change a thing there - and relaxing 1NF doesn't affect that
as far as I can tell.

What negative impact do you envision? An insert is unaffected, as is
delete, and an update still just replaces one proposition with
another. The only real consequence I can see would be to where-clauses
which would require a tweak.
Brian Selzer - 31 Aug 2007 03:13 GMT
>> >>>>>>>>>>Write a predicate for the relation schema that when extentially
>> >>>>>>>>>>quantified
[quoted text clipped - 99 lines]
> by specifying the name of the table, the name of the column and the
> primary key value of the containing row."

Pardon me for being a stickler about this.  I got this from dbdebunk:

"Each and every datum (atomic value) is guaranteed to be logically
accessible by resorting to a combination of table name, primary key value
and column name."

A datum is an /atomic/ value, not an individual value.  Atomic--implying
that it cannot be separated into components.

So having more than one value for a particular role violates the guaranteed
access rule either way you look at it.  If the column names aren't unique,
then you can't access a particular datum by a column name.  If a value is a
collection of component values, then you can't access a particular datum
(component value), but only the collection in which it is contained.

But you're right that accessibility has nothing to do with identity.  A
value can appear many times in many different tuples and in many different
relations.  Logical identity ensures that no matter how many times a value
appears in a database, it always maps to the same individual in the universe
of discourse.
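
The addressing argument can be sketched in Python as follows. This is only an illustration under assumed names: the table "Wiring", the key values "W1" and "W2", and both helper functions are hypothetical. With unique column names an address yields exactly one value, whereas with repeated roles the same address can match several.

```python
# Hypothetical table and key values, for illustration only.
database = {
    "Wiring": {
        "W1": {"Colour": "brown", "Type": "live"},
        "W2": {"Colour": "blue", "Type": "neutral"},
    }
}


def access(db, table, key, column):
    """Address one value by table name, key value and column name."""
    return db[table][key][column]


# With repeated roles a row is a set of pairs, not a mapping.
relaxed_row = {("Colour", "green"), ("Colour", "yellow"), ("Type", "earth")}


def access_relaxed(row, column):
    """Return every value paired with the column name, possibly several."""
    return {value for attribute, value in row if attribute == column}


print(access(database, "Wiring", "W1", "Colour"))        # brown
print(len(access_relaxed(relaxed_row, "Colour")))        # 2
```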

> Logically "addressable" - that's a very different kettle of fish to
> identity. In your original question did you mean to ask then: "What
> provides logical addressability?" if one has two attributes playing
> the same role? I won't respond to that in advance, because I don't
> want to put words into your mouth. Regards, J.
JOG - 31 Aug 2007 11:37 GMT
>[snip]
> "JOG" <j...@cs.nott.ac.uk> wrote in message
[quoted text clipped - 6 lines]
>
> Pardon me for being a stickler about this.  I got this from dbdebunk:

no worries - stickling is fine.

> "Each and every datum (atomic value) is guaranteed to be logically
> accessible by resorting to a combination of table name, primary key value
> and column name."

Coupla things - Date and Darwen argue against the idea of atomicity,
and they also complain about the use of 'primary key'. I also think
Codd's use of the term datum is incorrect. Throughout history data has
required an attribute-value pair. The word is derived from the Latin
for 'statement of fact', and its use in science always requires that the
value is described. It's common sense really - if we don't know what a
value means, well it's just noise. Imagine the binary value 1000001.
Ascii(1000001) makes it an A, Number(1000001) makes it 65, etc.

Either way, this doesn't matter as long as we know what each other
mean.
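
The 1000001 example is easy to check with a couple of lines of Python, shown here purely as an illustration (the variable names are mine): the same bit string reads as 65 or as 'A' depending on the interpretation applied to it.

```python
bits = "1000001"

as_number = int(bits, 2)        # read the bits as an unsigned binary integer
as_character = chr(as_number)   # read that integer as an ASCII code point

print(as_number)     # 65
print(as_character)  # A
```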

> A datum is an /atomic/ value, not an individual value.  Atomic--implying
> that it cannot be separated into components.
[quoted text clipped - 4 lines]
> collection of component values, then you can't access a particular datum
> (component value), but only the collection in which it is contained.

Well I've never suggested multiple values contained in a collection.
But yes as I said, multiple roles does break the guaranteed access
rule. My question now (in the continuing hunt for the theory
behind 1NF) is why on earth would that be a problem? I don't see any
effect on the relational algebra.

> But you're right that accessibility has nothing to do with identity.  A
> value can appear many times in many different tuples and in many different
[quoted text clipped - 7 lines]
> > the same role? I won't respond to that in advance, because I don't
> > want to put words into your mouth. Regards, J.
Brian Selzer - 31 Aug 2007 12:21 GMT
>>[snip]
>> "JOG" <j...@cs.nott.ac.uk> wrote in message
[quoted text clipped - 42 lines]
> behind 1NF)  is why on earth would that be a problem? I don't see any
> effect on the relational algebra.

What about restriction?

R  {{A:4, A:5, B:3},
   {A:3,A:4,B:4}}

R WHERE A = 3?
   Do you return an empty relation, or {{A:3,A:4,B:4}}?
   If A = 3 is true, then A = 4 is also true, but shouldn't that be
   impossible?

   If A were a set, then you could write,
       R WHERE 3 IN A

R WHERE A = 4 AND A = 5?
   Shouldn't A = 4 AND A = 5 always return false?

>> But you're right that accessibility has nothing to do with identity.  A
>> value can appear many times in many different tuples and in many
[quoted text clipped - 10 lines]
>> > the same role? I won't respond to that in advance, because I don't
>> > want to put words into your mouth. Regards, J.
JOG - 31 Aug 2007 12:47 GMT
> >>[snip]
> >> "JOG" <j...@cs.nott.ac.uk> wrote in message
[quoted text clipped - 52 lines]
>     If A = 3 is true, then A = 4 is also true, but shouldn't that be
>     impossible?

Well in my own musings, the latter. I viewed the WHERE clause in the
light of set membership, so one asks whether the tuple contains the
pair (A,3):
i.e. WHERE contains(A, 3)  => { { (A,3), (A,4), (B,4) } }
     WHERE contains(A, 4)  => { { (A,3), (A,4), (B,4) },
                                { (A,4), (A,5), (B,3) } }

But you could also ask how many pairs feature a given role.
i.e. WHERE exists(A, 1) => {}
which asks for only those propositions where there is exactly 1 pair
featuring A as the attribute.

Or generally:
i.e. WHERE exists(Role, x) => { p &epsilon; R | &exist;x (Role, x)
&epsilon; p }

>     If A were a set, then you could write,
>         R WHERE 3 IN A
[quoted text clipped - 16 lines]
> >> > the same role? I won't respond to that in advance, because I don't
> >> > want to put words into your mouth. Regards, J.
JOG - 31 Aug 2007 12:53 GMT
> Or generally:
> i.e. WHERE exists(Role, x) => { p &epsilon; R | &exist;x (Role, x)
> &epsilon; p }

So much for trying to use html char codes. Let's try again:

Or generally:
WHERE exists(Role, 1) => { p E R | EXISTS!x (Role, x) E p }
WHERE exists(Role, n) => { p E R | EXISTS=n x (Role, x) E p }
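
For anyone who wants to play with these two restrictions, here is a minimal Python sketch run against Brian's relation R. It is my own reading of the definitions above, not a reference implementation, and the function names where_contains and where_exists are hypothetical.

```python
# Brian's example relation: each proposition is a set of (role, value) pairs.
R = {
    frozenset({("A", 4), ("A", 5), ("B", 3)}),
    frozenset({("A", 3), ("A", 4), ("B", 4)}),
}


def where_contains(relation, role, value):
    """Keep the propositions that contain the pair (role, value)."""
    return {p for p in relation if (role, value) in p}


def where_exists(relation, role, n):
    """Keep the propositions with exactly n pairs featuring the role."""
    return {p for p in relation if sum(1 for r, _ in p if r == role) == n}


print(len(where_contains(R, "A", 3)))  # 1
print(len(where_contains(R, "A", 4)))  # 2
print(len(where_exists(R, "A", 1)))    # 0
print(len(where_exists(R, "A", 2)))    # 2
```

Note that under this reading contains(A, 4) AND contains(A, 5) is satisfiable, which is exactly the behaviour Brian questions for A = 4 AND A = 5.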
Bob Badour - 31 Aug 2007 13:29 GMT
>>[snip]
>>"JOG" <j...@cs.nott.ac.uk> wrote in message
[quoted text clipped - 40 lines]
> behind 1NF)  is why on earth would that be a problem? I don't see any
> effect on the relational algebra.

You earlier suggested that union would suffice for join. But supposing

{{(Color: green), (Color: yellow), (Type: earth)}
,{(Color: black), (Type: neutral)}}

is valid, then the following is valid too:

{{(Color: green), (Color: yellow), (Color: black)
, (Type: earth), (Type: neutral)}}

Which, of course, is a union of two of your propositions.

How does that not affect the algebra?
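
The objection can be reproduced in a few lines of Python. This is a sketch under assumed names (earth, neutral, merged and join_functional are all hypothetical): with propositions as bare sets of pairs, the union of two propositions is itself a well-formed proposition, whereas with tuples as functions a disagreement on a shared attribute is detectable and the pair simply does not join.

```python
earth = frozenset({("Color", "green"), ("Color", "yellow"), ("Type", "earth")})
neutral = frozenset({("Color", "black"), ("Type", "neutral")})

# Under the relaxed rules nothing marks this union as ill-formed.
merged = earth | neutral
print(len(merged))  # 5


def join_functional(left, right):
    """Union two dict tuples; return None if a shared attribute disagrees."""
    if any(left[a] != right[a] for a in left.keys() & right.keys()):
        return None
    return {**left, **right}


# With functional tuples the conflicting pair is rejected.
print(join_functional({"Color": "green", "Type": "earth"},
                      {"Color": "black", "Type": "neutral"}))  # None
```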
David Cressey - 31 Aug 2007 13:58 GMT
> Well I've never suggested multiple values contained in a collection.
> But yes as I said, multiple roles does break the guaranteed access
> rule. My question is now (in the continuing hunt for the theory
> behind 1NF)  is why on earth would that be a problem? I don't see any
> effect on the relational algebra.

I honestly think that the impetus behind "normalization"  in the Codd 1970
paper is more of a  stopgap than a theory.  (I'm not familiar with the 1969
paper,  and I only read the 1970 paper after I began participating in the
discussions in c.d.t.)  In the 1970 paper,  Codd suggests that it may be
worthwhile to consider the subset of schemas that contain only atomic
attributes.  (He didn't use the word "schemas", but I hope I can use it
without introducing confusion.)

He pointed out that such a restriction did not thereby reduce the
expressiveness of the system,  in that for every unnormalized schema, there
existed an equivalent normalized schema.  "normalized"  in the 1970 paper is
called 1NF in later writings, once further normal forms were discovered.

There is one other piece of the 1NF definition in the 1970 paper,  the "no
duplicates rule".  The no duplicates rule has to do with the representation
of a relation,  and not with a relation itself.  Codd imagined (correctly)
that the first relational database systems would use records to represent
tuples and (virtual) arrays of records to represent relations.  In a
relation, there is no such thing as "a tuple appearing twice".  However, in
an array of records,  there is such a thing as two of the records having
identical contents.  Codd ruled that out as a practical stop gap,  in order
to prevent the implementations from diverging from the properties of
mathematical relations in an unnecessary and harmful way.  This is my
reading of the 1970 paper,  in regard to 1NF theory.

There's a connection between the "atomic values" rule and the "no duplicates
rule",  at the implementation level.

Consider the following fact:

Jack speaks English and German.

Let's say we are about to include this fact in a relation stored somewhere
in a relational database,  and that one of the columns of a relational table
is "set of languages spoken".

Further, let's say that there is already a tuple in the relation with the
following fact stored:

Jack speaks German and English.

As a practical matter, in terms of the representation of data inside a
database,  it can be extraordinarily difficult to ascertain that these two
propositions, together, violate the "no duplicates rule".
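
To make the difficulty concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the "set of languages spoken" column is physically stored as an ordered list; the names row_a, row_b and canonical are hypothetical and do not come from any real DBMS.

```python
# Hypothetical physical records: the "set of languages" column is stored
# as an ordered list, in whatever order the values happened to arrive.
row_a = ("Jack", ["English", "German"])
row_b = ("Jack", ["German", "English"])

# Compared field by field, as records, the two rows look different.
print(row_a == row_b)  # False

def canonical(row):
    """Return the row with its language collection as an order-free frozenset."""
    name, languages = row
    return (name, frozenset(languages))

# Once the collection is interpreted as a set, both rows are the same tuple,
# so a relation (a set of tuples) holds only one of them.
print(canonical(row_a) == canonical(row_b))  # True
relation = {canonical(row_a), canonical(row_b)}
print(len(relation))  # 1
```

The point of the sketch is only that the duplicate check needs a canonical form for the nested collection; with atomic values a plain record comparison is enough.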

Notice that my focus has been entirely on the implementation,  and not on
the relational algebra itself.  With regard to the relational algebra
itself,  I believe your understanding is correct.

So what the heck are implementation oriented issues doing in the 1970 paper?
I believe Codd wanted to get across two main ideas:  that building a system
for relational databases would be a good idea,  and that building such a
system was also feasible.  It's for this second reason that I believe Codd
added some material that is primarily about implementation, rather than
about the power of relational algebra itself.

This is my insight, such as it is.  I hope it helps.
Bob Badour - 31 Aug 2007 14:22 GMT
>>Well I've never suggested multiple values contained in a collection.
>>But yes as I said, multiple roles does break the guaranteed access
[quoted text clipped - 26 lines]
> mathematical relations in an unnecessary and harmful way.  This is my
> reading of the 1970 paper,  in regard to 1NF theory.

I disagree slightly with your interpretation. Codd did not disallow
physical duplication. Duplicates in sets have no meaning, thus:
{ 1, 2, 1 } = { 1, 2 } = { 2, 2, 2, 2, 1 } etc.

At the logical level, duplicates count only once, and the physical
structure conveys no meaning. By divorcing physical structure from
logical interpretation, one enables physical independence. Thus
duplicating information in an index alters the performance
characteristics without changing the meaning of queries etc.
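
As a rough illustration of the point about logical versus physical duplicates, the sketch below uses Python sets. The storage lists and the logical_relation helper are hypothetical stand-ins for a physical layout, not a description of any actual system.

```python
# Duplicates in a set carry no meaning: these are all the same set.
assert {1, 2, 1} == {1, 2} == {2, 2, 2, 2, 1}

# Two hypothetical physical layouts of the same facts. The first repeats a
# record (as an index or a redundant copy might); the second does not.
storage_with_repeat = [("Jack", "English"), ("Jack", "German"), ("Jack", "English")]
storage_tidy = [("Jack", "German"), ("Jack", "English")]

def logical_relation(records):
    """Interpret physical records as a relation, i.e. a set of tuples."""
    return set(records)

# The logical interpretation is identical, whatever the physical repetition.
same = logical_relation(storage_with_repeat) == logical_relation(storage_tidy)
print(same)  # True
```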

> There's a connection between the "atomic values" rule and the "no duplicates
> rule",  at the implementation level.
[quoted text clipped - 28 lines]
>
> This is my insight, such as it is.  I hope it helps.

Again, I disagree slightly. While I do not know Codd's intent other than
the intent expressed in his works, his observation that one can
normalize quite mechanistically provides an implementation for the RVA.

Of course, that won't necessarily protect one from the update anomalies
the higher normal forms address.
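
The mechanical normalization referred to above can be sketched as an "ungroup" step. This is only an illustration in Python under assumed names (ungroup, nested, flat); it is not taken from Codd's paper or from any product.

```python
def ungroup(rows):
    """Flatten (name, set_of_languages) rows into (name, language) tuples."""
    return {(name, language) for name, languages in rows for language in languages}

# A hypothetical relation with a set-valued attribute.
nested = [("Jack", {"English", "German"}), ("Jill", {"French"})]

flat = ungroup(nested)
print(sorted(flat))
# [('Jack', 'English'), ('Jack', 'German'), ('Jill', 'French')]
```

One caveat with this sketch: a row whose set is empty contributes no tuples at all, so that case would need separate handling.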
David Cressey - 31 Aug 2007 19:12 GMT
> >>Well I've never suggested multiple values contained in a collection.
> >>But yes as I said, multiple roles does break the guaranteed access
[quoted text clipped - 36 lines]
> duplicating information in an index alters the performance
> characteristics without changing the meaning of queries etc.

It looks like I was wrong.  I scanned the 1970 paper again and didn't find a
mention of the "No duplicate rule".  I must have confused the 1970 paper
with some of the writings that refer to it.

The case I was mentioning would be more like

{1, 2} = {2, 1}

than the ones you outlined.  The only near reference to this in the 1970
paper is in the discussion of "Project",  where he says that after
eliminating some columns,  duplicates left in the result table must be
eliminated.  I infer from this that he did not intend to allow duplicate
rows to be passed in response to a query, at least in the case of a project.
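
A minimal sketch of projection with duplicate elimination, in Python, using the wiring colours from the start of the thread. The project function and the dict-based row layout are assumptions made for the example.

```python
def project(relation, attributes):
    """Project rows (dicts) onto the named attributes.

    The result is a set, so rows that become identical after the other
    columns are dropped collapse into one.
    """
    return {tuple(row[a] for a in attributes) for row in relation}

wiring = [
    {"colour": "brown", "type": "live"},
    {"colour": "red", "type": "live"},
    {"colour": "blue", "type": "neutral"},
    {"colour": "black", "type": "neutral"},
]

types = project(wiring, ["type"])
print(sorted(types))  # [('live',), ('neutral',)]
```

Four input rows yield two result rows, because dropping the colour column leaves duplicates that the set removes.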

My comments have nothing to do with indexes.

> > There's a connection between the "atomic values" rule and the "no duplicates
> > rule",  at the implementation level.
[quoted text clipped - 32 lines]
> the intent expressed in his works, his observation that one can
> normalize quite mechanistically provides an implementation for the RVA.

Hmm....
> Of course, that won't necessarily protect one from the update anomalies
> the higher normal forms address.

Agreed.
JOG - 31 Aug 2007 14:28 GMT
> > Well I've never suggested multiple values contained in a collection.
> > But yes as I said, multiple roles does break the guaranteed access
[quoted text clipped - 59 lines]
>
> This is my insight, such as it is.  I hope it helps.

I think there is a lot of truth in what you say. Codd was impressive
in his ability to be both theoretical and pragmatic. His success came in
part, I believe, from his social awareness of the climate in which he
was working.
Brian Selzer - 01 Sep 2007 02:31 GMT
>> Well I've never suggested multiple values contained in a collection.
>> But yes as I said, multiple roles does break the guaranteed access
[quoted text clipped - 10 lines]
> attributes.  (He didn't use the word "schemas", but I hope I can use it
> without introducing confusion.)

The whole point of 1NF boils down to the following two sentences from the
June, 1970 article: "The adoption of a relational model of data, as
described above, permits the development of a universal data sublanguage
based on an applied predicate calculus.  A first-order predicate calculus
suffices if the collection of relations is in normal form."  Therefore, the
impetus behind "normalization" is to model the universal data sublanguage
after a first-order logic rather than some higher order logic.

> He pointed out that such a restriction did not thereby reduce the
> expressiveness of the system,  in that for every unnormalized schema,
[quoted text clipped - 54 lines]
>
> This is my insight, such as it is.  I hope it helps.
JOG - 01 Sep 2007 11:21 GMT
> >> Well I've never suggested multiple values contained in a collection.
> >> But yes as I said, multiple roles does break the guaranteed access
[quoted text clipped - 18 lines]
> impetus behind "normalization" is to model the universal data sublanguage
> after a first-order logic rather than some higher order logic.

That's a great quote - probably the best reasoning I've seen for 1NF.
Thanks.

> > He pointed out that such a restriction did not thereby reduce the
> > expressiveness of the system,  in that for every unnormalized schema,
[quoted text clipped - 54 lines]
>
> > This is my insight, such as it is.  I hope it helps.
dawn - 02 Sep 2007 21:36 GMT
> > "David Cressey" <cresse...@verizon.net> wrote in message
>
[quoted text clipped - 25 lines]
> That's a great quote - probably the best reasoning I've seen for 1NF.
> Thanks.

I've suggested before that I can do much simpler things in my code if
the user will put up with fewer features.  Sometimes they go for
that.  There are trade-offs.  That might make the code more reliable,
faster, more maintainable, less costly to write, or this or that, but
users don't just say "sure, whatever is easiest within the code, the
machine, or the theory, that is all I need."

In fact, most of us would not suggest that users roll over and play
dead in their efforts to improve their jobs just because of some
complexity in providing them the features they desire.  Similarly,
users of data models should not just accept first-order logic as the
end of the story.  Those users are working to get a job done, with
requirements to do such things as ripple deletes, ordering data,
blending multiple propositions into one in various ways, handling many-
to-many relationships, etc.

So, even if it were the beginning of the story, the ability to employ
first order logic is not the end of the story on this one, I would
hope. Don't give up.  The industry needs you.  Cheers!  --dawn  (going
back to lurking, not to worry).

<snip>
David Cressey - 01 Sep 2007 13:17 GMT
> >> Well I've never suggested multiple values contained in a collection.
> >> But yes as I said, multiple roles does break the guaranteed access
[quoted text clipped - 18 lines]
> impetus behind "normalization" is to model the universal data sublanguage
> after a first-order logic rather than some higher order logic.

Thank you, Brian.  This clarifies the issue enormously for me.