Database Forum / General DB Topics / DB Theory / March 2004

object algebra

Jean Morissette - 20 Feb 2004 14:35 GMT
Hi,
In query processing, an ODBMS parses an OQL request into a query tree
corresponding to an object algebra.  But, unlike relational algebra, there
is no standard object algebra.  So, could you help me find a good
object algebra, please (for example, a link to university research)?
Thanks
Jean Morissette
Tom Hester - 20 Feb 2004 15:59 GMT
Try a google search on 'object-oriented algebra'.  I did and got 153,000
hits.

> Hi,
> In query processing, an ODBMS parses an OQL request into a query tree
[quoted text clipped - 3 lines]
> Thanks
> Jean Morissette
Christopher Browne - 20 Feb 2004 16:00 GMT
> Hi,
> In query processing, an ODBMS parses an OQL request into a query tree
> corresponding to an object algebra.  But, unlike relational algebra, there
> is no standard object algebra.  So, could you help me find a good
> object algebra, please (for example, a link to university research)?

The only work I am aware of that presents any sort of "object
calculus" is _A Theory of Objects_, by Martin Abadi and Luca Cardelli.

Benjamin Pierce's book on Category Theory is probably also somewhat
relevant.

But neither of these appears to be necessarily relevant to OQL.

The fact that, despite 20-odd years of publishing on "object oriented"
programming, there are only a very few books that try to treat the
relationships between objects in a robust mathematical manner, should
be quite disturbing.

That seems to me to be an even worse state of affairs than the typical
paucity of interest in sound theory amongst those that use databases.
Signature

select 'cbbrowne' || '@' || 'acm.org';
http://cbbrowne.com/info/wp.html
Twice five syllables
Plus seven can't say much but
That's haiku for you.

Alfredo Novoa - 20 Feb 2004 16:35 GMT
>Benjamin Pierce's book on Category Theory is probably also somewhat
>relevant.

I haven't found the relevance.

>The fact that, despite 20-odd years of publishing on "object oriented"
>programming, there are only a very few books that try to treat the
>relationships between objects in a robust mathematical manner, should
>be quite disturbing.

The Relational Model treats the relationships between objects in a
robust mathematical manner.

A DB relation is a relation among objects.

Regards
 Alfredo
Neo - 21 Feb 2004 00:44 GMT
> The Relational Model treats the relationships between objects in a
> robust mathematical manner.

For even a small collection of things, the number of ways in which
they could be related is huge. To deal with the complexity, humans have
devised various methods. Thus far, RDM has been the most popular
because of its robustness and applicability to a large range of
applications. RDM, like all methods, imposes rules on the collection
of things. On the plus side, the rules make some
representations/operations easier and more robust (ie those where data
fits neatly in tables). On the minus side, the rules make other
representations/operations more difficult (ie trees). The deficiencies
of RDM to manage things for some range of applications is why there
are OODBs, XML, MV and XDb. XDb is a partial/experimental
implementation of TDM. TDM is a more general model than RDM. This is
why it can equally manage things arranged as either tables or trees,
has OO characteristics, no NULLs, transitive closure and fits on a
floppy to boot :)
Mikito Harakiri - 21 Feb 2004 01:05 GMT
> and fits on a floppy to boot :)

Why not call it FDb, then?
Marshall Spight - 21 Feb 2004 16:34 GMT
> On the plus side, the rules make some
> representations/operations easier and more robust (ie those where data
> fits neatly in tables). On the minus side, the rules make other
> representations/operations more difficult (ie trees).

Can you quantify this in some way? Or even just describe the
additional difficulty better?

> TDM is a more general model than RDM.

Can you quantify that? Since the RDM is fully general, it
seems a difficult task.

Marshall
Neo - 23 Feb 2004 05:41 GMT
> > TDM is a more general model than RDM.
>
> Since the RDM is fully general, it seems a difficult task.

RDM isn't fully general. If it is, why does it need NULLs?
RDM's fundamental design ensures NULLs will occur in some
applications.

IMO, one simple way to judge which data model may be more generic is
to count the occurrences of NULLs. The model which utilizes the fewest
NULLs is probably more general. The presence of NULLs is usually a red
flag of some type of mismatch (chapter 20 of Date's "Intro to DB
Systems"). TDM has no NULLs. In theory, RDM can eliminate all NULLs,
but requires non-standard techniques that are impractical (ie generic
modelling, where all the data is in one table with one column). In
TDM, the technique to model data that is "rectangularish" or otherwise
is the same and doesn't result in NULLs.

Another more significant way to judge which data model is more
generic, is to analyze degree of closure over basic operations
(intersection, union, negation). Under RDM, closure requires meeting
rather strict criteria (chapter 6) in comparison to criteria for
closure in TDM. NULLs hinder closure which in turn hinders recursion.

According to Date's 6th Ed, "missing information is not fully
understood", "no fully satisfactory solution is known", "incorporation
into model is premature" but "Codd now regards NULLs as an integral
part of RDM".

> Can you quantify that?

If we model various data examples, we should find fewer NULLs with TDM
than with RDM. For example, modelling 10 persons, each with different
properties. Or the following problem:

Allow user to create any hierarchy of things.
Each thing in the hierarchy can be of different type/class.
Each thing can have 1 to many parents in the hierarchy.
For all possible combinations of 2 children in the hierarchy
find the closest ancestor.

A solution to the above using TDM is shown at
www.xdb1.com/Example/Ex075.asp
Note, that although the example shows a hierarchy of similar things
and each thing has exactly 2 parents, the solution works with
hierarchy of different kinds of things with different number of
parents.

If someone can show a solution for the above problem using RDM, the
genericness of the models will become clearer.
Eric Kaun - 23 Feb 2004 12:20 GMT
> > > TDM is a more general model than RDM.
> >
> > Since the RDM is fully general, it seems a difficult task.
>
> RDM isn't fully general. If it is, why does it need NULLs?

It is fully general, and it doesn't need nulls.

> RDM's fundamental design ensures NULLs will occur in some
> applications.

No, it doesn't. Which applications are you talking about?

> IMO, one simple way to judge which data model may be more generic is
> to count the occurrences of NULLs. The model which utilizes the fewest
> NULLs is probably more general.

Not sure whether this has any relevance, but certainly the relational
community agrees that nulls are a bad idea.

> The presence of NULLs is usually a red
> flag of some type of mismatch (chapter 20 of Date's "Intro to DB
> Systems"). TDM has no NULLs. In theory, RDM can eliminate all NULLs,
> but requires non-standard techniques that are impractical (ie generic
> modelling, where all the data is in one table with one column).

So in what way does TDM eliminate nulls in "standard techniques", whatever
that means?

> Another more significant way to judge which data model is more
> generic, is to analyze degree of closure over basic operations
> (intersection, union, negation). Under RDM, closure requires meeting
> rather strict criteria (chapter 6) in comparison to criteria for
> closure in TDM.

What strict criteria? I just finished reading the 8th edition of his book,
and have no idea what you mean. What are the "criteria for closure" in TDM?
I simply thought closure was a result of operations over type T yielding
values of type T.

> NULLs hinder closure which in turn hinders recursion.

Agreed, but relational doesn't allow nulls.

> According to Date's 6th Ed, "missing information is not fully
> understood", "no fully satisfactory solution is known", "incorporation
> into model is premature" but "Codd now regards NULLs as an integral
> part of RDM".

Codd was wrong in that statement, and many relational proponents support the
non-use of nulls.

> If we model various data examples, we should find fewer NULLs with TDM
> than with RDM. For example, modelling 10 persons, each with different
> properties. Or the following problem:

Each with different properties? You're simply talking about an attribute of
type PERSON, and the existence of multiple subtypes of PERSON. What does TDM
allow you to do with those differing properties?

> Allow user to create any hierarchy of things.
> Each thing in the hierarchy can be of different type/class.
> Each thing can have 1 to many parents in the hierarchy.
> For all possible combinations of 2 children in the hierarchy
> find the closest ancestor.

> A solution to the above using TDM is shown at
> www.xdb1.com/Example/Ex075.asp

It's impossible to tell from that page what's going on.

> Note, that although the example shows a hierarchy of similar things
> and each thing has exactly 2 parents, the solution works with
[quoted text clipped - 3 lines]
> If someone can show a solution for the above problem using RDM, the
> genericness of the models will become clearer.

This seems obvious - I don't have time right now (at least until I see a
real explanation of the XDb "example"), but a Thing relation and a
ThingParent relation would allow any number of parents as well. Adequate
domain support would enable the Thing relation to hold any type you like,
including subtypes - and the type could just be ANYTHING. Finding all
possible combinations of 2 children is ludicrously simple.

- erk
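
A minimal sketch of the two-relation design suggested in the post above, using Python's standard sqlite3 module. The table names thing and thing_parent, the kind column (a stand-in for real user-defined type support, which SQLite does not have) and the sample rows are assumptions made for illustration; none of them come from the XDb1 page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thing (
        name TEXT NOT NULL PRIMARY KEY,
        kind TEXT NOT NULL               -- stand-in for a proper attribute type
    );
    CREATE TABLE thing_parent (
        child  TEXT NOT NULL REFERENCES thing(name),
        parent TEXT NOT NULL REFERENCES thing(name),
        PRIMARY KEY (child, parent)      -- one row per parent, so any number of parents
    );
""")
conn.executemany("INSERT INTO thing VALUES (?, ?)", [
    ("god", "thing"), ("army", "force"), ("trinity", "church"),
    ("john", "person"), ("mary", "person"), ("fido", "dog"),
])
conn.executemany("INSERT INTO thing_parent VALUES (?, ?)", [
    ("army", "god"), ("trinity", "god"),
    ("john", "army"), ("mary", "army"), ("mary", "trinity"),
    ("fido", "john"), ("fido", "mary"),
])

# Every unordered pair of things that have at least one parent.
PAIRS_SQL = """
    WITH children AS (SELECT DISTINCT child AS name FROM thing_parent)
    SELECT a.name, b.name
    FROM children AS a JOIN children AS b ON a.name < b.name
    ORDER BY a.name, b.name
"""
child_pairs = conn.execute(PAIRS_SQL).fetchall()

# A thing with more than one parent simply has more than one row.
parents_of_mary = [row[0] for row in conn.execute(
    "SELECT parent FROM thing_parent WHERE child = ? ORDER BY parent", ("mary",))]
```

With this sample data, child_pairs holds the 10 unordered pairs drawn from the five things that have a parent, and mary has two parent rows (army and trinity). No column in either table admits NULL.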
Neo - 23 Feb 2004 22:36 GMT
> > RDM isn't fully general. If it is, why does it need NULLs?
>
> It is fully general, and it doesn't need nulls.

According to Date "Codd now regards NULLs as an integral part of
relational model". Codd is correct that NULLs are required in RDM,
unless all your data fits neatly in rectangular tables which is not
the general case, or unless one resorts to impractical methods such as
generic modelling (ie all data in one table with one column).
Eric Kaun - 24 Feb 2004 13:11 GMT
> > > RDM isn't fully general. If it is, why does it need NULLs?
> >
[quoted text clipped - 5 lines]
> the general case, or unless one resorts to impractical methods such as
> generic modelling (ie all data in one table with one column).

Do you make some distinction between relational and RDM, and if so, what is
RDM?

Tables are not relations, and neither of them are "rectangular." You can
display, for example, a 4-dimensional tesseract (hypercube) as a table if
you like - that doesn't mean it's rectangular.

Generic modeling is a contradiction in terms - all data in one table with
one column models nothing, unless you're discussing meta-modeling of some
sort.

The general case is for data to fit into relations - if you can't do that,
you're being exceptionally sloppy with what you're saying (and failing to
say) about your company's data. You can be sufficiently general through the
attribute types (and subtypes!) that your relations are defined over.

- Eric
Christopher Browne - 24 Feb 2004 19:08 GMT
Oops! "Eric Kaun" <ekaun@yahoo.com> was seen spray-painting on a wall:
> Tables are not relations, and neither of them are "rectangular." You can
> display, for example, a 4-dimensional tesseract (hypercube) as a table if
> you like - that doesn't mean it's rectangular.

You have to be prepared to forgive people at least a little for making
this mistaken assumption.

After all:

- xBase was often described as a "relational database" despite the
  fact that it certainly wasn't;

- I haven't seen much evidence of commercial SQL system vendors
  having produced systems that are convenient to use with data
  that isn't shaped pretty blindly like a "table."

Perhaps a relational data representation _should_ be analogous to Lisp
structures or Prolog facts, and therefore be able to be of pretty much
any shape.  But in the absence of conspicuous implementations of such,
it shouldn't be surprising for people to make the "table" mistake...
Signature

wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org').
http://www3.sympatico.ca/cbbrowne/multiplexor.html

:FATAL ERROR -- YOU ARE OUT OF VECTOR SPACE
Eric Kaun - 24 Feb 2004 19:05 GMT
> Oops! "Eric Kaun" <ekaun@yahoo.com> was seen spray-painting on a wall:
> > Tables are not relations, and neither of them are "rectangular." You can
[quoted text clipped - 18 lines]
> it shouldn't be surprising for people to make the "table" mistake...
> :FATAL ERROR -- YOU ARE OUT OF VECTOR SPACE

True enough - it's just so commonly used as a slam against relational, as if
it's not "multi-dimensional" enough, that it disturbs me when I see it.
Somehow the fact that reality is messy bleeds into assumptions that our code
and/or data have to be messy too, which is just giving up (and professional
malpractice besides).

- Eric
Joe \ - 24 Feb 2004 22:06 GMT
> > Perhaps a relational data representation _should_ be analogous to Lisp
> > structures or Prolog facts, and therefore be able to be of pretty much
[quoted text clipped - 10 lines]
> and/or data have to be messy too, which is just giving up (and professional
> malpractice besides).

If relations are necessarily 2D rectangular tables, then objects
are nothing but mind-numbingly, insanely baroque 1D bit vectors.
Therefore, even a relation containing objects is still a 2D table.

--
Joe Foster <mailto:jlfoster%40znet.com>  DC8s in Spaace: <http://www.xenu.net/>
WARNING: I cannot be held responsible for the above        They're   coming  to
because  my cats have  apparently  learned to type.        take me away, ha ha!
Neo - 25 Feb 2004 04:33 GMT
> True enough - it's just so commonly used as a slam against relational, as if
> it's not "multi-dimensional" enough, that it disturbs me when I see it.
> Somehow the fact that reality is messy bleeds into assumptions that our code
> and/or data have to be messy too, which is just giving up (and professional
> malpractice besides).

Why not prove those people wrong by providing a clean (NULL-less)
solution to the example shown at www.xdb1.com/Example/Ex076.asp ?
Eric Kaun - 26 Feb 2004 18:09 GMT
> > True enough - it's just so commonly used as a slam against relational, as if
> > it's not "multi-dimensional" enough, that it disturbs me when I see it.
[quoted text clipped - 4 lines]
> Why not prove those people wrong by providing a clean (NULL-less)
> solution to the example shown at www.xdb1.com/Example/Ex076.asp ?

Prove what wrong? Exactly what problem are you trying to solve? I get the
silly idea that a set of relations THING, INSTANCE, and RELATIONSHIP with
some fairly generic attributes would do the trick. But again: you've said
nothing about the problem. That page says nothing about the problem or the
domain. It's bizarre, to say the least - have you come across a use for this
in real business problems?

At best, it seems like some sort of thought-sketchpad.
Bob Badour - 26 Feb 2004 18:13 GMT
> > > True enough - it's just so commonly used as a slam against relational,
> as if
[quoted text clipped - 16 lines]
>
> At best, it seems like some sort of thought-sketchpad.

Eric, you are wasting your time. Neo combines stupidity and ignorance with
free association. The most polite thing anyone can say about Neo is he is
random.
Neo - 27 Feb 2004 00:37 GMT
> > > so commonly used as a slam against relational, as if
> > > it's not "multi-dimensional" enough, that it disturbs me when I see it.
[quoted text clipped - 3 lines]
>
> Prove what wrong?

Prove to people like me that RDM is "multi-dimensional" enough.

> Exactly what problem are you trying to solve?

I already have a solution to a specific problem (Ex076) which I
believe begins to exceed RDM's scope. I am asking you to provide an
equivalent (normalized, NULL-less) solution so we can determine if RDM
is "multi-dimensional" enough.

Some persons, like Bob and Alfredo, believe there is only one correct
model for representing things: RDM. I contend that other models are
possible. Each model provides different advantages and disadvantages.
I believe all models (RDM, TDM, etc) are subsets of relational
algebra. I also contend that TDM is closer to relational algebra than
RDM. A more general model(TDM) provides a more complex solution to a
problem that is within the scope of a more specific model(RDM). The
more general model(TDM) gains advantage as the complexity of problems
increases. You and I disagreed on the veracity of the above. I believe
Ex076 begins to exceed RDM's scope. I asked you to represent the same
data (without NULLs and normalized) as shown in Ex076 and generate the
same report (common ancestors). By comparing the solutions, I contend
one will conclude that TDM is more general (but not necessarily the
best for most applications). As of yet, no one has provided an
equivalent representation of the data with RDM. Will you be the first?
Neo - 27 Feb 2004 01:07 GMT
> Exactly what problem are you trying to solve? I get the
> silly idea that a set of relations THING, INSTANCE, and RELATIONSHIP with
> some fairly generic attributes would do the trick.

Yes, your idea of resorting to generic modelling is correct. By using
a few generic tables it is possible, but soon becomes impractical. For
some problems (not very common), you would ultimately have to resort
to just one two-columned table. RDM's methods and related tools don't
work very well with just a single two-columned table. The fact that no
one has presented an equivalent to the simple problem (Ex076) using RDM
may be an indicator of how impractical it is.
Mikito Harakiri - 27 Feb 2004 01:30 GMT
> The fact that no
> one has presented an equivalent to the simple problem (Ex076) using RDM
> may be an indicator of how impractical it is.

How is your problem different from finding the nearest common ancestor in a BOM
(bill of materials) hierarchy? If this hint is not detailed enough for you, may
I suggest that you learn how to represent a tree in a relational database instead
of broadcasting nonsense about a generic table with 2 columns?
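
A minimal sketch of that hint, using Python's standard sqlite3 module and a recursive common table expression. The obeys table, the closest_common_ancestors helper and the definition of "closest" (the shared ancestor with the smallest sum of path lengths to the two nodes) are assumptions for illustration; the rows are the 'obeys' facts from the Ex076 data posted in this thread. Only strict ancestors are considered, and the hierarchy is assumed to be acyclic, since a cycle would make the recursion run forever.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE obeys (
        child  TEXT NOT NULL,
        parent TEXT NOT NULL,
        PRIMARY KEY (child, parent)
    )
""")
conn.executemany("INSERT INTO obeys VALUES (?, ?)", [
    ("army", "god"), ("trinity", "god"),
    ("john", "army"), ("mary", "army"), ("mary", "trinity"), ("luke", "trinity"),
    ("laptop1", "john"), ("laptop1", "mary"),
    ("fido", "john"), ("fido", "mary"), ("fido", "luke"),
])

CLOSEST_SQL = """
WITH RECURSIVE anc(node, ancestor, depth) AS (
    SELECT child, parent, 1 FROM obeys            -- direct parents
    UNION
    SELECT anc.node, obeys.parent, anc.depth + 1  -- climb one more level
    FROM anc JOIN obeys ON obeys.child = anc.ancestor
),
nearest AS (                                      -- shortest path to each ancestor
    SELECT node, ancestor, MIN(depth) AS depth
    FROM anc GROUP BY node, ancestor
)
SELECT a.ancestor, a.depth + b.depth AS dist
FROM nearest AS a JOIN nearest AS b ON a.ancestor = b.ancestor
WHERE a.node = ? AND b.node = ?
ORDER BY dist, a.ancestor
"""

def closest_common_ancestors(x, y):
    """Return every shared ancestor of x and y at the minimum combined distance."""
    rows = conn.execute(CLOSEST_SQL, (x, y)).fetchall()
    if not rows:
        return []
    best = rows[0][1]
    return [name for name, dist in rows if dist == best]
```

For example, closest_common_ancestors("john", "mary") returns ["army"], which agrees with the description of the example quoted elsewhere in the thread, while closest_common_ancestors("fido", "laptop1") returns ["john", "mary"] because both are direct parents of each.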
Bob Badour - 27 Feb 2004 01:41 GMT
> > The fact that no
> > one has presented an equivalent to the simple problem (Ex076) using RDM
[quoted text clipped - 4 lines]
> you learn how to represent a tree in a relational database instead of
> broadcasting nonsense about a generic table with 2 columns?

You may suggest all you want, but Neo has proved impervious to suggestion of
any kind.
Mikito Harakiri - 27 Feb 2004 01:49 GMT
> You may suggest all you want, but Neo has proved impervious to suggestion of
> any kind.

He is quoting Date. That's some progress. Perhaps we can expect him to stop
shameless product promotion?
Bob Badour - 27 Feb 2004 01:53 GMT
> > You may suggest all you want, but Neo has proved impervious to suggestion
> of
> > any kind.
>
> He is quoting Date. That's some progress. Perhaps we can expect him to stop
> shameless product promotion?

You might as well expect the leopard to change his spots.
Neo - 27 Feb 2004 01:45 GMT
> That page says nothing about the problem or the domain.
> At best, it seems like some sort of thought-sketchpad.

"This example represents things in a command hierarchy and generates a
report indicating the closest common commander between all possible
pairs. For example, john and mary obey the army. Army is their closest
common commander. The first figure shows the various types of things
and their instances. The second figure shows the command hierarchy
starting with god."

> have you come across a use for this in real business problems?

A related problem was a SQL-Server based system that forecast the
utility requirements for Intel's semiconductor production facilities
based on various types of equipment throughout their plant fed by a
networks of pipes and conduits (hierarchies with different kinds of
things).
Bob Badour - 24 Feb 2004 21:34 GMT
> Oops! "Eric Kaun" <ekaun@yahoo.com> was seen spray-painting on a wall:
> > Tables are not relations, and neither of them are "rectangular." You can
[quoted text clipped - 3 lines]
> You have to be prepared to forgive people at least a little for making
> this mistaken assumption.

This has been explained so many times to Neo that no forgiveness is
warranted.
Neo - 25 Feb 2004 04:22 GMT
> > Tables are not relations, and neither of them are "rectangular." You can
> > display, for example, a 4-dimensional tesseract (hypercube) as a table if
> > you like - that doesn't mean it's rectangular.
>
> You have to be prepared to forgive people at least a little for making
> this mistaken assumption.

I realize that things can be presented in multiple formats regardless
of how they are actually stored. For example XDb1's GUI presents things
in tables, trees or sentences but the things themselves aren't stored
that way. What I am saying is that the internal format of RDM is
"rectangularish" and while it can also display things as tables, trees
and sentences, it is clumsier at some things like trees compared to
tables. TDM/XDb1's internal structure is more general and therefore
more neutral to either. In fact RDM's "rectangularish" internal
structure (aka relation) is the cause of NULLs. If one doubts that
RDM's "rectangularish" relation makes it clumsy at trees, one might
try to replicate the example shown at www.xdb1.com/Example/Ex076.asp

>  - xBase was often described as a "relational database" despite the
>    fact that it certainly wasn't;

All information systems are relational to some degree.
RDM is simply one of the closest to pure relational algebra.

>  - I haven't seen much evidence of commercial SQL system vendors
>    having produced systems that are convenient to use with data
>    that isn't shaped pretty blindly like a "table."

While XDb1 is only experimental, it does provide a way to manage
things thru tables, trees and sentences.

> Perhaps a relational data representation _should_ be analogous to Lisp
> structures or Prolog facts, and therefore be able to be of pretty much
> any shape.  But in the absence of conspicuous implementations of such,
> it shouldn't be surprising for people to make the "table" mistake...

XDb1 allows things of variable shapes to be entered via table, tree
and english-like sentences. For example, the following sentences
creates the equivalent of a relation with 2 tuples.

person isa thing.
john isa person.
mary isa person.

For more details, see www.xdb1.com/NLI/Default.asp and
www.xdb1.com/Example/Ex001.asp
Eric Kaun - 26 Feb 2004 18:13 GMT
> What I am saying is that the internal format of RDM is
> "rectangularish" and while it can also display things as tables, trees
> and sentences, it is clumsier at some things like trees compared to
> tables.

RDM has no internal format. That's physical implementation. Read about
TransRelational (subtype of Tarin transforms) sometime to see just how
different (and useful!) a clever physical scheme can be.

> TDM/XDb1's internal structure is more general and therefore
> more neutral to either.

More general? Not really. How? What is the structure, anyway?

> While XDb1 is only experimental, it does provide a way to manage
> things thru tables, trees and sentences.

"Managing things" isn't the end goal of relational. Deductive correctness is
(among other things). You can "manage", whatever that means, through an
arbitrary number of mechanisms of whatever shape you like.

> XDb1 allows things of variable shapes to be entered via table, tree
> and english-like sentences. For example, the following sentences
[quoted text clipped - 3 lines]
> john isa person.
> mary isa person.

So how is this better than a relation with 2 tuples?
Neo - 27 Feb 2004 02:16 GMT
> RDM has no internal format.

Per Date, RDM's fundamental representation unit, a relation, consists
of 1) a heading of a fixed set of attributes 2) a body consisting of
tuples (rows), each consisting of values related to the heading. While
the internal format doesn't prevent RDM (or any other model) from
representing tables, trees, sentences, 4-D, etc, the internal format
does have an impact on which structures are easier (tables) and which
are more difficult (trees, Ex076). And in RDM's case, the internal
format makes it impossible to guarantee no NULLs will ever occur,
unlike TDM which can make that guarantee.

> > XDb1 allows things of variable shapes to be entered via table, tree
> > and english-like sentences. For example, the following sentences
[quoted text clipped - 4 lines]
>
> So how is this better than a relation with 2 tuples?

Not much difference, if there are only a few (ultimately 1) tables in
the db (ie T_Thing, T_SubVerbObj), however such an implementation is
impractical in RDM.
Neo - 24 Feb 2004 20:30 GMT
> Do you make some distinction between relational and RDM,
> and if so, what is RDM?

RDM (data expressed as collections of tuples constrained by the header,
etc) is less general than relational [ie ((((ab)cd)e)f)(gh)]. Any time
you add rules to a system, it becomes less general. RDM adds
"rectangularish" rules to relational.

> Tables are not relations, and neither of them are "rectangular."

A relation (table) by definition is "rectangularish". Per Date, a
relation consists of 1) a heading of a fixed set of attributes 2) a
body consisting of tuples (rows), each consisting of values related to
the heading.

You don't see anything "rectangularish" in the above ???

Because the fundamental building block in RDM is "rectangularish",
dealing with missing data, trees, and complex structure is more
difficult and sometimes impractical, although not impossible, since
you have to use more, smaller blocks to get the desired shape or
eliminate NULLs.

> You can display, for example, a 4-dimensional tesseract (hypercube)
> as a table if you like - that doesn't mean it's rectangular.

It is not that you can't represent 4-dimensional data, trees and complex
data structures using the "rectangularish" building block provided by
RDM, it's just more difficult and sometimes impractical.

> The general case is for data to fit into relations

Reality doesn't always provide data that fits "rectangularish"
relations.
And when it doesn't, RDM needs NULLs, as Codd has correctly
recognized.

> You can be sufficiently general through the attribute types
> (and subtypes!) that your relations are defined over.

Because the basic representational block in RDM is "rectangularish",
to represent 10 persons each with different properties without NULLs,
one has to resort to additional tables. The need to "subtype
attributes" in this case is artificial and brought on by NULLs, which
RDM creates, not reality.
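
A minimal sketch of the "additional tables" approach mentioned in the post above, using Python's standard sqlite3 module, so that readers can judge its practicality for themselves. The table names (person, person_age, person_weight, person_color) and the properties_of helper are assumptions for illustration; the three sample facts are taken from the Ex076 data posted in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (name TEXT NOT NULL PRIMARY KEY);

    -- One relation per property; a person appears only where the value is known.
    CREATE TABLE person_age (
        name TEXT NOT NULL PRIMARY KEY REFERENCES person(name),
        age  INTEGER NOT NULL
    );
    CREATE TABLE person_weight (
        name   TEXT NOT NULL PRIMARY KEY REFERENCES person(name),
        weight INTEGER NOT NULL
    );
    CREATE TABLE person_color (
        name  TEXT NOT NULL PRIMARY KEY REFERENCES person(name),
        color TEXT NOT NULL
    );

    INSERT INTO person VALUES ('john'), ('mary'), ('luke');
    INSERT INTO person_age    VALUES ('john', 35);
    INSERT INTO person_weight VALUES ('mary', 130);
    INSERT INTO person_color  VALUES ('luke', 'red');
""")

PROPERTY_TABLES = (
    ("person_age", "age"),
    ("person_weight", "weight"),
    ("person_color", "color"),
)

def properties_of(name):
    """Collect only the properties recorded for one person."""
    props = {}
    for table, column in PROPERTY_TABLES:
        row = conn.execute(
            f"SELECT {column} FROM {table} WHERE name = ?", (name,)
        ).fetchone()
        if row is not None:
            props[column] = row[0]
    return props
```

Here properties_of("john") returns {"age": 35} and properties_of("mary") returns {"weight": 130}. Every column is declared NOT NULL, so a missing property is represented by the absence of a row rather than by a NULL; the cost is one extra relation per property.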
Eric Kaun - 24 Feb 2004 21:49 GMT
[...]
> A relation (table) by definition is "rectangularish". Per Date, a
> relation consists of 1) a heading of a fixed set of attributes 2) a
> body consisting of tuples (rows), each consisting of values related to
> the heading.
>
> You don't see anything "rectangularish" in the above ???

No - and seeing has nothing to do with it. The visual representation, for
display on screen and paper, is "rectangularish." But it's certainly no more
"rectangularish" than the XDb example - that can easily be represented as
relations, and hence as "rectangles."

I guess I don't object to the characterization - just to the implication
that "rectangles" are inferior to... something else.

> Because the fundamental building block in RDM is "rectangularish",
> dealing with missing data, trees, and complex structure is more
> difficult and sometimes impractical, although not impossible, since
> you have to use more, smaller blocks to get the desired shape or
> eliminate NULLs.

I don't agree with its impracticality. When missing values are involved you
find yourself saying things you never intended, and not saying things you
need to say. The meaning of your relations is undermined.

> It is not that you can't represent 4-dimensional data, trees and complex
> data structures using the "rectangularish" building block provided by
> RDM, it's just more difficult and sometimes impractical.

I disagree that it's more difficult. Trees are easy. What other "complex
data structures" does XDb handle better?

> Reality doesn't always provide data that fits "rectangularish"
> relations.
> And when it doesn't, RDM needs NULLs, as Codd has correctly
> recognized.

Codd believed that incorrectly, as Date implies. You can find many writings
on the web concerning the danger of missing values. Look on dbdebunk.com for
starters.

> Because the basic representational block in RDM is "rectangularish",
> to represent 10 persons each with different properties without NULLs,
> one has to resort to additional tables. The need to "subtype
> attributes" in this case is artificial and brought on by NULLs, which
> RDM creates, not reality.

Reality is what it is; we're trying to model and reason about it. That says
precisely nothing about the "need" to somehow correlate the "shape" of our
data with the "shape" of reality. My fingers are cringing just typing this,
it's so silly. We shape things according to our need to reason about them.
Whatever the "shapes" of data in XDb, I have yet to see a coherent
explanation, or to understand how its "shape" makes it easier to reason
about.

Nulls are not created, for example, if you have domains with special values
(e.g. UNKNOWN). Its semantics are then clear, and this avoids many issues
with nulls. With proper domain support, this is trivial.

- erk
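
A minimal sketch of the special-value idea from the post above, using Python's standard sqlite3 module. SQLite has no user-defined domains, so a CHECK constraint stands in for one here; the table names, the eye_color attribute and its list of allowed values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_nullable (
        name      TEXT NOT NULL PRIMARY KEY,
        eye_color TEXT                       -- NULL marks a missing value
    );
    CREATE TABLE person_domain (
        name      TEXT NOT NULL PRIMARY KEY,
        eye_color TEXT NOT NULL              -- 'UNKNOWN' is an ordinary value
            CHECK (eye_color IN ('blue', 'brown', 'green', 'UNKNOWN'))
    );
    INSERT INTO person_nullable VALUES ('john', 'blue'), ('mary', 'brown'), ('luke', NULL);
    INSERT INTO person_domain   VALUES ('john', 'blue'), ('mary', 'brown'), ('luke', 'UNKNOWN');
""")

def names(sql):
    """Run a single-column query and return the values as a list."""
    return [row[0] for row in conn.execute(sql)]

# With NULL, the comparison is neither true nor false, so luke silently drops out.
not_blue_nullable = names(
    "SELECT name FROM person_nullable WHERE eye_color != 'blue' ORDER BY name")

# With an explicit value, the comparison stays two-valued and luke is returned.
not_blue_domain = names(
    "SELECT name FROM person_domain WHERE eye_color != 'blue' ORDER BY name")
```

The first query returns only mary, while the second returns luke and mary. Whether 'UNKNOWN' should count as "not blue" is then a question the application answers explicitly, rather than one settled by three-valued logic.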
Neo - 25 Feb 2004 06:23 GMT
> > It is not that you can't represent 4-dimensional data, trees and complex
> > data structures using the "rectangularish" building block provided by
> > RDM, it's just more difficult and sometimes impractical.
>
> I disagree that it's more difficult. Trees are easy.

Will you implement an equivalent NULL-less solution to the example
posted at www.xdb1.com/Example/Ex076.asp so that we can establish
this?
Marshall Spight - 25 Feb 2004 08:22 GMT
> > > It is not that you can't represent 4-dimensional data, trees and complex
> > > data structures using the "rectangularish" building block provided by
[quoted text clipped - 5 lines]
> posted at www.xdb1.com/Example/Ex076.asp so that we can establish
> this?

Here's my first cut. I reserve the right to change my design later. No
nulls were used. I have to admit your test data made me feel a bit silly.

create table EX76 (subject varchar(80), relator varchar(80), object varchar(80));
insert into EX76 values ('obeys', 'isa', 'relator');
insert into EX76 values ('god', 'isa', 'thing');
insert into EX76 values ('god', 'equals', 'god');
insert into EX76 values ('it', 'is', 'obeys');
insert into EX76 values ('force', 'isa', 'thing');
insert into EX76 values ('army', 'isa', 'force');
insert into EX76 values ('church', 'isa', 'thing');
insert into EX76 values ('trinity', 'isa', 'church');
insert into EX76 values ('person', 'isa', 'thing');
insert into EX76 values ('john', 'isa', 'person');
insert into EX76 values ('mary', 'isa', 'person');
insert into EX76 values ('luke', 'isa', 'person');
insert into EX76 values ('age', 'isa', 'thing');
insert into EX76 values ('35', 'isa', 'age');
insert into EX76 values ('john', 'is', '35');
insert into EX76 values ('weight', 'isa', 'thing');
insert into EX76 values ('130', 'isa', 'weight');
insert into EX76 values ('mary', 'is', '130');
insert into EX76 values ('color', 'isa', 'thing');
insert into EX76 values ('red', 'isa', 'color');
insert into EX76 values ('luke', 'is', 'red');
insert into EX76 values ('dog', 'isa', 'thing');
insert into EX76 values ('fido', 'isa', 'dog');
insert into EX76 values ('computer', 'isa', 'thing');
insert into EX76 values ('laptop1', 'isa', 'computer');
insert into EX76 values ('army', 'obeys', 'god');
insert into EX76 values ('trinity', 'obeys', 'god');
insert into EX76 values ('john', 'obeys', 'army');
insert into EX76 values ('mary', 'obeys', 'army');
insert into EX76 values ('mary', 'obeys', 'trinity');
insert into EX76 values ('luke', 'obeys', 'trinity');
insert into EX76 values ('laptop1', 'obeys', 'john');
insert into EX76 values ('laptop1', 'obeys', 'mary');
insert into EX76 values ('fido', 'obeys', 'john');
insert into EX76 values ('fido', 'obeys', 'mary');
insert into EX76 values ('fido', 'obeys', 'luke');

Marshall
Mikito Harakiri - 25 Feb 2004 17:18 GMT
> > > > It is not that you cant represent 4-th dimensional, trees and complex
> > > > data structures using the "rectangularish" building block provided by
[quoted text clipped - 46 lines]
> insert into EX76 values ('fido', 'obeys', 'mary');
> insert into EX76 values ('fido', 'obeys', 'luke');

And now please write a query that returns an aggregate age of all persons.
In this schema with SQL, or with Neo's school science fair project.
Neo - 26 Feb 2004 01:09 GMT
> And now please write a query that returns an aggregate age of all persons.
> In this schema with SQL, or with Neo's school science fair project.

// Pseudocode to calc age of all unique persons under selected node
// Step thru unique descendants of God
  int ageTotal = 0;
  int* pDesc_a[kMAX_TREE_DEPTH] = {pGod};
  while (int i = T_Relatives(pDesc_a, kCREATURE, kUNIQUE)){
    int* pThing = pDesc_a[i];
    if (IsAncestorOf(pPersonCls, pThing)){
      int* pAge = T_Property_get(pThing, pAgeCls);
      if (pAge){
        sAge = T_Symbol_get(pAge);
        ageTotal = ageTotal + ConvertSymbolToInteger(sAge);
      }
    }
  }

XDb1 has transitive closure.
For actual code to a related problem, see www.xdb1.com/Example/Ex075b.asp
Mikito Harakiri - 26 Feb 2004 01:38 GMT
<FORTRAN code snipped>
> XDb1 has transitive closure.

XDb1 has "amaterish" transitive closure
Neo - 26 Feb 2004 16:02 GMT
> > XDb1 has transitive closure.
>
> XDb1 has "amaterish" transitive closure

Why not show how amaterish it is by providing an alternate solution?
Mikito Harakiri - 26 Feb 2004 17:58 GMT
> > > XDb1 has transitive closure.
> >
> > XDb1 has "amaterish" transitive closure
>
> Why not show how amaterish it is by providing an alternate solution?

FYI "closest common commander" is called "nearest common ancestor" in the
regular literature.

I'm not sure what exactly your challenge is, but finding nearest common
ancestor in a tree is trivial:
1. Select path to the root from node A.
2. Select path to the root from B.
3. Intersect 1 and 2.
4. Find the node most distant from the root in the result.
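
The four steps above can be sketched in Python as follows. This is only a minimal illustration, not anyone's posted solution: it assumes a strict tree stored as a child-to-parent dictionary (the `parent` map and both function names are made up for the example), and it treats a node as its own ancestor.

```python
def path_to_root(parent, node):
    """Return [node, parent(node), ..., root] by following parent links."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def nearest_common_ancestor(parent, a, b):
    """Intersect both root paths and keep the shared node deepest in the tree."""
    path_a = path_to_root(parent, a)          # step 1
    on_path_b = set(path_to_root(parent, b))  # step 2
    # Steps 3 and 4: path_a runs from a up to the root, so the first node
    # that also lies on b's path is the shared node farthest from the root.
    for node in path_a:
        if node in on_path_b:
            return node
    return None  # no shared ancestor: the nodes sit in different trees


# Hypothetical child -> parent map; a strict tree, one parent per node.
parent = {
    "army": "god",
    "trinity": "god",
    "john": "army",
    "luke": "trinity",
    "fido": "john",
}

print(nearest_common_ancestor(parent, "fido", "luke"))
```

Note that the hierarchy in Ex076 is not a strict tree (mary obeys both army and trinity), so this sketch covers only the tree case the four steps describe.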
Neo - 27 Feb 2004 02:39 GMT
> FYI "closest common commander" is called "nearest common ancestor" in the
> regular  literature.

Ancestor is a generic term. For some types of relationships, a more
specific term can be used. Since the relator between things in the
example hierarchy is "obeys" (ie fido obeys luke, luke obeys trinity,
trinity obeys god) the ancestor could be described more specifically
as commander. If one runs the report, it actually says "common
ancestor", since XDb1 has not been programmed to accept or use a more
specific term for ancestor in a specific hierarchy.
Neo - 27 Feb 2004 03:14 GMT
> > > XDb1 has "amaterish" transitive closure
> > Why not show how amaterish it is by providing an alternate solution?
> but finding nearest common ancestor in a tree is trivial:

I can't tell by the provided 4 steps, if the alternate solution is
more or less "amaterish".

> I'm not sure what exactly your challenge is:

1) Model the data shown at www.xdb1.com/Example/Ex076.asp
The data is described quite accurately by sentences like "john isa
person".
The two figures show views of the final data as trees. There isn't
much table-like data, so nothing was displayed in the grid.
1a) You don't have to model the relator "obeys"
1b) All tables must be NULL-less (XDb1's data is NULL-less).
1c) Data must be normalized (Don't normalize down to atomic symbols as
XDb1 does, but normalize things like john, mary, fido, etc).

2) Create a "Nearest Common Ancestor Report" for things in the command
hierarchy. I have copied the report below from the webpage. The order
of things can be different.

Common Ancestor Report for 'god'
ThingX   ThingY   CmnAnc   Dist
army     john     army     1
army     laptop1  army     2
army     fido     army     2
army     mary     army     1
army     trinity  god      2
army     luke     god      3
john     laptop1  john     1
john     fido     john     1
john     mary     army     2
john     trinity  god      3
john     luke     god      4
laptop1  fido     john     2
laptop1  mary     mary     1
laptop1  trinity  trinity  2
laptop1  luke     trinity  3
fido     mary     mary     1
fido     trinity  trinity  2
fido     luke     luke     1
mary     trinity  trinity  1
mary     luke     trinity  2
trinity  luke     trinity  1
Time elapsed: 15 msec

XDb1's representation/gui/code is generic enough so that someone else
could enter a completely different hierarchy consisting of different
things and yet the report would run properly. Your code/SQL should
have a similar level of genericness.
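
One generic way to compute such a report, sketched in Python rather than in SQL or XDb1. The edge list is copied from the 'obeys' rows posted earlier in the thread; everything else (the function names, the breadth-first walk, the alphabetical tie-break) is an assumption of this sketch and says nothing about how XDb1 does it. Dist is taken to be the sum of the hop counts from each thing up to the common ancestor, which reproduces the figures in the report.

```python
from collections import deque

# (subordinate, commander) pairs from the "obeys" rows posted in this thread.
OBEYS = [
    ("army", "god"), ("trinity", "god"),
    ("john", "army"), ("mary", "army"),
    ("mary", "trinity"), ("luke", "trinity"),
    ("laptop1", "john"), ("laptop1", "mary"),
    ("fido", "john"), ("fido", "mary"), ("fido", "luke"),
]


def ancestor_distances(edges, start):
    """Breadth-first walk upward: map start and each ancestor to its hop count."""
    commanders = {}
    for child, boss in edges:
        commanders.setdefault(child, []).append(boss)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for boss in commanders.get(node, []):
            if boss not in dist:  # first visit is the shortest route up
                dist[boss] = dist[node] + 1
                queue.append(boss)
    return dist


def common_ancestor(edges, x, y):
    """Return (ancestor, dist) minimising the combined hop count, or None."""
    dx = ancestor_distances(edges, x)
    dy = ancestor_distances(edges, y)
    shared = set(dx) & set(dy)
    if not shared:
        return None
    # Ties (e.g. john vs mary for laptop1/fido) are broken alphabetically.
    best = min(shared, key=lambda n: (dx[n] + dy[n], n))
    return best, dx[best] + dy[best]


print(common_ancestor(OBEYS, "john", "luke"))
```

Because the hierarchy is passed in as plain data, a different set of edges can be supplied without touching the functions.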
Neo - 28 Feb 2004 05:51 GMT
I just realized that I don't want you to explicitly represent the
relators "obey", "isa" or "is". They can be implied.

The following might serve as base tables to implement Ex076.

T_CmdHier
T_God
T_ArmedForce
T_Church
T_Person
T_Dog
T_Computer
T_Age
T_Weight
T_Color

Note: Many of the tables may have just one tuple in order to implement
the example.
Marshall Spight - 26 Feb 2004 03:44 GMT
> > Here's my first cut. I reserve the right to change my design later. No
> > nulls were used. I have to admit your test data made me feel a bit silly.
[quoted text clipped - 40 lines]
> And now please write a query that returns an aggregate age of all persons.
> In this schema with SQL, or with Neo's school science fair project.

Here's the results from a test run I did:

test=> select sum(to_number(object,'99')) from EX76 where object in
( select subject from EX76 where relator = 'isa' AND object = 'age');
sum
-----
 35
(1 row)

Please note that I am not advocating building schemata that look like this.
I'm just pointing out that an existing relational database can handle it.

Compare the length and clarity of the above SQL with Neo's pseudocode.
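
For anyone who wants to replay this without a database server, here is a hedged sketch using Python's built-in sqlite3 module. It loads only the rows that matter for the age question and, unlike the query above, also restricts the sum to subjects that are persons. SQLite has no to_number, so CAST is used instead; the alias `fact` and the variable names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EX76 (subject TEXT, relator TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO EX76 VALUES (?, ?, ?)",
    [
        ("john", "isa", "person"),
        ("mary", "isa", "person"),
        ("35", "isa", "age"),
        ("john", "is", "35"),
        ("130", "isa", "weight"),
        ("mary", "is", "130"),
    ],
)

# Sum the objects of "is" facts whose subject is a person and whose
# object has been declared an age.
TOTAL_AGE_SQL = """
    SELECT SUM(CAST(fact.object AS INTEGER))
    FROM EX76 AS fact
    WHERE fact.relator = 'is'
      AND fact.subject IN (SELECT subject FROM EX76
                           WHERE relator = 'isa' AND object = 'person')
      AND fact.object IN (SELECT subject FROM EX76
                          WHERE relator = 'isa' AND object = 'age')
"""
total_age = conn.execute(TOTAL_AGE_SQL).fetchone()[0]
print(total_age)
```

With these rows the result is 35, the same figure as the test run.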

Marshall
Joe Foster - 26 Feb 2004 04:10 GMT
> test=> select sum(to_number(object,'99')) from EX76 where object in
> ( select subject from EX76 where relator = 'isa' AND object = 'age');
[quoted text clipped - 7 lines]
>
> Compare the length and clarity of the above SQL with Neo's pseudocode.

Neo's silly scheme doesn't scale.  What happens if person X is 98
years old and person Y weighs 98 pounds?
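
The collision can be made concrete with a small sketch. It assumes the three-column EX76 layout posted earlier in the thread, run through Python's sqlite3 module; the persons 'x' and 'y' and the variable names are hypothetical. Once '98' is declared both an age and a weight, a bare "y is 98" can no longer be told apart from an age.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EX76 (subject TEXT, relator TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO EX76 VALUES (?, ?, ?)",
    [
        ("x", "isa", "person"),
        ("y", "isa", "person"),
        ("98", "isa", "age"),
        ("98", "isa", "weight"),
        ("x", "is", "98"),  # meant as x's age in years
        ("y", "is", "98"),  # meant as y's weight in pounds
    ],
)

# Same shape of query as the age total: persons whose "is" object is an age.
summed = conn.execute("""
    SELECT SUM(CAST(fact.object AS INTEGER))
    FROM EX76 AS fact
    WHERE fact.relator = 'is'
      AND fact.subject IN (SELECT subject FROM EX76
                           WHERE relator = 'isa' AND object = 'person')
      AND fact.object IN (SELECT subject FROM EX76
                          WHERE relator = 'isa' AND object = 'age')
""").fetchone()[0]
print(summed)
```

The query returns 196 instead of 98, because y's weight is counted as an age.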

--
Joe Foster <mailto:jlfoster%40znet.com>  "Regged" again? <http://www.xenu.net/>
WARNING: I cannot be held responsible for the above        They're   coming  to
because  my cats have  apparently  learned to type.        take me away, ha ha!
Neo - 26 Feb 2004 18:05 GMT
> Neo's silly scheme doesn't scale.
> What happens if person X is 98 years old and person Y weighs 98 pounds?

You are correct, the scheme doesn't indicate the quantity's units. Not
because it can't, but because the main intent of the example was to
demonstrate something else. Two ways to specify quantity units are
shown below:

Method 1
--------------
qty isa thing.
98 isa qty.

unit isa thing.
pound isa unit.
year isa unit.

age isa thing.
* isa age.    (* creates an "unnamed" thing)
it isa 98.
it isa year.  (it inherits its "name", ie "98 year")

weight isa thing.
* isa weight.
it isa 98.
it isa pound.

Method 2
--------------
qty isa thing.
98 isa qty.

unit isa thing.
pound isa unit.
year isa unit.

age isa thing.
98yr isa age.
98yr is 98.
98yr is year.

weight isa thing.
98lb isa weight.
98lb is 98.
98lb is pound.
Neo - 26 Feb 2004 17:34 GMT
> > And now please write a query that returns an aggregate age of all persons.
>
> test=> select sum(to_number(object,'99')) from EX76 where object in
> ( select subject from EX76 where relator = 'isa' AND object = 'age');
>
> Compare the length and clarity of the above SQL with Neo's pseudocode.

If I use unnormalized data, the result would be similar. Can you use
normalized data and solve the problem with a similar level of
genericness so that we can compare "length and clarity"?

Are you advocating unnormalized data? If not, why provide unnormalized
solutions?
Marshall Spight - 27 Feb 2004 03:56 GMT
> > > And now please write a query that returns an aggregate age of all persons.
> >
[quoted text clipped - 9 lines]
> Are you advocating unnormalized data? If not why provide unnormalized
> solutions?

It's not clear to me you know what "normalized" means. Can you be
specific about what normalization rules you are referring to? In
what way is my schema not normalized?

Marshall
Neo - 28 Feb 2004 05:48 GMT
> It's not clear to me you know what "normalized" means. Can you be
> specific about what normalization rules you are referring to? In
> what way is my schema not normalized?

Normalization: The process of replacing duplicates things with a
reference to the original thing.

For example, given "john isa person" and "john obeys army", one
observes that the "john" in the second sentence is a duplicate of
"john" in the first sentence. Using the means provided by your system,
the second sentence should be stored as "->john obeys army".

Another example, given "bob" one observes that the second "b" is a
duplicate of the first "b" and therefore should be normalized as
"bo->b". I don't want you to normalize this far, even though Ex076
is.

The exact method of normalization and to what extent is practical is
dependent on the data model and its implementation.

Sorry, I just realized that I don't want you to explicitly represent
the relators "obey", "isa" or "is". They can be implied.

The following might serve as base tables to implement Ex076.

T_CmdHier
T_God
T_ArmedForce
T_Church
T_Person
T_Dog
T_Computer
T_Age
T_Weight
T_Color

Note: Many of the tables may have just one tuple in order to implement
the example.
Eric Kaun - 03 Mar 2004 13:08 GMT
> Normalization: The process of replacing duplicates things with a
> reference to the original thing.
[quoted text clipped - 3 lines]
> "john" in the first sentence. Using the means provided by your system,
> the second sentence should be stored as "->john obeys army".

You're joking, right? In what sense have you saved work, or eliminated
duplicates? You'll have 2 sentences with john as the subject.

> Another example, given "bob" one observes that the second "b" is a
> duplicate of the first "b" and therefore should be normalized as
> "bo->b".

You're joking, right? Please tell me you're joking.

> The exact method of normalization and to what extent is practical is
> dependent on the data model and its implementation.

No, it's dependent on neither of those.
Neo - 04 Mar 2004 01:14 GMT
> You're joking, right?

No, not joking. See www.xdb1.com/Basic/Symbol.asp,
www.xdb1.com/GUI/Labels.asp and www.xdb1.com/HowTo/Find.asp

> You're joking, right? Please tell me you're joking.

No, not joking. See www.xdb1.com/Basic/Symbol.asp,
www.xdb1.com/GUI/Labels.asp and www.xdb1.com/HowTo/Find.asp

> > The exact method of normalization and to what extent is practical is
> > dependent on the data model and its implementation.
>
> No, it's dependent on neither of those.

Then how do you explain that in TDM/XDb1, things are normalized down
to atomic symbols (a, b, c ...) where as a similar level of
normalization in RDM is impractical?
Dawn M. Wolthuis - 04 Mar 2004 01:22 GMT
> > You're joking, right?
>
[quoted text clipped - 14 lines]
> to atomic symbols (a, b, c ...) where as a similar level of
> normalization in RDM is impractical?

I gotta admit that I thought you were joking too, Neo.  I don't know what
you mean by normalize and why not go down to the 1's & 0's -- what does the
symbol "normalization" gain you?  Perhaps I haven't read your web sites
closely enough, but at this point I'm definitely not tracking.  --dawn
Neo - 04 Mar 2004 22:47 GMT
> I don't know what you mean by normalize

In TDM/XDb1, normalize means to replace a duplicate thing with a
reference to the original thing. For example, given "john isa person"
and "john obeys army", one observes that the "john" in the second
sentence is a duplicate of "john" in the first sentence. The second
sentence becomes "->john obeys army". Another example, given "bob" one
observes that the second "b" is a duplicate of the first "b" and
therefore can be normalized as "bo->b". Note, in TDM/XDb1 all the
following are different things: the person named "john", the word
"john" and each symbol of the word "john". The exact method of
normalization and to what extent is practical is dependent on the data
model and its implementation. In RDM, keys or IDs are used to
implement a reference.

One could normalize symbols in RDM as shown below, but it is not
practical.
I use "->X" syntax, instead of IDs, to make it easier to follow the
referential links:

T_Person
Name, Age
->john, ->28

T_CompositeSymbol
AtmSym1, AtmSym2, AtmSym3, ...
->j, ->o, ->h, ->n
->2, ->8

T_AtomicSymbol
1
2
a
b
c
...
Mikito Harakiri - 04 Mar 2004 23:08 GMT
> > I don't know what you mean by normalize
>
[quoted text clipped - 3 lines]
> sentence is a duplicate of "john" in the first sentence. The second
> sentence becomes "->john obeys army".

Likewise,

"Neo has funnyNormalizationInterpretation"
"Neo has watchedTooMuchMatrix"
"Neo has workingToBecome#1Troll"

becomes

"Neo has funnyNormalizationInterpretation"
"->Neo ->has watchedTooMuchMatrix"
"->->Neo ->->has workingToBecome#1Troll"

??

That is really mind boggling, ya know.
Neo - 05 Mar 2004 19:32 GMT
> "Neo has funnyNormalizationInterpretation"
> "Neo has watchedTooMuchMatrix"
[quoted text clipped - 5 lines]
> "->Neo ->has watchedTooMuchMatrix"
> "->->Neo ->->has workingToBecome#1Troll"

The last line should be
 "->Neo ->has workingToBecome#1Troll"
Neo - 04 Mar 2004 22:52 GMT
> and why not go down to the 1's & 0's

The issue is not which set of composite symbols are most "normalized"
(five vs 5 vs 1001) when representing a thing, but that a composite
symbol can be normalized. For example, the symbols in "1001" can be
normalized to "->1, ->0, ->0, ->1".

>what does symbol "normalization" gain you?

Hypothetically, suppose Microsoft says that, in order to include symbols
from the neighboring galaxy, we want to change a's ASCII value from 97
to 1.56E03. In XDb1, this change would only occur at one place; all
other a's will automatically be correct since they reference the
original a.

Practically, normalizing symbols allows TDM/XDb1 to find things (by
their name) within the database quickly no matter where they are
located. In RDM, to implement a general solution to find a composite
symbol (ie john), one would have to search every table, every row,
every column and the algorithm needs to handle tables added in the
future.

In TDM/XDb1, because symbols are normalized, the general solution does
not require a scan of the entire db and the algorithm is unaffected by
future "tables".

Would someone like to compare the time to find things by their
composite symbols (ie john) using a general solution? By general, I
mean, it is able to find john in any field of any table including
tables not in existence during design-time.
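
For comparison, a relational system can drive such a search from its own catalog, so tables created after the code was written are picked up without changing it. A minimal sketch, assuming SQLite through Python's sqlite3 module; the function name `find_everywhere` and the sample tables are hypothetical. It says nothing about speed: it still probes every column of every table, which is exactly the cost being pointed at above.

```python
import sqlite3


def find_everywhere(conn, needle):
    """Return (table, column, rowid) for every cell equal to needle."""
    hits = []
    # The catalog lists whatever tables exist right now, old or new.
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        if not row[0].startswith("sqlite_")  # skip SQLite's internal tables
    ]
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            query = f'SELECT rowid FROM "{table}" WHERE "{column}" = ?'
            for (rowid,) in conn.execute(query, (needle,)):
                hits.append((table, column, rowid))
    return hits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, zip TEXT)")
conn.execute("INSERT INTO person VALUES ('john', '12345')")
print(find_everywhere(conn, "john"))

# A table added later is found by the same, unchanged function.
conn.execute("CREATE TABLE pet (name TEXT, owner TEXT)")
conn.execute("INSERT INTO pet VALUES ('fido', 'john')")
print(find_everywhere(conn, "john"))
```

Indexing every column would avoid the full scans, at the price of extra storage and slower writes; whether that is practical depends on the workload.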
Mike Sherrill - 06 Mar 2004 12:27 GMT
>Practically, normalizing symbols allows TDM/XDb1 to find things (by
>their name) within the database quickly no matter where they are
>located.

Hmmm.

Are you under the impression that I spend time searching all my tables
for "john", only to find "john" stored as my ZIP code?

Mike Sherrill
Information Management Systems

Neo - 06 Mar 2004 20:37 GMT
> >Practically, normalizing symbols allows TDM/XDb1 to find things (by
> >their name) within the database quickly no matter where they are
> >located.
>
> Are you under the impression that I spend time searching all my tables
> for "john", only to find "john" stored as my ZIP code?

I am under the impression that you are assuming one already knows
which tables exist and which table to search for john and zip code.

In XDb1, the query "X's Y is?" (ie "john smith's zip code is?")
finds the appropriate answer even if john was "tabled" in bi-ped,
human, animal or all the previous. XDb1 also finds things simply by
querying "X?". X could be the name of a person, car, book, state,
city, SS#, part#, etc.

When person1 asks person2 about thing3, person1 doesn't typically
specify what "table" to search in, yet person2 is able to retrieve it
from his brain. By normalizing down to atomic symbols, XDb1 attempts
to emulate that type of capability.
Marshall Spight - 08 Mar 2004 08:15 GMT
> I am under the impression that you are assuming one already knows
> which tables exist and which table to search for john and zip code.

Ah, there's that meme: the "I don't want to have to know a
schema" meme.

But there's always a schema, and you always have to know it.
The format of the schema might change, but there is always
a schema.

Marshall
Neo - 08 Mar 2004 18:39 GMT
> > I am under the impression that you are assuming one already knows
> > which tables exist and which table to search for john and zip code.
>
> But there's always a schema, and you always have to know it.
> The format of the schema might change, but there is always a schema.

We agree that there is always a schema and ultimately the system, not
necessarily the user, needs to know it at run-time. With typical RDM
implementations, this is more difficult to accomplish in some
situations because in order to access data within a table, one
generally has to know the name of the table ahead of time. With
TDM/XDb1, while data is also classified/typed, an alternate method of
accessing the data is simply via the symbols that compose the data.
Once the data is accessed, TDM/XDb1 allows the related
class(es)/type(s), attributes, etc to be determined.
Mikito Harakiri - 08 Mar 2004 18:48 GMT
> With TDM/XDb1...

Apparently you changed the name to "XDb1". Neo, I was kidding about the
lawsuit. Until competitors take your stuff seriously you have little to
worry about. And everybody on this group would guarantee you that would
never happen.
Marshall Spight - 08 Mar 2004 20:57 GMT
> > > I am under the impression that you are assuming one already knows
> > > which tables exist and which table to search for john and zip code.
[quoted text clipped - 4 lines]
> We agree that there is always a schema and ultimately the system, not
> necessarily the user, needs to know it at run-time.

The user has to know the schema, too. The schema is what the
data *means.* If the user doesn't know what the data means,
the user can't use the system.

The system also has to know the schema in order to enforce
integrity.

Marshall
Neo - 09 Mar 2004 17:14 GMT
> > We agree that there is always a schema and ultimately the system, not
> > necessarily the user, needs to know it at run-time.
>
> The schema is what the data *means.*

First your definition does not match others. Per dictionary, a schema
is an underlying organization pattern or scheme. Looking thru several
db books, I could not find a standardized definition (like that for
relation) of what a schema is. Date's book doesn't mention it in the
index. Another book, defines several types of schemas, but never
schema itself. A third book, distinguishes database schema (aka
meta-data) as the description of the database as opposed to the data
itself. They say schema is specified during design phase and is not
expected to change frequently. IMO, the last definition seems the most
appropriate with respect to RDM and contrary to yours given above.

> The user has to know the schema, too. If the user doesn't know
> what the data means, the user can't use the system.

Using the last definition above, the level to which the user "has to
know the schema" is dependent on what he is trying to do, the db's
design, the code which interfaces user to db, etc. For some things,
user may not need to know anything about a db's schema. At the other
extreme, user may need to know nearly every detail of a db's schema.

In TDM/XDb1, there is no meta-data about the data in the db. Data
added to the db doesn't have to conform to any design-time "schema"
but the added data itself defines the current "schema".
Marshall Spight - 10 Mar 2004 04:23 GMT
> > > We agree that there is always a schema and ultimately the system, not
> > > necessarily the user, needs to know it at run-time.
[quoted text clipped - 3 lines]
> First your definition does not match others. Per dictionary, a schema
> is an underlying organization pattern or scheme.

Yes. It is this underlying organizational pattern or schema that
determines what the data means.

> Looking thru several
> db books, I could not find a standardized definition (like that for
> relation) of what a schema is.

That's a shame, don't you think? It seems we ought to at least
be able to have a common vocabulary, even if we are disagreeing.
It's impossible to have a meaningful conversation in the absence
of common definitions.

Come to think of it, this paragraph is self-referential. A definition
is the semantics for a word; a schema is the semantics for a database.

> Date's book doesn't mention it in the
> index. Another book, defines several types of schemas, but never
> schema itself. A third book, distinguishes database schema (aka
> meta-data) as the description of the database as opposed to the data
> itself.

That's the best one so far.

> They say schema is specified during design phase and is not
> expected to change frequently. IMO, the last definition seems the most
> appropriate with respect to RDM and contrary to yours given above.

(That's less a definition and more of a partial functional description.)
Nothing contradictory that I can see.

> > The user has to know the schema, too. If the user doesn't know
> > what the data means, the user can't use the system.
[quoted text clipped - 4 lines]
> user may not need to know anything about a db's schema. At the other
> extreme, user may need to know nearly every detail of a db's schema.

These two statements:

1) > In TDM/XDb1, there is no meta-data about the data in the db.

2) > Data added to the db doesn't have to conform to any design-time
> "schema" but the added data itself defines the current "schema".

contradict each other. I believe the second one, but I don't believe
the first one.

Marshall
Neo - 10 Mar 2004 19:43 GMT
> > These two statements contradict each other:
> 1) In TDM/XDb1, there is no meta-data about the data in the db.
> 2) Data added to the db doesn't have to conform to any design-time
>    "schema" but the added data itself defines the current "schema".

In TDM/XDb1, there is no meta-data about the data as in RDM.
For instance, in RDM system/hidden table(s) indicate:
a) what tables exist in db.
b) what attributes exist in each table.
c) the type of each attribute.
d) which key of tableX is related to which key of tableY.
There is no meta-data like the above in TDM/XDb1.

In TDM/XDb1 there are no stored "schemas", but a "schema" of any level
of generalization can be derived from the current data. Unlike RDM,
TDM only assumes the following about future data: new things will be
related to existing things.
Neo - 10 Mar 2004 19:51 GMT
> > > The schema is what the data *means.*
> >
[quoted text clipped - 3 lines]
> Yes. It is this underlying organizational pattern or schema that
> determines what the data means.

In TDM/XDb1, it is nearly the reverse. No schema determines what data
means. Instead, the current data can be used to derive schemas of any
level of generalization.
Chris Hoess - 04 Mar 2004 05:03 GMT
>> You're joking, right? Please tell me you're joking.
>
> No, not joking. See www.xdb1.com/Basic/Symbol.asp,
> www.xdb1.com/GUI/Labels.asp and www.xdb1.com/HowTo/Find.asp

I can immediately see two problems with this, one big and one small. Dawn
has already pointed out the "small" one, which is that your concept of
atomicity is not well grounded. I surmise, based on the limited information
you provided, that you consider, say, a single ASCII character to be
"atomic". However, since you can represent any character with a bit
sequence, there's no reason your supposedly "atomic" symbols can't be broken
down into bits, leaving you with the atoms 0 and 1 only (or FALSE and TRUE
if you prefer).

The big problem is that your description of normalization as "The process of
replacing duplicates [sic] things with a reference to the original thing"
really isn't quite correct. ("The difference between the almost right word
and the right word is really a large matter -- 'tis the difference between
the lightning-bug and the lightning."--Mark Twain) Normalization removes
relationships inadvertently implied between pieces of data by the design of
the database. It doesn't create references, and it doesn't decompose data
types.
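
To make the term concrete, here is a hedged sketch of an update anomaly and of how decomposition removes it, using Python's sqlite3 module. The tables (`flat`, `course`, `enrollment`) and their rows are invented for illustration and come from nobody's schema in this thread; the assumed dependency is that a course determines its instructor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized: the instructor is repeated on every enrollment row.
conn.executescript("""
    CREATE TABLE flat (student TEXT, course TEXT, instructor TEXT);
    INSERT INTO flat VALUES ('ann', 'db101', 'codd');
    INSERT INTO flat VALUES ('bob', 'db101', 'codd');
""")
# Update anomaly: changing one row leaves db101 with two instructors.
conn.execute("UPDATE flat SET instructor = 'date' WHERE student = 'ann'")
flat_instructors = {
    row[0]
    for row in conn.execute("SELECT instructor FROM flat WHERE course = 'db101'")
}

# Decomposed: the fact "db101 is taught by ..." is stored exactly once.
conn.executescript("""
    CREATE TABLE course (course TEXT PRIMARY KEY, instructor TEXT NOT NULL);
    CREATE TABLE enrollment (
        student TEXT,
        course  TEXT REFERENCES course(course),
        PRIMARY KEY (student, course)
    );
    INSERT INTO course VALUES ('db101', 'codd');
    INSERT INTO enrollment VALUES ('ann', 'db101');
    INSERT INTO enrollment VALUES ('bob', 'db101');
""")
conn.execute("UPDATE course SET instructor = 'date' WHERE course = 'db101'")
normalized_instructors = {
    row[0]
    for row in conn.execute(
        "SELECT c.instructor FROM enrollment e JOIN course c ON c.course = e.course"
    )
}

print(flat_instructors, normalized_instructors)
```

The flat table ends up asserting two instructors for one course, while the decomposed design cannot. Nothing comparable goes wrong when a name such as "bob" is stored as a single string.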

Since you seem to have disagreed with Marshall before about this, perhaps
you could provide an example of the "update anomaly" that decomposing "bob"
into its individual characters is supposed to prevent?

>> > The exact method of normalization and to what extent is practical is
>> > dependent on the data model and its implementation.
[quoted text clipped - 4 lines]
> to atomic symbols (a, b, c ...) where as a similar level of
> normalization in RDM is impractical?

Because this process can only be called "normalization" through a vigorous
application of the imagination. (And speaking of impracticality, what is it
that led you to declare that a database consisting of many "two-columned"
tables is impractical?)

Chris Hoess

Neo - 04 Mar 2004 20:20 GMT
> I can immediately see two problems with this, one big and one small. Dawn
> has already pointed out the "small" one, which is that your concept of
> atomicity is not well grounded.

It may not be well grounded in your understanding or within RDM, but
this is not the case in TDM/XDb1.

> I surmise, based on the limited information you provided, that you consider,
> say, a single ASCII character to be "atomic".

Yes. Also see Susanne Langer's book "Symbolic Logic".

> However, since you can represent any character with a bit sequence...

The bit sequence (ie 10101) can only be expressed with symbols, in this
case a combination of the atomic symbols 0 and 1. What symbols and
combinations of symbols mean is a different issue.
Neo - 04 Mar 2004 20:41 GMT
> The big problem is that your description of normalization as "The process of
> replacing duplicates [sic] things with a reference to the original thing"
> really isn't quite correct.

Within the context of TDM/XDb1, it is correct. Could you prove
otherwise?

> Normalization removes relationships inadvertently implied between pieces
> of data by the design of the database. It doesn't create references,
> and it doesn't decompose data types.

Such may or may not be the case in your understanding or RDM. In
TDM/XDb1, normalization is the process of replacing duplicate things
with references to the original thing. It is similar in RDM. Suppose
you start with

T_ItemColor
MyCar Blue
YourCar Blue

To normalize

T_ItemColor
MyCar ->Blue
YourCar ->Blue

T_Color
Blue

In effect, one has replaced duplicate things with a reference to the
original thing. Since all things in a row need to be of the same type,
this required us to move Blue to a new table.  In RDM, the mechanism
of a reference is called a key or ID.

> Since you seem to have disagreed with Marshall before about this,

I don't believe Marshall disagreed with TDM/XDb1's definition of
normalization as the basic principle is similar to that in RDM. He was
simply asking repeatedly as he didn't know my understanding of
normalization.
Neo - 05 Mar 2004 18:40 GMT
> ...Since all things in a row need to be of the same type...

Sorry, I meant all things in a COLUMN need to be of the same type
Neo - 04 Mar 2004 21:08 GMT
> > Then how do you explain that in TDM/XDb1, things are normalized down
> > to atomic symbols (a, b, c ...) where as a similar level of
> > normalization in RDM is impractical?
>
> Because this process can only be called "normalization" through a vigorous
> application of the imagination.

Or a failure to understand that atomic symbols (ie ^, *, &, 0, 1, 2,
a, b, c,...) are in fact atomic. And also a failure to apply
normalization (process of replacing duplicates with ref to original)
until it is no longer possible.

> (And speaking of impracticality, what is it that led you to declare that
> a database consisting of many "two-columned" tables is impractical?)

Having joined the thread near its 200th posting, your understanding is
slightly off context. I do not claim a db consisting of 2-col tableS
is impractical [insert Mark Twain].

I claimed that, for some applications, in order to avoid NULLs, one
would have to resort to generic modelling, which in the extreme case
results in a db with just one 2-col table, and is impractical. If one
implements www.xdb1.com/Example/Ex076.asp they should see that pattern
emerging. After 223 postings, still no one has. Will you be the first?
Tony - 04 Mar 2004 11:05 GMT
> > You're joking, right?
>
[quoted text clipped - 14 lines]
> to atomic symbols (a, b, c ...) where as a similar level of
> normalization in RDM is impractical?

Perhaps because TDM/XDb1 was created by someone who didn't have a firm
grasp on normalisation, or perhaps even reality? ;)
Neo - 04 Mar 2004 19:33 GMT
> Perhaps because TDM/XDb1 was created by someone who didn't have a firm
> grasp on normalisation, or perhaps even reality? ;)

Perhaps, but could you devise an experiment related to representing
things, normalization and databases that would prove your conjecture?
Tony - 05 Mar 2004 12:15 GMT
> > Perhaps because TDM/XDb1 was created by someone who didn't have a firm
> > grasp on normalisation, or perhaps even reality? ;)
>
> Perhaps, but could you devise an experiment related to representing
> things, normalization and databases that would prove your conjecture?

No, I think we have to invoke Date's Incoherence Principle here: It is
not possible to treat coherently that which is incoherent.
Eric Kaun - 05 Mar 2004 13:37 GMT
> > > Perhaps because TDM/XDb1 was created by someone who didn't have a firm
> > > grasp on normalisation, or perhaps even reality? ;)
[quoted text clipped - 4 lines]
> No, I think we have to invoke Date's Incoherence Principle here: It is
> not possible to treat coherently that which is incoherent.

Agreed.
Neo - 25 Feb 2004 20:01 GMT
> > Will you implement an equivalent NULL-less solution to the example
> > posted at www.xdb1.com/Example/Ex076.asp so that we can establish
[quoted text clipped - 3 lines]
> I reserve the right to change my design later.
> No nulls were used.

It is true that no NULLs were used but you have ignored something even
more basic. Could you normalize the data? (XDb1's solution is
normalized down to each symbol, but this is impractical with RDM).

> I have to admit your test data made me feel a bit silly.

The silliness should become more apparent as one normalizes the data
using RDM.
Marshall Spight - 26 Feb 2004 03:52 GMT
> > > Will you implement an equivalent NULL-less solution to the example
> > > posted at www.xdb1.com/Example/Ex076.asp so that we can establish
[quoted text clipped - 6 lines]
> It is true that no NULLs were used but you have ignored something even
> more basic. Could you normalize the data?

In what way is the data not already fully normalized? Perhaps you
can provide me with the functional dependencies, so that I might
apply the rules of normalization to the schema.

Marshall
Neo - 26 Feb 2004 15:19 GMT
> > It is true that no NULLs were used but you have ignored something even
> > more basic. Could you normalize the data?
>
> In what way is the data not already fully normalized? Perhaps you
> can provide me with the functional dependencies, so that I might
> apply the rules of normalization to the schema.

Everything in TDM/XDb1 is functionally dependent on atomic symbols.
Normalize down to atomic symbols (a,b,c ...) if you can. Or just stop
when it becomes impractical to normalize any further in RDM.
Eric Kaun - 26 Feb 2004 18:16 GMT
> (XDb1's solution is
> normalized down to each symbol, but this is impractical with RDM).

Normalized down to each symbol? That makes absolutely no sense at all. How
is that different from no normalization at all?
Neo - 27 Feb 2004 03:51 GMT
> > XDb1's solution is normalized down to each symbol,
> > but this is impractical with RDM.
>
> Normalized down to each symbol? That makes absolutely no sense at all.

Shown below is roughly how to do it in RDM. I use "->X" syntax
instead of IDs to make it easier to follow the referential links.

T_Person
Name, Age
->john, ->28

T_CompositeSymbol
AtmSym1, AtmSym2, AtmSym3, ...
->j, ->o, ->h, ->n
->j, ->o, ->e
->2, ->8

T_AtomicSymbol
a
b
c
...

for more info, see www.xdb1.com/Basic/Symbol.asp
and www.xdb1.com/Example/Ex002.asp

> How is that different from no normalization at all?

It is exactly the opposite of no normalization.
It is near complete normalization.
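
The three tables sketched above can be mimicked in a few lines of
Python. This is only an illustrative sketch, not XDb1's actual
mechanism: the dictionary layout and the helper names (atomic_id,
composite_id, resolve) are assumptions made up for the example. Each
character is stored once as an atomic symbol, each string becomes a
tuple of references to atomic symbols, and a person row holds only
references to composite symbols.

```python
# Hypothetical sketch of T_AtomicSymbol, T_CompositeSymbol and T_Person.
# All names and the ID scheme are illustrative assumptions.
atomic_symbols = {}     # character -> atomic symbol ID
composite_symbols = {}  # tuple of atomic IDs -> composite symbol ID


def atomic_id(ch):
    """Return the ID of an atomic symbol, storing it on first use."""
    return atomic_symbols.setdefault(ch, len(atomic_symbols))


def composite_id(text):
    """Return the ID of a composite symbol built from atomic symbol IDs."""
    key = tuple(atomic_id(ch) for ch in text)
    return composite_symbols.setdefault(key, len(composite_symbols))


def resolve(cid):
    """Follow the references back from a composite ID to its string."""
    chars_by_id = {v: k for k, v in atomic_symbols.items()}
    key = next(k for k, v in composite_symbols.items() if v == cid)
    return "".join(chars_by_id[a] for a in key)


# T_Person: one row (Name, Age), both stored as references.
t_person = [(composite_id("john"), composite_id("28"))]
joe = composite_id("joe")  # reuses the atomic symbols j and o
```

Note that "john" and "joe" share the atomic symbols j and o, so each
character is stored only once no matter how many strings use it.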
Neo - 25 Feb 2004 06:36 GMT
> > And when it doesn't, RDM needs NULLs, as Codd has correctly recognized.
>
> Codd believed that incorrectly, as Date implies.

Date is correct in that there is something inherently wrong with a
model that allows NULLs.

Codd is correct in that NULLs are an integral part of RDM.

NULLs are an integral part of RDM because RDM is slightly flawed.

When one substitutes "Not applicable" for NULL, the flaw is only
partially masked. NULLs kill closure. How does substituting "Not
applicable" for NULL significantly enhance closure?
Eric Kaun - 26 Feb 2004 18:19 GMT
> Date is correct in that there is something inherently wrong with a
> model that allows NULLs.

Right.

> Codd is correct in that NULLs are an integral part of RDM.

No, he's not right, and Date showed that. Instead of just repeating "Chapter
20: Missing Information" in Date's book, can you actually show me a section
or sentence or whatever that indicates that NULL is an integral part?

Codd invented relational. He stumbled in the implications of his ideas
later. He was wrong. Date showed that in his book, in the same chapter you
keep mentioning. What's not clear?

> When one substitutes "Not applicable" for NULL, the flaw is only
> partially masked. NULLs kill closure. How does substituting "Not
> applicable" for NULL significantly enhance closure?

NULLs mask far more: they mask meaning. Does NULL mean don't know, don't
care, something-but-I-don't-know-what, etc?

You're right - my "NOT APPLICABLE" was a poor choice. The "no eyes" example,
however, was a stretch. If the attribute truly isn't applicable, then you
have the wrong predicate.

However, "UNKNOWN" is valid. It's a specific meaning, that NULL loses. It's
useful in queries in many ways that NULL isn't.
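
A rough illustration of the difference, using Python's sqlite3
module. The table and column names are made up for the example, and
recording "age unknown" as a row in its own table is just one possible
NULL-free design, assumed here for the sketch. With NULL, the row
silently drops out of a predicate that looks exhaustive; with an
explicit relation for the unknown case, it can be asked for directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Design 1: age is nullable.
conn.execute("CREATE TABLE person_null (name TEXT NOT NULL, age INTEGER)")
conn.executemany(
    "INSERT INTO person_null VALUES (?, ?)",
    [("john", 28), ("joe", None)],
)
# Under SQL's three-valued logic, joe satisfies neither branch.
null_rows = conn.execute(
    "SELECT name FROM person_null WHERE age = 28 OR age <> 28"
).fetchall()

# Design 2 (hypothetical): no NULLs; "age unknown" is its own relation.
conn.execute("CREATE TABLE person_age (name TEXT NOT NULL, age INTEGER NOT NULL)")
conn.execute("CREATE TABLE person_age_unknown (name TEXT NOT NULL)")
conn.execute("INSERT INTO person_age VALUES ('john', 28)")
conn.execute("INSERT INTO person_age_unknown VALUES ('joe')")
known_rows = conn.execute("SELECT name FROM person_age").fetchall()
unknown_rows = conn.execute("SELECT name FROM person_age_unknown").fetchall()
```

In the first design joe is missing from null_rows even though the
WHERE clause appears to cover every age. In the second, each table
states one specific thing, so no row's meaning is left ambiguous.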
Neo - 27 Feb 2004 04:32 GMT
> > Codd is correct in that NULLs are an integral part of RDM.
>
> No, he's not right, and Date showed that.

Unless Date was talking about a db with one two-column table (one
column of which is auto-supplied), Date is wrong.

> can you actually show me a section or sentence or whatever that
> indicates that NULL is an integral part?

Only the one in Date's book, on pg 123 of the 6th Ed.:
"Codd now regards nulls as an integral part of the relational model"

>