Database Forum / General DB Topics / DB Theory / March 2004
object algebra
Jean Morissette - 20 Feb 2004 14:35 GMT Hi, In query processing, an ODBMS parses an OQL request into a query tree corresponding to an object algebra. But, unlike relational algebra, there is no standard object algebra. So, could you help me find some good object algebras, please (example: a link to a research university)? Thanks Jean Morissette
Tom Hester - 20 Feb 2004 15:59 GMT Try a google search on 'object-oriented algebra'. I did and got 153,000 hits.
> Hi, > In query processing, a ODBMS parse a OQL request in a query tree [quoted text clipped - 3 lines] > Thanks > Jean Morissette
Christopher Browne - 20 Feb 2004 16:00 GMT > Hi, > In query processing, a ODBMS parse a OQL request in a query tree > corresponding to object algebra. But, unlike relational algebra, there > is no standard object algebra. So, could you help me to find some good > object algebra please (example: link to research university)? The only work I am aware of that presents any sort of "object calculus" is _A Theory of Objects_, by Martin Abadi and Luca Cardelli.
Benjamin Pierce's book on Category Theory is probably also somewhat relevant.
But neither of these appears to be necessarily relevant to OQL.
The fact that, despite 20-odd years of publishing on "object oriented" programming, there are only a very few books that try to treat the relationships between objects in a robust mathematical manner, should be quite disturbing.
That seems to me to be an even worse state of affairs than the typical paucity of interest in sound theory amongst those that use databases.
 Signature select 'cbbrowne' || '@' || 'acm.org'; http://cbbrowne.com/info/wp.html Twice five syllables Plus seven can't say much but That's haiku for you.
Alfredo Novoa - 20 Feb 2004 16:35 GMT >Benjamin Pierce's book on Category Theory is probably also somewhat >relevant. I haven't found the relevance.
>The fact that, despite 20-odd years of publishing on "object oriented" >programming, there are only a very few books that try to treat the >relationships between objects in a robust mathematical manner, should >be quite disturbing. The Relational Model treats the relationships between objects in a robust mathematical manner.
A DB relation is a relation among objects.
Regards Alfredo
Neo - 21 Feb 2004 00:44 GMT > The Relational Model treat the relationships between objects in a > robust mathematical manner. For even a small collection of things, the number of ways in which they could be related is huge. To deal with the complexity, humans have devised various methods. Thus far, RDM has been the most popular because of its robustness and applicability to a large range of applications. RDM, like all methods, imposes rules on the collection of things. On the plus side, the rules make some representations/operations easier and more robust (ie those where data fits neatly in tables). On the minus side, the rules make other representations/operations more difficult (ie trees). The deficiencies of RDM in managing things for some range of applications are why there are OODBs, XML, MV and XDb. XDb is a partial/experimental implementation of TDM. TDM is a more general model than RDM. This is why it can equally manage things arranged as either tables or trees, has OO characteristics, no NULLs, transitive closure and fits on a floppy to boot :)
Mikito Harakiri - 21 Feb 2004 01:05 GMT > and fits on a floppy to boot :) Why not call it FDb, then?
Marshall Spight - 21 Feb 2004 16:34 GMT > On the plus side, the rules make some > representations/operations easier and more robust (ie those where data > fits neatly in tables). On the minus side, the rules make other > representations/operations more difficult (ie trees). Can you quantify this in some way? Or even just describe the additional difficulty better?
> TDM is a more general model than RDM. Can you quantify that? Since the RDM is fully general, it seems a difficult task.
Marshall
Neo - 23 Feb 2004 05:41 GMT > > TDM is a more general model than RDM. > > Since the RDM is fully general, it seems a difficult task. RDM isn't fully general. If it is, why does it need NULLs? RDM's fundamental design ensures NULLs will occur in some applications.
IMO, one simple way to judge which data model may be more generic is to count the occurrences of NULLs. The model which uses the fewest NULLs is probably more general. The presence of NULLs is usually a red flag of some type of mismatch (chapter 20 of Date's "Intro to DB Systems"). TDM has no NULLs. In theory, RDM can eliminate all NULLs, but requires non-standard techniques that are impractical (ie generic modelling, where all the data is in one table with one column). In TDM, the technique to model data that is "rectangularish" or otherwise is the same and doesn't result in NULLs.
Another more significant way to judge which data model is more generic is to analyze the degree of closure over basic operations (intersection, union, negation). Under RDM, closure requires meeting rather strict criteria (chapter 6) in comparison to the criteria for closure in TDM. NULLs hinder closure, which in turn hinders recursion.
According to Date's 6th Ed, "missing information is not fully understood", "no fully satisfactory solution is known", "incorporation into model is premature" but "Codd now regards NULLs as an integral part of RDM".
> Can you quantify that? If we model various data examples, we should find fewer NULLs with TDM than with RDM. For example, modelling 10 persons, each with different properties. Or the following problem:
Allow user to create any hierarchy of things.
Each thing in the hierarchy can be of different type/class.
Each thing can have 1 to many parents in the hierarchy.
For all possible combinations of 2 children in the hierarchy find the closest ancestor.
A solution to the above using TDM is shown at www.xdb1.com/Example/Ex075.asp Note, that although the example shows a hierarchy of similar things and each thing has exactly 2 parents, the solution works with hierarchy of different kinds of things with different number of parents.
If someone can show a solution for the above problem using RDM, the genericness of the models will become clearer.
Eric Kaun - 23 Feb 2004 12:20 GMT > > > TDM is a more general model than RDM. > > > > Since the RDM is fully general, it seems a difficult task. > > RDM isn't fully general. If it is, why does it need NULLs? It is fully general, and it doesn't need nulls.
> RDM's fundamental design ensures NULLs will occur in some > applications. No, it doesn't. Which applications are you talking about?
> IMO, one simple way to judge which data model maybe more generic, is > to count the occurance of NULLs. The model which utilizes the least > NULLs is probably more general. Not sure whether this has any relevance, but certainly the relational community agrees that nulls are a bad idea.
> The presence of NULLs is usually a red > flag of some type of mismatch (chapter 20 of Date's "Intro to DB > Systems"). TDM has no NULLs. In theory, RDM can eliminate all NULLs, > but requires non-standard techniques that are impractical (ie generic > modelling, where all the data is in one table with one column). So in what way does TDM eliminate nulls in "standard techniques", whatever that means?
> Another more significant way to judge which data model is more > generic, is to analyze degree of closure over basic operations > (intersection, union, negation). Under RDM, closure requires meeting > rather strict criteria (chapter 6) in comparision to criteria for > closure in TDM. What strict criteria? I just finished reading the 8th edition of his book, and have no idea what you mean. What are the "criteria for closure" in TDM? I simply thought closure was a result of operations over type T yielding values of type T.
> NULLs hinder closure which in turn hinders recursion. Agreed, but relational doesn't allow nulls.
> According to Date's 6th Ed, "missing information is not fully > understood", "no fully satisfactory solution is known", "incorporation > into model is premature" but "Codd now regards NULLs as an integral > part of RDM". Codd was wrong in that statement, and many relational proponents support the non-use of nulls.
> If we model various data examples, we should find less NULLs with TDM > than with RDM. For example, modelling 10 persons, each with different > properties. Or the following problem: Each with different properties? You're simply talking about an attribute of type PERSON, and the existence of multiple subtypes of PERSON. What does TDM allow you to do with those differing properties?
> Allow user to create any hierarchy of things. > Each thing in the hierarchy can be of different type/class. > Each thing can have 1 to many parents in the hierarchy. > For all possible combinations of 2 children in the hierarchy > find the closest ancestor.
> A solution to the above using TDM is shown at > www.xdb1.com/Example/Ex075.asp It's impossible to tell from that page what's going on.
> Note, that although the example shows a hierarchy of similar things > and each thing has exactly 2 parents, the solution works with [quoted text clipped - 3 lines] > If someone can show a solution for the above problem using RDM, the > genericness of the models will become clearer. This seems obvious - I don't have time right now (at least until I see a real explanation of the XDb "example"), but a Thing relation and a ThingParent relation would allow any number of parents as well. Adequate domain support would enable the Thing relation to hold any type you like, including subtypes - and the type could just be ANYTHING. Finding all possible combinations of 2 children is ludicrously simple.
- erk
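The Thing / ThingParent idea mentioned above can be sketched roughly as follows. This is only an illustration, not Eric's actual design: the column names, the use of SQLite through Python's sqlite3 module, and the sample rows (names borrowed from the command-hierarchy example discussed elsewhere in this thread) are all assumptions.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Thing (
    name TEXT PRIMARY KEY,
    type TEXT NOT NULL
);
CREATE TABLE ThingParent (
    child  TEXT NOT NULL REFERENCES Thing(name),
    parent TEXT NOT NULL REFERENCES Thing(name),
    PRIMARY KEY (child, parent)   -- a thing may have many parents
);
""")

conn.executemany("INSERT INTO Thing VALUES (?, ?)", [
    ("god", "thing"), ("army", "force"), ("trinity", "church"),
    ("john", "person"), ("mary", "person"), ("luke", "person"),
])
conn.executemany("INSERT INTO ThingParent VALUES (?, ?)", [
    ("army", "god"), ("trinity", "god"),
    ("john", "army"), ("mary", "army"),
    ("mary", "trinity"), ("luke", "trinity"),
])

# Every unordered pair of things that have at least one parent.
pairs = conn.execute("""
    SELECT a.name, b.name
    FROM (SELECT DISTINCT child AS name FROM ThingParent) AS a
    JOIN (SELECT DISTINCT child AS name FROM ThingParent) AS b
      ON a.name < b.name
    ORDER BY a.name, b.name
""").fetchall()

# A thing with more than one parent needs no NULLs, just more rows.
parents_of_mary = [row[0] for row in conn.execute(
    "SELECT parent FROM ThingParent WHERE child = 'mary' ORDER BY parent")]
```

Neither table has a nullable column: a thing with no parent simply has no row in ThingParent, and a thing with several parents has several rows.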
Neo - 23 Feb 2004 22:36 GMT > > RDM isn't fully general. If it is, why does it need NULLs? > > It is fully general, and it doesn't need nulls. According to Date "Codd now regards NULLs as an integral part of relational model". Codd is correct that NULLs are required in RDM, unless all your data fits neatly in rectangular tables which is not the general case, or unless one resorts to impractical methods such as generic modelling (ie all data in one table with one column).
Eric Kaun - 24 Feb 2004 13:11 GMT > > > RDM isn't fully general. If it is, why does it need NULLs? > > [quoted text clipped - 5 lines] > the general case, or unless one resorts to unpractical methods such as > generic modelling (ie all data in one table with one column). Do you make some distinction between relational and RDM, and if so, what is RDM?
Tables are not relations, and neither of them is "rectangular." You can display, for example, a 4-dimensional tesseract (hypercube) as a table if you like - that doesn't mean it's rectangular.
Generic modeling is a contradiction in terms - all data in one table with one column models nothing, unless you're discussing meta-modeling of some sort.
The general case is for data to fit into relations - if you can't do that, you're being exceptionally sloppy with what you're saying (and failing to say) about your company's data. You can be sufficiently general through the attribute types (and subtypes!) that your relations are defined over.
- Eric
Christopher Browne - 24 Feb 2004 19:08 GMT Oops! "Eric Kaun" <ekaun@yahoo.com> was seen spray-painting on a wall:
> Tables are not relations, and neither of them are "rectangular." You can > display, for example, a 4-dimensional tessaract (hypercube) as a table if > you like - that doesn't mean it's rectangular. You have to be prepared to forgive people at least a little for making this mistaken assumption.
After all:
- xBase was often described as a "relational database" despite the fact that it certainly wasn't;
- I haven't seen much evidence of commercial SQL system vendors having produced systems that are convenient to use with data that isn't shaped pretty blindly like a "table."
Perhaps a relational data representation _should_ be analogous to Lisp structures or Prolog facts, and therefore be able to be of pretty much any shape. But in the absence of conspicuous implementations of such, it shouldn't be surprising for people to make the "table" mistake...
 Signature wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org'). http://www3.sympatico.ca/cbbrowne/multiplexor.html
:FATAL ERROR -- YOU ARE OUT OF VECTOR SPACE
Eric Kaun - 24 Feb 2004 19:05 GMT > Oops! "Eric Kaun" <ekaun@yahoo.com> was seen spray-painting on a wall: > > Tables are not relations, and neither of them are "rectangular." You can [quoted text clipped - 18 lines] > it shouldn't be surprising for people to make the "table" mistake... > :FATAL ERROR -- YOU ARE OUT OF VECTOR SPACE True enough - it's just so commonly used as a slam against relational, as if it's not "multi-dimensional" enough, that it disturbs me when I see it. Somehow the fact that reality is messy bleeds into assumptions that our code and/or data have to be messy too, which is just giving up (and professional malpractice besides).
- Eric
Joe \ - 24 Feb 2004 22:06 GMT > > Perhaps a relational data representation _should_ be analagous to Lisp > > structures or Prolog facts, and therefore be able to be of pretty much [quoted text clipped - 10 lines] > and/or data have to be messy too, which is just giving up (and professional > malpractice besides). If relations are necessarily 2D rectangular tables, then objects are nothing but mind-numbingly, insanely baroque 1D bit vectors. Therefore, even a relation containing objects is still a 2D table.
-- Joe Foster <mailto:jlfoster%40znet.com> DC8s in Spaace: <http://www.xenu.net/> WARNING: I cannot be held responsible for the above because my cats have apparently learned to type. They're coming to take me away, ha ha!
Neo - 25 Feb 2004 04:33 GMT > True enough - it's just so commonly used as a slam against relational, as if > it's not "multi-dimensional" enough, that it disturbs me when I see it. > Somehow the fact that reality is messy bleeds into assumptions that our code > and/or data have to be messy too, which is just giving up (and professional > malpractice besides). Why not prove those people wrong by providing a clean (NULL-less) solution to the example shown at www.xdb1.com/Example/Ex076.asp ?
Eric Kaun - 26 Feb 2004 18:09 GMT > > True enough - it's just so commonly used as a slam against relational, as if > > it's not "multi-dimensional" enough, that it disturbs me when I see it. [quoted text clipped - 4 lines] > Why not prove those people wrong by providing a clean (NULL-less) > solution to the example shown at www.xdb1.com/Example/Ex076.asp ? Prove what wrong? Exactly what problem are you trying to solve? I get the silly idea that a set of relations THING, INSTANCE, and RELATIONSHIP with some fairly generic attributes would do the trick. But again: you've said nothing about the problem. That page says nothing about the problem or the domain. It's bizarre, to say the least - have you come across a use for this in real business problems?
At best, it seems like some sort of thought-sketchpad.
Bob Badour - 26 Feb 2004 18:13 GMT > > > True enough - it's just so commonly used as a slam against relational, > as if [quoted text clipped - 16 lines] > > At best, it seems like some sort of thought-sketchpad. Eric, you are wasting your time. Neo combines stupidity and ignorance with free association. The most polite thing anyone can say about Neo is he is random.
Neo - 27 Feb 2004 00:37 GMT > > > so commonly used as a slam against relational, as if > > > it's not "multi-dimensional" enough, that it disturbs me when I see it. [quoted text clipped - 3 lines] > > Prove what wrong? Prove to people like me that RDM is "multi-dimensional" enough.
> Exactly what problem are you trying to solve? I already have a solution to a specific problem (Ex076) which I believe begins to exceed RDM's scope. I am asking you provide an equivalent (normalized, NULL-less) solution so we can determine if RDM is "multi-dimensional" enough.
Some persons, like Bob and Alfredo, believe there is only one correct model for representing things: RDM. I contend that other models are possible. Each model provides different advantages and disadvantages. I believe all models (RDM, TDM, etc) are subsets of relational algebra. I also contend that TDM is closer to relational algebra than RDM. A more general model (TDM) provides a more complex solution to a problem that is within the scope of a more specific model (RDM). The more general model (TDM) gains advantage as the complexity of problems increases. You and I disagreed on the veracity of the above. I believe Ex076 begins to exceed RDM's scope. I asked you to represent the same data (without NULLs and normalized) as shown in Ex076 and generate the same report (common ancestors). By comparing the solutions, I contend one will conclude that TDM is more general (but not necessarily the best for most applications). As of yet, no one has provided an equivalent representation of the data with RDM. Will you be the first?
Neo - 27 Feb 2004 01:07 GMT > Exactly what problem are you trying to solve? I get the > silly idea that a set of relations THING, INSTANCE, and RELATIONSHIP with > some fairly generic attributes would do the trick. Yes, your idea of resorting to generic modelling is correct. By using a few generic tables it is possible, but soon becomes impractical. For some problems (not very common), you would ultimately have to resort to just one two-columned table. RDM's methods and related tools don't work very well with just a single two-columned table. The fact that no one has presented an equivalent to the simple problem (Ex076) using RDM may be an indicator of how impractical it is.
Mikito Harakiri - 27 Feb 2004 01:30 GMT > The fact that no > on has presented an equivalent to the simple problem (Ex076) using RDM > may be an indicator of how impractical it is. How is your problem different from finding the nearest common ancestor in a BOM hierarchy? If this hint is not detailed enough for you, may I suggest that you learn how to represent a tree in a relational database instead of broadcasting nonsense about a generic table with 2 columns?
Bob Badour - 27 Feb 2004 01:41 GMT > > The fact that no > > on has presented an equivalent to the simple problem (Ex076) using RDM [quoted text clipped - 4 lines] > you to learn how to represent tree in the relational database instead of > broadcasting nonsence about generic table with 2 columns? You may suggest all you want, but Neo has proved impervious to suggestion of any kind.
Mikito Harakiri - 27 Feb 2004 01:49 GMT > You may suggest all you want, but Neo has proved impervious to suggestion of > any kind. He is quoting Date. That's some progress. Perhaps we can expect him to stop shameless product promotion?
Bob Badour - 27 Feb 2004 01:53 GMT > > You may suggest all you want, but Neo has proved impervious to suggestion > of > > any kind. > > He is quoting Date. That's some progress. Perhaps we can expect him to stop > shameless product promotion? You might as well expect the leopard to change his spots.
Neo - 27 Feb 2004 01:45 GMT > That page says nothing about the problem or the domain. > At best, it seems like some sort of thought-sketchpad. "This example represents things in a command hierarchy and generates a report indicating the closest common commander between all possible pairs. For example, john and mary obey the army. Army is their closest common commander. The first figure shows the various types of things and their instances. The second figure shows the command hierarchy starting with god."
> have you come across a use for this in real business problems? A related problem was a SQL-Server based system that forecast the utility requirements for Intel's semiconductor production facilities based on various types of equipment throughout their plant fed by networks of pipes and conduits (hierarchies with different kinds of things).
Bob Badour - 24 Feb 2004 21:34 GMT > Oops! "Eric Kaun" <ekaun@yahoo.com> was seen spray-painting on a wall: > > Tables are not relations, and neither of them are "rectangular." You can [quoted text clipped - 3 lines] > You have to be prepared to forgive people at least a little for making > this mistaken assumption. This has been explained so many times to Neo that no forgiveness is warranted.
Neo - 25 Feb 2004 04:22 GMT > > Tables are not relations, and neither of them are "rectangular." You can > > display, for example, a 4-dimensional tessaract (hypercube) as a table if > > you like - that doesn't mean it's rectangular. > > You have to be prepared to forgive people at least a little for making > this mistaken assumption. I realize that things can be presented in multiple formats regardless of how they are actually stored. For example XDb1's GUI presents things in tables, trees or sentences but the things themselves aren't stored that way. What I am saying is that the internal format of RDM is "rectangularish" and while it can also display things as tables, trees and sentences, it is clumsier at some things like trees compared to tables. TDM/XDb1's internal structure is more general and therefore more neutral to either. In fact RDM's "rectangularish" internal structure (aka relation) is the cause of NULLs. If one doubts that RDM's "rectangularish" relation makes it clumsy at trees, one might try to replicate the example shown at www.xdb1.com/Example/Ex076.asp
> - xBase was often described as a "relational database" despite the > fact that it certainly wasn't; All information systems are relational to some degree. RDM is simply one of the closest to pure relational algebra.
> - I haven't seen much evidence of commercial SQL system vendors > having produced systems that are convenient to use with data > that isn't shaped pretty blindly like a "table." While XDb1 is only experimental, it does provide a way to manage things thru tables, trees and sentences.
> Perhaps a relational data representation _should_ be analagous to Lisp > structures or Prolog facts, and therefore be able to be of pretty much > any shape. But in the absence of conspicuous implementations of such, > it shouldn't be surprising for people to make the "table" mistake... XDb1 allows things of variable shapes to be entered via table, tree and english-like sentences. For example, the following sentences create the equivalent of a relation with 2 tuples.
person isa thing.
john isa person.
mary isa person.
For more details, see www.xdb1.com/NLI/Default.asp and www.xdb1.com/Example/Ex001.asp
Eric Kaun - 26 Feb 2004 18:13 GMT > What I am saying is that the internal format of RDM is > "rectangularish" and while it can also display things as tables, trees > and sentences, it is clumsier at some things like trees compared to > tables. RDM has no internal format. That's physical implementation. Read about TransRelational (subtype of Tarin transforms) sometime to see just how different (and useful!) a clever physical scheme can be.
> TDM/XDb1's internal structure is more general and therefore > more neutral to either. More general? Not really. How? What is the structure, anyway?
> While XDb1 is only experimental, it does provide a way to manage > things thru tables, trees and sentences. "Managing things" isn't the end goal of relational. Deductive correctness is (among other things). You can "manage", whatever that means, through an arbitrary number of mechanisms of whatever shape you like.
> XDb1 allows things of variable shapes to be entered via table, tree > and english-like sentences. For example, the following sentences [quoted text clipped - 3 lines] > john isa person. > mary isa person. So how is this better than a relation with 2 tuples?
Neo - 27 Feb 2004 02:16 GMT > RDM has no internal format. Per Date, RDM's fundamental representation unit, a relation, consists of 1) a heading of a fixed set of attributes 2) a body consisting of tuples (rows), each containing values related to the heading. While the internal format doesn't prevent RDM (or any other model) from representing tables, trees, sentences, 4-D, etc, the internal format does have an impact on which structures are easier (tables) and which are more difficult (trees, Ex076). And in RDM's case, the internal format makes it impossible to guarantee no NULLs will ever occur, unlike TDM which can make that guarantee.
> > XDb1 allows things of variable shapes to be entered via table, tree > > and english-like sentences. For example, the following sentences [quoted text clipped - 4 lines] > > So how is this better than a relation with 2 tuples? Not much difference, if there are only a few (ultimately 1) tables in the db (ie T_Thing, T_SubVerbObj), however such an implementation is impractical in RDM.
Neo - 24 Feb 2004 20:30 GMT > Do you make some distinction between relational and RDM, > and if so, what is RDM? RDM (data expressed as collections of tuples constrained by the header, etc) is less general than relational [ie ((((ab)cd)e)f)(gh)]. Any time you add rules to a system, it becomes less general. RDM adds "rectangularish" rules to relational.
> Tables are not relations, and neither of them are "rectangular." A relation (table) by definition is "rectangularish". Per Date, a relation consists of 1) a heading of a fixed set of attributes 2) a body consisting of tuples (rows), each containing of values related to the heading.
You don't see anything "rectangularish" in the above ???
Because the fundamental building block in RDM is "rectangularish", dealing with missing data, trees, and complex structure is more difficult and sometimes impractical, although not impossible, since you have to use more smaller blocks to get the desired shape or eliminate NULLs.
> You can display, for example, a 4-dimensional tessaract (hypercube) > as a table if you like - that doesn't mean it's rectangular. It is not that you can't represent 4-dimensional data, trees and complex data structures using the "rectangularish" building block provided by RDM, it's just more difficult and sometimes impractical.
> The general case is for data to fit into relations Reality doesn't always provide data that fits "rectangularish" relations. And when it doesn't, RDM needs NULLs, as Codd has correctly recognized.
> You can be sufficiently general through the attribute types > (and subtypes!) that your relations are defined over. Because the basic representational block in RDM is "rectangularish", to represent 10 persons each with different properties without NULLs, one has to resort to additional tables. The need to "subtype attributes" in this case is artificial and brought on by NULLs, which RDM creates, not reality.
Eric Kaun - 24 Feb 2004 21:49 GMT [...]
> A relation (table) by definition is "rectangularish". Per Date, a > relation consists of 1) a heading of a fixed set of attributes 2) a > body consisting of tuples (rows), each containing of values related to > the heading. > > You don't see anything "rectangularish" in the above ??? No - and seeing has nothing to do with it. The visual representation, for display on screen and paper, is "rectangularish." But it's certainly no more "rectangularish" than the XDb example - that can easily be represented as relations, and hence as "rectangles."
I guess I don't object to the characterization - just to the implication that "rectangles" are inferior to... something else.
> Because the fundamental building block in RDM is "rectangularish", > dealing with missing data, trees, and complex structure is more > difficult and sometimes impractical, although not impossible, since > you have to use more smaller blocks to get the desired shape or > eliminate NULLs. I don't agree with its impracticality. When missing values are involved you find yourself saying things you never intended, and not saying things you need to say. The meaning of your relations is undermined.
> It is not that you cant represent 4-th dimensional, trees and complex > data structures using the "rectangularish" building block provided by > RDM, it's just more difficult and sometimes impractical. I disagree that it's more difficult. Trees are easy. What other "complex data structures" does XDb handle better?
> Reality doesn't always provide data that fits "rectangularish" > relations. > And when it doesn't, RDM needs NULLs, as Codd has correctly > recognized. Codd believed that incorrectly, as Date implies. You can find many writings on the web concerning the danger of missing values. Look on dbdebunk.com for starters.
> Because the basic representational block in RDM is "rectangularish", > to represent 10 persons each with different properties without NULLs, > one has to resort to additional tables. The need to "subtype > attributes" in this case is artifical and brought on by NULLs, which > RDM creates, not reality. Reality is what it is; we're trying to model and reason about it. That says precisely nothing about the "need" to somehow correlate the "shape" of our data with the "shape" of reality. My fingers are cringing just typing this, it's so silly. We shape things according to our need to reason about them. Whatever the "shapes" of data in XDb, I have yet to see a coherent explanation, or to understand how its "shape" makes it easier to reason about.
Nulls are not created, for example, if you have domains with special values (e.g. UNKNOWN). Its semantics are then clear, and this avoids many issues with nulls. With proper domain support, this is trivial.
- erk
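One way to read the "domain with a special value" suggestion is sketched below. It is an assumption-laden illustration rather than anything posted in the thread: the names Missing, Person, and known_ages are hypothetical, and Python stands in for a DBMS with proper domain support.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Union


class Missing(Enum):
    """Explicit marker that belongs to the domain (hypothetical name)."""
    UNKNOWN = "UNKNOWN"


# The Age domain: either a real number of years or the UNKNOWN marker.
Age = Union[int, Missing]


@dataclass(frozen=True)
class Person:
    name: str
    age: Age


PEOPLE = [
    Person("john", 35),
    Person("mary", Missing.UNKNOWN),
    Person("luke", Missing.UNKNOWN),
]


def known_ages(people: List[Person]) -> List[int]:
    """Return only the ages that are actual values of the domain."""
    return [p.age for p in people if p.age is not Missing.UNKNOWN]
```

Unlike an SQL NULL, the marker is an ordinary value: comparing it with itself yields true under two-valued logic, and every query has to say explicitly what it does with it.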
Neo - 25 Feb 2004 06:23 GMT > > It is not that you cant represent 4-th dimensional, trees and complex > > data structures using the "rectangularish" building block provided by > > RDM, it's just more difficult and sometimes impractical. > > I disagree that it's more difficult. Trees are easy. Will you implement an equivalent NULL-less solution to the example posted at www.xdb1.com/Example/Ex076.asp so that we can establish this?
Marshall Spight - 25 Feb 2004 08:22 GMT > > > It is not that you cant represent 4-th dimensional, trees and complex > > > data structures using the "rectangularish" building block provided by [quoted text clipped - 5 lines] > posted at www.xdb1.com/Example/Ex076.asp so that we can establish > this? Here's my first cut. I reserve the right to change my design later. No nulls were used. I have to admit your test data made me feel a bit silly.
create table EX76 (subject varchar(80), relator varchar(80), object varchar(80));
insert into EX76 values ('obeys', 'isa', 'relator');
insert into EX76 values ('god', 'isa', 'thing');
insert into EX76 values ('god', 'equals', 'god');
insert into EX76 values ('it', 'is', 'obeys');
insert into EX76 values ('force', 'isa', 'thing');
insert into EX76 values ('army', 'isa', 'force');
insert into EX76 values ('church', 'isa', 'thing');
insert into EX76 values ('trinity', 'isa', 'church');
insert into EX76 values ('person', 'isa', 'thing');
insert into EX76 values ('john', 'isa', 'person');
insert into EX76 values ('mary', 'isa', 'person');
insert into EX76 values ('luke', 'isa', 'person');
insert into EX76 values ('age', 'isa', 'thing');
insert into EX76 values ('35', 'isa', 'age');
insert into EX76 values ('john', 'is', '35');
insert into EX76 values ('weight', 'isa', 'thing');
insert into EX76 values ('130', 'isa', 'weight');
insert into EX76 values ('mary', 'is', '130');
insert into EX76 values ('color', 'isa', 'thing');
insert into EX76 values ('red', 'isa', 'color');
insert into EX76 values ('luke', 'is', 'red');
insert into EX76 values ('dog', 'isa', 'thing');
insert into EX76 values ('fido', 'isa', 'dog');
insert into EX76 values ('computer', 'isa', 'thing');
insert into EX76 values ('laptop1', 'isa', 'computer');
insert into EX76 values ('army', 'obeys', 'god');
insert into EX76 values ('trinity', 'obeys', 'god');
insert into EX76 values ('john', 'obeys', 'army');
insert into EX76 values ('mary', 'obeys', 'army');
insert into EX76 values ('mary', 'obeys', 'trinity');
insert into EX76 values ('luke', 'obeys', 'trinity');
insert into EX76 values ('laptop1', 'obeys', 'john');
insert into EX76 values ('laptop1', 'obeys', 'mary');
insert into EX76 values ('fido', 'obeys', 'john');
insert into EX76 values ('fido', 'obeys', 'mary');
insert into EX76 values ('fido', 'obeys', 'luke');
Marshall
Mikito Harakiri - 25 Feb 2004 17:18 GMT > > > > It is not that you cant represent 4-th dimensional, trees and complex > > > > data structures using the "rectangularish" building block provided by [quoted text clipped - 46 lines] > insert into EX76 values ('fido', 'obeys', 'mary'); > insert into EX76 values ('fido', 'obeys', 'luke'); And now please write a query that returns an aggregate age of all persons. In this schema with SQL, or with Neo's school science fair project.
Neo - 26 Feb 2004 01:09 GMT > And now please write a query that returns an aggregate age of all persons. > In this schema with SQL, or with Neo's school science fair project.
// Pseudocode to calc age of all unique persons under selected node
// Step thru unique descendants of God
int ageTotal = 0;
int* pDesc_a[kMAX_TREE_DEPTH] = {pGod};
while (int i = T_Relatives(pDesc_a, kCREATURE, kUNIQUE)){
    int* pThing = pDesc_a[i];
    if (IsAncestorOf(pPersonCls, pThing)){
        int* pAge = T_Property_get(pThing, pAgeCls);
        if (pAge){
            sAge = T_Symbol_get(pAge);
            ageTotal = ageTotal + ConvertSymbolToInteger(sAge);
        }
    }
}
XDb1 has transitive closure. For actual code to a related problem, see www.xdb1.com/Example/Ex075b.asp
Mikito Harakiri - 26 Feb 2004 01:38 GMT <FORTRAN code snipped>
> XDb1 has transitive closure. XDb1 has "amateurish" transitive closure
Neo - 26 Feb 2004 16:02 GMT > > XDb1 has transitive closure. > > XDb1 has "amateurish" transitive closure Why not show how amateurish it is by providing an alternate solution?
Mikito Harakiri - 26 Feb 2004 17:58 GMT > > > XDb1 has transitive closure. > > > > XDb1 has "amateurish" transitive closure > > Why not show how amateurish it is by providing an alternate solution? FYI "closest common commander" is called "nearest common ancestor" in the regular literature.
I'm not sure what exactly your challenge is, but finding nearest common ancestor in a tree is trivial: 1. Select path to the root from node A. 2. Select path to the root from B. 3. Intersect 1 and 2. 4. Find the node most distant from the root in the result.
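The four steps above can be sketched in Python. This is only an illustration under the stated tree assumption (each node has exactly one parent); the `PARENT` map is a hypothetical single-parent subset of the Ex076 "obeys" data, and the function names are invented for this sketch.

```python
# Hypothetical single-parent subset of the "obeys" hierarchy (child -> parent).
PARENT = {"army": "god", "trinity": "god", "john": "army", "luke": "trinity"}


def path_to_root(node, parent=PARENT):
    """Steps 1 and 2: list the node and its ancestors, ending at the root."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def nearest_common_ancestor(a, b, parent=PARENT):
    """Steps 3 and 4: intersect both paths, keep the node farthest from the root."""
    on_path_b = set(path_to_root(b, parent))
    # path_to_root(a) runs from a towards the root, so the first shared node
    # encountered is the common ancestor most distant from the root.
    for node in path_to_root(a, parent):
        if node in on_path_b:
            return node
    return None  # only reachable if a and b sit in different trees


print(nearest_common_ancestor("john", "luke"))
print(nearest_common_ancestor("john", "army"))
```

A node is treated as its own ancestor here. The full example data is not a tree (mary obeys both army and trinity), so this sketch would need a set of parents per node to cover it.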
Neo - 27 Feb 2004 02:39 GMT > FYI "closest common commander" is called "nearest common ancestor" in the > regular literature. Ancestor is a generic term. For some types of relationships, a more specific term can be used. Since the relator between things in the example hierarchy is "obeys" (ie fido obeys luke, luke obeys trinity, trinity obeys god) the ancestor could be described more specifically as commander. If one runs the report, it actually says "common ancestor", since XDb1 has not been programmed to accept or use a more specific term for ancestor in a specific hierarchy.
Neo - 27 Feb 2004 03:14 GMT > > > XDb1 has "amateurish" transitive closure > > Why not show how amateurish it is by providing an alternate solution? > but finding nearest common ancestor in a tree is trivial: I can't tell by the provided 4 steps, if the alternate solution is more or less "amateurish".
> I'm not sure what exactly your challenge is: 1) Model the data shown at www.xdb1.com/Example/Ex076.asp The data is described quite accurately by sentences like "john isa person". The two figures show views of the final data as trees. There isn't much table-like data, so nothing was displayed in the grid. 1a) You don't have to model the relator "obeys" 1b) All tables must be NULL-less (XDb1's data is NULL-less). 1c) Data must be normalized (Don't normalize down to atomic symbols as XDb1 does, but normalize things like john, mary, fido, etc).
2) Create a "Nearest Common Ancestor Report" for things in the command hierarchy. I have copied the report below from the webpage. The order of things can be different.
Common Ancestor Report for 'god'
ThingX    ThingY    CmnAnc    Dist
army      john      army      1
army      laptop1   army      2
army      fido      army      2
army      mary      army      1
army      trinity   god       2
army      luke      god       3
john      laptop1   john      1
john      fido      john      1
john      mary      army      2
john      trinity   god       3
john      luke      god       4
laptop1   fido      john      2
laptop1   mary      mary      1
laptop1   trinity   trinity   2
laptop1   luke      trinity   3
fido      mary      mary      1
fido      trinity   trinity   2
fido      luke      luke      1
mary      trinity   trinity   1
mary      luke      trinity   2
trinity   luke      trinity   1
Time elapsed: 15 msec
XDb1's representation/gui/code is generic enough so that someone else could enter a completely different hierarchy consisting of different things and yet the report would run properly. Your code/SQL should have a similar level of genericness.
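A hierarchy-independent version of such a report can be sketched in Python. This is not XDb1's code. It assumes that Dist is the number of "obeys" hops from ThingX up to the common ancestor plus the hops from ThingY, which is inferred from the report rows rather than stated in the thread. The function names and the alphabetical tie-break are likewise assumptions.

```python
from collections import deque
from itertools import combinations

# "X obeys Y" facts from the example, as (subordinate, commander) pairs.
OBEYS = [
    ("army", "god"), ("trinity", "god"),
    ("john", "army"), ("mary", "army"),
    ("mary", "trinity"), ("luke", "trinity"),
    ("laptop1", "john"), ("laptop1", "mary"),
    ("fido", "john"), ("fido", "mary"), ("fido", "luke"),
]


def build_parents(edges):
    """Map each thing to the list of things it directly obeys."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, []).append(parent)
    return parents


def hops_up(node, parents):
    """Breadth-first search upwards: {ancestor_or_self: fewest hops}."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in dist:
                dist[parent] = dist[current] + 1
                queue.append(parent)
    return dist


def nearest_common_ancestor(x, y, parents):
    """Return (ancestor, combined hops), or None if nothing is shared."""
    dx, dy = hops_up(x, parents), hops_up(y, parents)
    shared = set(dx) & set(dy)
    if not shared:
        return None
    # Assumed tie-break: alphabetical order among equally near ancestors.
    best = min(shared, key=lambda a: (dx[a] + dy[a], a))
    return best, dx[best] + dy[best]


PARENTS = build_parents(OBEYS)
THINGS = ["army", "john", "laptop1", "fido", "mary", "trinity", "luke"]
for x, y in combinations(THINGS, 2):
    ancestor, dist = nearest_common_ancestor(x, y, PARENTS)
    print(f"{x:<9}{y:<9}{ancestor:<9}{dist}")
```

Because the hierarchy is passed in as plain pairs, a different set of things and relations can be supplied without changing the functions.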
Neo - 28 Feb 2004 05:51 GMT I just realized that I don't want you to explicitly represent the relators "obey", "isa" or "is". They can be implied.
The following might serve as base tables to implement Ex076. T_CmdHier T_God T_ArmedForce T_Church T_Person T_Dog T_Computer T_Age T_Weight T_Color
Note: Many of the tables may have just one tuple in order to implement the example.
Marshall Spight - 26 Feb 2004 03:44 GMT > > Here's my first cut. I reserve the right to change my design later. No > > nulls were used. I have to admit your test data made me feel a bit silly. [quoted text clipped - 40 lines] > And now please write a query that returns an aggregate age of all persons. > In this schema with SQL, or with Neo's school science fair project. Here's the results from a test run I did:
test=> select sum(to_number(object,'99')) from EX76 where object in
( select subject from EX76 where relator = 'isa' AND object = 'age');
 sum
-----
  35
(1 row)
Please note that I am not advocating building schemata that look like this. I'm just pointing out that an existing relational database can handle it.
Compare the length and clarity of the above SQL with Neo's pseudocode.
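For readers who want to try this, here is a small sketch using Python's built-in `sqlite3` module. It is a variant of the query, not a copy: SQLite has no `to_number`, so `cast` is used instead, and an extra condition (an assumption about what "age of all persons" should mean) restricts the sum to subjects that are declared persons. Only a subset of the EX76 rows is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table EX76 (subject text, relator text, object text)")
conn.executemany(
    "insert into EX76 values (?, ?, ?)",
    [
        ("john", "isa", "person"), ("mary", "isa", "person"),
        ("luke", "isa", "person"),
        ("35", "isa", "age"), ("john", "is", "35"),
        ("130", "isa", "weight"), ("mary", "is", "130"),
        ("red", "isa", "color"), ("luke", "is", "red"),
    ],
)

# Sum the object of every "<person> is <age>" row.
TOTAL_AGE_SQL = """
select sum(cast(p.object as integer))
from EX76 p
where p.relator = 'is'
  and p.subject in (select subject from EX76
                    where relator = 'isa' and object = 'person')
  and p.object in (select subject from EX76
                   where relator = 'isa' and object = 'age')
"""
total_age = conn.execute(TOTAL_AGE_SQL).fetchone()[0]
print(total_age)
```

Mary's 130 is excluded because it is declared a weight, not an age, so only John's 35 contributes.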
Marshall
Joe \ - 26 Feb 2004 04:10 GMT > test=> select sum(to_number(object,'99')) from EX76 where object in > ( select subject from EX76 where relator = 'isa' AND object = 'age'); [quoted text clipped - 7 lines] > > Compare the length and clarity of the above SQL with Neo's pseudocode. Neo's silly scheme doesn't scale. What happens if person X is 98 years old and person Y weighs 98 pounds?
-- Joe Foster <mailto:jlfoster%40znet.com> "Regged" again? <http://www.xenu.net/> WARNING: I cannot be held responsible for the above because my cats have apparently learned to type. They're coming to take me away, ha ha!
Neo - 26 Feb 2004 18:05 GMT > Neo's silly scheme doesn't scale. > What happens if person X is 98 years old and person Y weighs 98 pounds? You are correct, the scheme doesn't indicate the quantity's units. Not because it can't, but because the main intent of the example was to demonstrate something else. Two ways to specify quantity units are shown below:
Method 1
--------------
qty isa thing.
98 isa qty.

unit isa thing.
pound isa unit.
year isa unit.

age isa thing.
* isa age. (* creates an "unnamed" thing)
it isa 98.
it isa year. (it inherits its "name", ie "98 year")

weight isa thing.
* isa weight.
* isa 98.
it isa pound.
Method 2
--------------
qty isa thing.
98 isa qty.

unit isa thing.
pound isa unit.
year isa unit.

age isa thing.
98yr isa age.
98yr is 98.
98yr is year.

weight isa thing.
98lb isa weight.
98lb is 98.
98lb is pound.
Neo - 26 Feb 2004 17:34 GMT > > And now please write a query that returns an aggregate age of all persons. > > test=> select sum(to_number(object,'99')) from EX76 where object in > ( select subject from EX76 where relator = 'isa' AND object = 'age'); > > Compare the length and clarity of the above SQL with Neo's pseudocode. If I use unnormalized data, the result would be similar. Can you use normalized data and solve the problem with similar level of genericness so that we can compare "length and clarity"?
Are you advocating unnormalized data? If not why provide unnormalized solutions?
Marshall Spight - 27 Feb 2004 03:56 GMT > > > And now please write a query that returns an aggregate age of all persons. > > [quoted text clipped - 9 lines] > Are you advocating unnormalized data? If not why provide unnormalized > solutions? It's not clear to me you know what "normalized" means. Can you be specific about what normalization rules you are referring to? In what way is my schema not normalized?
Marshall
Neo - 28 Feb 2004 05:48 GMT > It's not clear to me you know what "normalized" means. Can you be > specific about what normalization rules you are referring to? In > what way is my schema not normalized? Normalization: The process of replacing duplicates things with a reference to the original thing.
For example, given "john isa person" and "john obeys army", one observes that the "john" in the second sentence is a duplicate of "john" in the first sentence. Using the means provided by your system, the second sentence should be stored as "->john obeys army".
Another example, given "bob" one observes that the second "b" is a duplicate of the first "b" and therefore should be normalized as "bo->b". I don't want you to normalize this far, even though Ex076 is.
The exact method of normalization and to what extent is practical is dependent on the data model and its implementation.
Sorry, I just realized that I don't want you to explicitly represent the relators "obey", "isa" or "is". They can be implied.
The following might serve as base tables to implement Ex076. T_CmdHier T_God T_ArmedForce T_Church T_Person T_Dog T_Computer T_Age T_Weight T_Color
Note: Many of the tables may have just one tuple in order to implement the example.
Eric Kaun - 03 Mar 2004 13:08 GMT > Normalization: The process of replacing duplicates things with a > reference to the original thing. [quoted text clipped - 3 lines] > "john" in the first sentence. Using the means provided by your system, > the second sentence should be stored as "->john obeys army". You're joking, right? In what sense have you saved work, or eliminated duplicates? You'll have 2 sentences with john as the subject.
> Another example, given "bob" one observes that the second "b" is a > duplicate of the first "b" and therefore should be normalized as > "bo->b". You're joking, right? Please tell me you're joking.
> The exact method of normalization and to what extent is practical is > dependent on the data model and its implementation. No, it's dependent on neither of those.
Neo - 04 Mar 2004 01:14 GMT > You're joking, right? No, not joking. See www.xdb1.com/Basic/Symbol.asp, www.xdb1.com/GUI/Labels.asp and www.xdb1.com/HowTo/Find.asp
> You're joking, right? Please tell me you're joking. No, not joking. See www.xdb1.com/Basic/Symbol.asp, www.xdb1.com/GUI/Labels.asp and www.xdb1.com/HowTo/Find.asp
> > The exact method of normalization and to what extent is practical is > > dependent on the data model and its implementation. > > No, it's dependent on neither of those. Then how do you explain that in TDM/XDb1, things are normalized down to atomic symbols (a, b, c ...) where as a similar level of normalization in RDM is impractical?
Dawn M. Wolthuis - 04 Mar 2004 01:22 GMT > > You're joking, right? > [quoted text clipped - 14 lines] > to atomic symbols (a, b, c ...) where as a similar level of > normalization in RDM is impractical? I gotta admit that I thought you were joking too, Neo. I don't know what you mean by normalize and why not go down to the 1's & 0's -- what does the symbol "normalization" gain you? Perhaps I haven't read your web sites close enough, but at this point I'm definitely not tracking. --dawn
Neo - 04 Mar 2004 22:47 GMT > I don't know what you mean by normalize In TDM/XDb1, normalize means to replace a duplicate thing with a reference to the original thing. For example, given "john isa person" and "john obeys army", one observes that the "john" in the second sentence is a duplicate of "john" in the first sentence. The second sentence becomes "->john obeys army". Another example, given "bob" one observes that the second "b" is a duplicate of the first "b" and therefore can be normalized as "bo->b". Note, in TDM/XDb1 all the following are different things: the person named "john", the word "john" and each symbol of the word "john". The exact method of normalization and to what extent is practical is dependent on the data model and its implementation. In RDM, keys or IDs are used to implement a reference.
One could normalize symbols in RDM as shown below, but it is not practical. I use "->X" syntax, instead of IDs, to make it easier to follow the referential links:
T_Person
Name, Age
->john, ->28

T_CompositeSymbol
AtmSym1, AtmSym2, AtmSym3, ...
->j, ->o, ->h, ->n
->2, ->8

T_AtomicSymbol
1
2
a
b
c
...
Mikito Harakiri - 04 Mar 2004 23:08 GMT > > I don't know what you mean by normalize > [quoted text clipped - 3 lines] > sentence is a duplicate of "john" in the first sentence. The second > sentence becomes "->john obeys army". Likewise,
"Neo has funnyNormalizationInterpretation" "Neo has watchedTooMuchMatrix" "Neo has workingToBecome#1Troll"
becomes
"Neo has funnyNormalizationInterpretation" "->Neo ->has watchedTooMuchMatrix" "->->Neo ->->has workingToBecome#1Troll"
??
That is really mind boggling, ya know.
Neo - 05 Mar 2004 19:32 GMT > "Neo has funnyNormalizationInterpretation" > "Neo has watchedTooMuchMatrix" [quoted text clipped - 5 lines] > "->Neo ->has watchedTooMuchMatrix" > "->->Neo ->->has workingToBecome#1Troll" The last line should be "->Neo ->has workingToBecome#1Troll"
Neo - 04 Mar 2004 22:52 GMT > and why not go down to the 1's & 0's The issue is not which set of composite symbols are most "normalized" (five vs 5 vs 1001) when representing a thing, but that a composite symbol can be normalized. For example, the symbols in "1001" can be normalized to "->1, ->0, ->0, ->1".
>what does symbol "normalization" gain you? Hypothetically, suppose Microsoft says that, in order to include symbols from the neighboring galaxy, they want to change a's ASCII value from 97 to 1.56E03. In XDb1, this change would only occur at one place; all other a's will automatically be correct since they reference the original a.
Practically, normalizing symbols allows TDM/XDb1 to find things (by their name) within the database quickly no matter where they are located. In RDM, to implement a general solution to find a composite symbol (ie john), one would have to search every table, every row, every column and the algorithm needs to handle tables added in the future.
In TDM/XDb1, because symbols are normalized, the general solution does not require a scan of the entire db and the algorithm is unaffected by future "tables".
Would someone like to compare the time to find things by their composite symbols (ie john) using a general solution? By general, I mean, it is able to find john in any field of any table including tables not in existence during design-time.
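For comparison, the relational side of this challenge can be sketched with the system catalog. The example below uses SQLite through Python's `sqlite3` module; `sqlite_master` and `pragma table_info` are real SQLite features, while `find_everywhere` and the two tables are hypothetical names made up for illustration. It is a full scan, so it shows generality, not speed.

```python
import sqlite3


def find_everywhere(conn, needle):
    """Return (table, column) pairs holding a value equal to `needle`.

    Tables are discovered from the catalog at call time, so tables created
    after this function was written are searched too.
    """
    hits = []
    tables = [row[0] for row in conn.execute(
        "select name from sqlite_master where type = 'table'")]
    for table in tables:
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(
            f'pragma table_info("{table}")')]
        for column in columns:
            found = conn.execute(
                f'select 1 from "{table}" where "{column}" = ? limit 1',
                (needle,)).fetchone()
            if found:
                hits.append((table, column))
    return hits


conn = sqlite3.connect(":memory:")
conn.execute("create table T_Person (name text, age integer)")
conn.execute("insert into T_Person values ('john', 28)")
# A table added "later", unknown when find_everywhere was written.
conn.execute("create table T_Dog (name text, owner text)")
conn.execute("insert into T_Dog values ('fido', 'john')")
print(find_everywhere(conn, "john"))
```

Per-column indexes would avoid the scan, at the cost of maintaining one per searchable column.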
Mike Sherrill - 06 Mar 2004 12:27 GMT >Practically, normalizing symbols allows TDM/XDb1 to find things (by >their name) within the database quickly no matter where they are >located. Hmmm.
Are you under the impression that I spend time searching all my tables for "john", only to find "john" stored as my ZIP code?
 Signature Mike Sherrill Information Management Systems
Neo - 06 Mar 2004 20:37 GMT > >Practically, normalizing symbols allows TDM/XDb1 to find things (by > >their name) within the database quickly no matter where they are > >located. > > Are you under the impression that I spend time searching all my tables > for "john", only to find "john" stored as my ZIP code? I am under the impression that you are assuming one already knows which tables exist and which table to search for john and zip code.
In XDb1, the querying "X's Y is?" (ie "john smith's zip code is?") finds the appropriate answer even if john was "tabled" in bi-ped, human, animal or all the previous. XDb1 also finds things simply by querying "X?". X could be the name of a person, car, book, state, city, SS#, part#, etc.
When person1 asks person2 about thing3, person1 doesn't typically specify what "table" to search in, yet person2 is able to retrieve it from his brain. By normalizing down to atomic symbols, XDb1 attempts to emulate that type of capability.
Marshall Spight - 08 Mar 2004 08:15 GMT > I am under the impression that you are assuming one already knows > which tables exist and which table to search for john and zip code. Ah, there's that meme: the "I don't want to have to know a schema" meme.
But there's always a schema, and you always have to know it. The format of the schema might change, but there is always a schema.
Marshall
Neo - 08 Mar 2004 18:39 GMT > > I am under the impression that you are assuming one already knows > > which tables exist and which table to search for john and zip code. > > But there's always a schema, and you always have to know it. > The format of the schema might change, but there is always a schema. We agree that there is always a schema and ultimately the system, not necessarily the user, needs to know it at run-time. With typical RDM implementations, this is more difficult to accomplish in some situations because in order to access data within a table, one generally has to know the name of the table ahead of time. With TDM/XDb1, while data is also classified/typed, an alternate method of accessing the data is simply via the symbols that compose the data. Once the data is accessed, TDM/XDb1 allows the related class(es)/type(es), attributes, etc to be determined.
Mikito Harakiri - 08 Mar 2004 18:48 GMT > With TDM/XDb1... Apparently you changed the name to "XDb1". Neo, I was kidding about the lawsuit. Until competitors take your stuff seriously you have little to worry about. And everybody on this group would guarantee you that would never happen.
Marshall Spight - 08 Mar 2004 20:57 GMT > > > I am under the impression that you are assuming one already knows > > > which tables exist and which table to search for john and zip code. [quoted text clipped - 4 lines] > We agree that there is always a schema and ultimately the system, not > necessarily the user, needs to know it at run-time. The user has to know the schema, too. The schema is what the data *means.* If the user doesn't know what the data means, the user can't use the system.
The system also has to know the schema in order to enforce integrity.
Marshall
Neo - 09 Mar 2004 17:14 GMT > > We agree that there is always a schema and ultimately the system, not > > necessarily the user, needs to know it at run-time. > > The schema is what the data *means.* First your definition does not match others. Per dictionary, a schema is an underlying organization pattern or scheme. Looking thru several db books, I could not find a standardized definition (like that for relation) of what a schema is. Date's book doesn't mention it in the index. Another book defines several types of schemas, but never schema itself. A third book distinguishes database schema (aka meta-data) as the description of the database as opposed to the data itself. They say schema is specified during design phase and is not expected to change frequently. IMO, the last definition seems the most appropriate with respect to RDM and contrary to yours given above.
> The user has to know the schema, too. If the user doesn't know > what the data means, the user can't use the system. Using the last definition above, the level to which the user "has to know the schema" is dependent on what he is trying to do, the db's design, the code which interfaces user to db, etc. For some things, user may not need to know anything about a db's schema. At the other extreme, user may need to know nearly every detail of a db's schema.
In TDM/XDb1, there is no meta-data about the data in the db. Data added to the db doesn't have to conform to any design-time "schema" but the added data itself defines the current "schema".
Marshall Spight - 10 Mar 2004 04:23 GMT > > > We agree that there is always a schema and ultimately the system, not > > > necessarily the user, needs to know it at run-time. [quoted text clipped - 3 lines] > First your definition does not match others. Per dictionary, a schema > is an underlying organization pattern or scheme. Yes. It is this underlying organizational pattern or schema that determines what the data means.
> Looking thru several > db books, I could not find a standardized definition (like that for > relation) of what a schema is. That's a shame, don't you think? It seems we ought to at least be able to have a common vocabulary, even if we are disagreeing. It's impossible to have a meaningful conversation in the absence of common definitions.
Come to think of it, this paragraph is self-referential. A definition is the semantics for a word; a schema is the semantics for a database.
> Date's book don't mention it in the > index. Another book, defines several types of schemas, but never > schema itself. A third book, distinguishes database schema (aka > meta-data) as the description of the database as opposed to the data > itself. That's the best one so far.
> They say schema is specified during design phase and is not > expected to change frequently. IMO, the last definition seems the most > appropriate with respect to RDM and contrary to yours given above. (That's less a definition and more of a partial functional description.) Nothing contradictory that I can see.
> > The user has to know the schema, too. If the user doesn't know > > what the data means, the user can't use the system. [quoted text clipped - 4 lines] > user may not need to know anything about a db's schema. At the other > extreme, user may need to know nearly every detail of a db's schema. These two statements:
1) > In TDM/XDb1, there is no meta-data about the data in the db.
2) > Data added to the db doesn't have to conform to any design-time > "schema" but the added data itself defines the current "schema".
contradict each other. I believe the second one, but I don't believe the first one.
Marshall
Neo - 10 Mar 2004 19:43 GMT > > These two statements contradict each other: > 1) In TDM/XDb1, there is no meta-data about the data in the db. > 2) Data added to the db doesn't have to conform to any design-time > "schema" but the added data itself defines the current "schema". In TDM/XDb1, there is no meta-data about the data as in RDM. For instance, in RDM system/hidden table(s) indicate: a) what tables exist in db. b) what attributes exist in each table. c) the type of each attribute. d) which key of tableX is related to which key of tableY. There is no meta-data like the above in TDM/XDb1.
In TDM/XDb1 there are no stored "schemas", but a "schema" of any level of generalization can be derived from the current data. Unlike RDM, TDM only assumes the following about future data: new things will be related to existing things.
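One way to picture "deriving a schema from the current data" is sketched below in Python. This is not how XDb1 works internally (the thread does not say). It is a hypothetical illustration that treats the data as subject/relator/object triples and reports, for each class, which relators its direct instances currently use.

```python
from collections import defaultdict

# A few triples in the style of the Ex076 sentences.
TRIPLES = [
    ("person", "isa", "thing"), ("age", "isa", "thing"),
    ("john", "isa", "person"), ("mary", "isa", "person"),
    ("35", "isa", "age"), ("john", "is", "35"),
    ("john", "obeys", "army"), ("mary", "obeys", "army"),
]


def derive_schema(triples):
    """Map each class to the sorted relators its direct instances use."""
    class_of = defaultdict(set)
    for subject, relator, obj in triples:
        if relator == "isa":
            class_of[subject].add(obj)
    schema = defaultdict(set)
    for subject, relator, obj in triples:
        if relator == "isa":
            continue
        for cls in class_of[subject]:
            schema[cls].add(relator)
    return {cls: sorted(rels) for cls, rels in schema.items()}


print(derive_schema(TRIPLES))
```

Adding a new triple such as ("mary", "owns", "fido") changes the derived result without any separate declaration step.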
Neo - 10 Mar 2004 19:51 GMT > > > The schema is what the data *means.* > > [quoted text clipped - 3 lines] > Yes. It is this underlying organizational pattern or schema that > determines what the data means. In TDM/XDb1, it is nearly the reverse. No schema determines what data means. Instead, the current data can be used to derive schemas of any level of generalization.
Chris Hoess - 04 Mar 2004 05:03 GMT >> You're joking, right? Please tell me you're joking. > > No, not joking. See www.xdb1.com/Basic/Symbol.asp, > www.xdb1.com/GUI/Labels.asp and www.xdb1.com/HowTo/Find.asp I can immediately see two problems with this, one big and one small. Dawn has already pointed out the "small" one, which is that your concept of atomicity is not well grounded. I surmise, based on the limited information you provided, that you consider, say, a single ASCII character to be "atomic". However, since you can represent any character with a bit sequence, there's no reason your supposedly "atomic" symbols can't be broken down into bits, leaving you with the atoms 0 and 1 only (or FALSE and TRUE if you prefer).
The big problem is that your description of normalization as "The process of replacing duplicates [sic] things with a reference to the original thing" really isn't quite correct. ("The difference between the almost right word and the right word is really a large matter -- 'tis the difference between the lightning-bug and the lightning."--Mark Twain) Normalization removes relationships inadvertently implied between pieces of data by the design of the database. It doesn't create references, and it doesn't decompose data types.
Since you seem to have disagreed with Marshall before about this, perhaps you could provide an example of the "update anomaly" that decomposing "bob" into its individual characters is supposed to prevent?
>> > The exact method of normalization and to what extent is practical is >> > dependent on the data model and its implementation. [quoted text clipped - 4 lines] > to atomic symbols (a, b, c ...) where as a similar level of > normalization in RDM is impractical? Because this process can only be called "normalization" through a vigorous application of the imagination. (And speaking of impracticality, what is it that led you to declare that a database consisting of many "two-columned" tables is impractical?)
 Signature Chris Hoess
Neo - 04 Mar 2004 20:20 GMT > I can immediately see two problems with this, one big and one small. Dawn > has already pointed out the "small" one, which is that your concept of > atomicity is not well grounded. It may not be well grounded in your understanding or within RDM, but this is not the case in TDM/XDb1.
> I surmise, based on the limited information you provided, that you consider, > say, a single ASCII character to be "atomic". Yes. Also see Susanne Langer's book "Symbolic Logic".
> However, since you can represent any character with a bit sequence... The bit sequence (ie 10101) can only be expressed with symbols in this case a combination of atomic symbols 0's and 1's. What symbols and combination of symbols means is a different issue.
Neo - 04 Mar 2004 20:41 GMT > The big problem is that your description of normalization as "The process of > replacing duplicates [sic] things with a reference to the original thing" > really isn't quite correct. Within the context of TDM/XDb1, it is correct. Could you prove otherwise?
> Normalization removes relationships inadvertently implied between pieces > of data by the design of the database. It doesn't create references, > and it doesn't decompose data types. Such may or may not be the case in your understanding or RDM. In TDM/XDb1, normalization is the process of replacing duplicate things with references to the original thing. It is similar in RDM. Suppose you start with
T_ItemColor
MyCar    Blue
YourCar  Blue

To normalize

T_ItemColor
MyCar    ->Blue
YourCar  ->Blue

T_Color
Blue
In effect, one has replaced duplicate things with a reference to the original thing. Since all things in a row need to be of the same type, this required us to move Blue to a new table. In RDM, the mechanism of a reference is called a key or ID.
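The two-table layout can be tried out with SQLite through Python's `sqlite3` module. This is a sketch under assumptions: the post only says "key or ID", so the surrogate column `color_id` and the column names are hypothetical, and whether this step counts as normalization in the relational sense is exactly what the thread disputes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces FKs only when asked
conn.execute(
    "create table T_Color (color_id integer primary key, name text unique not null)")
conn.execute("""create table T_ItemColor (
    item text primary key,
    color_id integer not null references T_Color(color_id))""")
conn.execute("insert into T_Color values (1, 'Blue')")
conn.executemany("insert into T_ItemColor values (?, ?)",
                 [("MyCar", 1), ("YourCar", 1)])

# A reference to a colour that does not exist is refused.
try:
    conn.execute("insert into T_ItemColor values ('HisCar', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Renaming the colour touches one row; both items see the change.
conn.execute("update T_Color set name = 'Navy' where color_id = 1")
rows = conn.execute("""select i.item, c.name
                       from T_ItemColor i join T_Color c using (color_id)
                       order by i.item""").fetchall()
print(rejected, rows)
```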
> Since you seem to have disagreed with Marshall before about this, I don't believe Marshall disagreed with TDM/XDb1's definition of normalization as the basic principle is similar to that in RDM. He was simply asking repeatedly as he didn't know my understanding of normalization.
Neo - 05 Mar 2004 18:40 GMT > ...Since all things in a row need to be of the same type... Sorry, I meant all things in a COLUMN need to be of the same type
Neo - 04 Mar 2004 21:08 GMT > > Then how do you explain that in TDM/XDb1, things are normalized down > > to atomic symbols (a, b, c ...) where as a similar level of > > normalization in RDM is impractical? > > Because this process can only be called "normalization" through a vigorous > application of the imagination. Or a failure to understand that atomic symbols (ie ^, *, &, 0, 1, 2, a, b, c,...) are in fact atomic. And also a failure to apply normalization (process of replacing duplicates with ref to original) until it is no longer possible.
> (And speaking of impracticality, what is it that led you to declare that > a database consisting of many "two-columned" tables is impractical?) Having joined the thread near its 200th posting, your understanding is slightly off context. I do not claim a db consisting of 2-col tableS is impractical [insert Mark Twain].
I claimed that, for some applications, in order to avoid NULLs, one would have to resort to generic modelling, which in the extreme case results in a db with just one 2-col table, and is impractical. If one implements www.xdb1.com/Example/Ex076.asp they should see that pattern emerging. After 223 postings, still no one has. Will you be the first?
Tony - 04 Mar 2004 11:05 GMT > > You're joking, right? > [quoted text clipped - 14 lines] > to atomic symbols (a, b, c ...) where as a similar level of > normalization in RDM is impractical? Perhaps because TDM/XDb1 was created by someone who didn't have a firm grasp on normalisation, or perhaps even reality? ;)
Neo - 04 Mar 2004 19:33 GMT > Perhaps because TDM/XDb1 was created by someone who didn't have a firm > grasp on normalisation, or perhaps even reality? ;) Perhaps, but could you devise an experiment related to representing things, normalization and databases that would prove your conjecture?
Tony - 05 Mar 2004 12:15 GMT > > Perhaps because TDM/XDb1 was created by someone who didn't have a firm > > grasp on normalisation, or perhaps even reality? ;) > > Perhaps, but could you devise an experiment related to representing > things, normalization and databases that would prove your conjecture? No, I think we have to invoke Date's Incoherence Principle here: It is not possible to treat coherently that which is incoherent.
Eric Kaun - 05 Mar 2004 13:37 GMT > > > Perhaps because TDM/XDb1 was created by someone who didn't have a firm > > > grasp on normalisation, or perhaps even reality? ;) [quoted text clipped - 4 lines] > No, I think we have to invoke Date's Incoherence Principle here: It is > not possible to treat coherently that which is incoherent. Agreed.
Neo - 25 Feb 2004 20:01 GMT > > Will you implement an equivalent NULL-less solution to the example > > posted at www.xdb1.com/Example/Ex076.asp so that we can establish [quoted text clipped - 3 lines] > I reserve the right to change my design later. > No nulls were used. It is true that no NULLs were used but you have ignored something even more basic. Could you normalize the data? (XDb1's solution is normalized down to each symbol, but this is impractical with RDM).
> I have to admit your test data made me feel a bit silly. The silliness should become more apparent as one normalizes the data using RDM.
Marshall Spight - 26 Feb 2004 03:52 GMT > > > Will you implement an equivalent NULL-less solution to the example > > > posted at www.xdb1.com/Example/Ex076.asp so that we can establish [quoted text clipped - 6 lines] > It is true that no NULLs were used but you have ignored something even > more basic. Could you normalize the data? In what way is the data not already fully normalized? Perhaps you can provide me with the functional dependencies, so that I might apply the rules of normalization to the schema.
Marshall
Neo - 26 Feb 2004 15:19 GMT > > It is true that no NULLs were used but you have ignored something even > > more basic. Could you normalize the data? > > In what way is the data not already fully normalized? Perhaps you > can provide me with the functional dependencies, so that I might > apply the rules of normalization to the schema. Everything in TDM/XDb1 is functionally dependent on atomic symbols. Normalize down to atomic symbols (a,b,c ...) if you can. Or just stop when it becomes impractical to normalize any further in RDM.
Eric Kaun - 26 Feb 2004 18:16 GMT > (XDb1's solution is > normalized down to each symbol, but this is impractical with RDM). Normalized down to each symbol? That makes absolutely no sense at all. How is that different from no normalization at all?
Neo - 27 Feb 2004 03:51 GMT > > XDb1's solution is normalized down to each symbol, > > but this is impractical with RDM. > > Normalized down to each symbol? That makes absolutely no sense at all. Shown below is roughly how to do it in RDM. I use "->X" syntax, instead of IDs, to make it easier to follow the referential links.
T_Person
Name, Age
->john, ->28

T_CompositeSymbol
AtmSym1, AtmSym2, AtmSym3, ...
->j, ->o, ->h, ->n
->j, ->o, ->e
->2, ->8

T_AtomicSymbol
a
b
c
...
for more info, see www.xdb1.com/Basic/Symbol.asp and www.xdb1.com/Example/Ex002.asp
> How is that different from no normalization at all? It is exactly the opposite of no normalization. It is near complete normalization.
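For readers who want to try the layout described in this post, here is a minimal sketch using Python's built-in sqlite3 module. It is not XDb1's implementation and not necessarily the exact schema Neo has in mind. The table names T_Person, T_CompositeSymbol and T_AtomicSymbol come from the post; the T_CompositeSymbolPart table, the column names, and the helpers intern and spell are assumptions added for illustration. In particular, the AtmSym1, AtmSym2, ... columns are replaced by one row per position so that short strings do not need empty columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T_AtomicSymbol (sym TEXT PRIMARY KEY);
CREATE TABLE T_CompositeSymbol (id INTEGER PRIMARY KEY);
-- Hypothetical helper table: one row per (composite, position).
CREATE TABLE T_CompositeSymbolPart (
    composite_id INTEGER REFERENCES T_CompositeSymbol(id),
    pos INTEGER,
    sym TEXT REFERENCES T_AtomicSymbol(sym),
    PRIMARY KEY (composite_id, pos)
);
CREATE TABLE T_Person (
    name_id INTEGER REFERENCES T_CompositeSymbol(id),
    age_id INTEGER REFERENCES T_CompositeSymbol(id)
);
""")


def intern(text):
    """Store text as an ordered list of atomic symbols; return the composite id."""
    for ch in text:
        # Each distinct character is stored once in T_AtomicSymbol.
        conn.execute("INSERT OR IGNORE INTO T_AtomicSymbol VALUES (?)", (ch,))
    cid = conn.execute("INSERT INTO T_CompositeSymbol DEFAULT VALUES").lastrowid
    conn.executemany(
        "INSERT INTO T_CompositeSymbolPart VALUES (?, ?, ?)",
        [(cid, pos, ch) for pos, ch in enumerate(text)],
    )
    return cid


def spell(cid):
    """Rebuild the text of a composite symbol by following its parts in order."""
    rows = conn.execute(
        "SELECT sym FROM T_CompositeSymbolPart WHERE composite_id = ? ORDER BY pos",
        (cid,),
    )
    return "".join(row[0] for row in rows)


# The three composite symbols from the post: john, joe, 28.
john = intern("john")
joe = intern("joe")
age = intern("28")
conn.execute("INSERT INTO T_Person VALUES (?, ?)", (john, age))

person = conn.execute("SELECT name_id, age_id FROM T_Person").fetchone()
atomic_count = conn.execute("SELECT COUNT(*) FROM T_AtomicSymbol").fetchone()[0]
print(spell(person[0]), spell(person[1]), atomic_count)
```

Every read of a name or an age now goes through a join and a reassembly step, which gives one concrete way to weigh the "impractical" remark quoted above.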
Neo - 25 Feb 2004 06:36 GMT > > And when it doesn't, RDM needs NULLs, as Codd has correctly recognized. > > Codd believed that incorrectly, as Date implies. Date is correct in that there is something inherently wrong with a model that allows NULLs.
Codd is correct in that NULLs are an integral part of RDM.
NULLs are an integral part of RDM because RDM is slightly flawed.
When one substitutes "Not applicable" for NULL, the flaw is only partially masked. NULLs kill closure. How does substituting "Not applicable" for NULL significantly enhance closure?
Eric Kaun - 26 Feb 2004 18:19 GMT > Date is correct in that there is something inherently wrong with a > model that allows NULLs. Right.
> Codd is correct in that NULLs are an integral part of RDM. No, he's not right, and Date showed that. Instead of just repeating "Chapter 20: Missing Information" in Date's book, can you actually show me a section or sentence or whatever that indicates that NULL is an integral part?
Codd invented relational. He stumbled in the implications of his ideas later. He was wrong. Date showed that in his book, in the same chapter you keep mentioning. What's not clear?
> When one substitutes "Not applicable" for NULL, the flaw is only > partially masked. NULLs kill closure. How does substituting "Not > applicable" for NULL significantly enhance closure? NULLs mask far more: they mask meaning. Does NULL mean don't know, don't care, something-but-I-don't-know-what, etc?
You're right - my "NOT APPLICABLE" was a poor choice. The "no eyes" example, however, was a stretch. If the attribute truly isn't applicable, then you have the wrong predicate.
However, "UNKNOWN" is valid. It's a specific meaning that NULL loses. It's useful in queries in many ways that NULL isn't.
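The difference in query behaviour between NULL and an explicit "UNKNOWN" value can be sketched with Python's built-in sqlite3 module. This is only an illustration of the point made in this post, not a design taken from Date's book. The table names, the eye_color column and the sample rows are hypothetical, and the plain string 'UNKNOWN' stands in for whatever explicit marker a real design would choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one allows NULL, the other requires an explicit value.
conn.execute("CREATE TABLE person_null (name TEXT, eye_color TEXT)")
conn.execute("CREATE TABLE person_marked (name TEXT, eye_color TEXT NOT NULL)")

conn.executemany(
    "INSERT INTO person_null VALUES (?, ?)",
    [("bob", "brown"), ("john", "blue"), ("mary", None)],
)
conn.executemany(
    "INSERT INTO person_marked VALUES (?, ?)",
    [("bob", "brown"), ("john", "blue"), ("mary", "UNKNOWN")],
)

query = "SELECT name FROM {} WHERE eye_color <> 'blue' ORDER BY name"

# With NULL, the comparison for mary is neither true nor false,
# so her row silently drops out of the result.
not_blue_null = [r[0] for r in conn.execute(query.format("person_null"))]

# With an explicit marker, the comparison is an ordinary two-valued test.
not_blue_marked = [r[0] for r in conn.execute(query.format("person_marked"))]

# The marker can also be queried directly with plain equality.
unknown = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM person_marked WHERE eye_color = 'UNKNOWN'"
    )
]

print(not_blue_null)    # ['bob']
print(not_blue_marked)  # ['bob', 'mary']
print(unknown)          # ['mary']
```

Whether mary belongs in the "not blue" result is a modelling decision. The sketch only shows that the explicit value makes that decision visible in the query, while NULL makes it implicitly.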
Neo - 27 Feb 2004 04:32 GMT > > Codd is correct in that NULLs are an integral part of RDM. > > No, he's not right, and Date showed that. Unless Date was talking about a db with one two-column table (one column of which is auto-supplied), Date is wrong.
> can you actually show me a section or sentence or whatever that > indicates that NULL is an integral part? Only the one in Date's books on pg 123 of his 6th Ed. "Codd now regards nulls as an integral part of the relational model"